nvidia.dali.fn.box_encoder

nvidia.dali.fn.box_encoder(__input_0, __input_1, /, *, anchors, bytes_per_sample_hint=[0], criteria=0.5, means=[0.0, 0.0, 0.0, 0.0], offset=False, preserve=False, scale=1.0, stds=[1.0, 1.0, 1.0, 1.0], device=None, name=None)

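Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument.

This operator follows the algorithm described in "SSD: Single Shot MultiBox Detector" and implemented in mlperf/training. Inputs must be supplied as the following tensors: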

BBoxes, which contains the bounding boxes, represented as [l,t,r,b].
Labels, which contains the corresponding label for each bounding box.

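The results are two tensors:

EncodedBBoxes, which contains M encoded bounding boxes in [l,t,r,b] format, where M is the number of anchors.
EncodedLabels, which contains the corresponding label for each encoded box.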
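The matching step described above can be illustrated with a minimal pure-Python sketch. It is not DALI's implementation: `iou` and `encode` are hypothetical helper names, and the sketch assumes that unmatched anchors keep their own coordinates and receive label 0 (background), following the usual SSD convention. It only shows why the outputs have one entry per anchor.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (l, t, r, b)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def encode(bboxes, labels, anchors, criteria=0.5, background=0):
    """SSD-style matching sketch: returns one box and one label per anchor."""
    # Assumption: unmatched anchors keep their own coordinates and the background label.
    enc_boxes = [tuple(a) for a in anchors]
    enc_labels = [background] * len(anchors)

    # Step 1: an anchor takes its best-overlapping box if the IoU exceeds `criteria`.
    for i, anchor in enumerate(anchors):
        best_iou, best_j = 0.0, None
        for j, box in enumerate(bboxes):
            overlap = iou(anchor, box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou > criteria:
            enc_boxes[i] = tuple(bboxes[best_j])
            enc_labels[i] = labels[best_j]

    # Step 2: every box claims its single best anchor, even below the threshold,
    # so that no ground-truth box is dropped.
    for j, box in enumerate(bboxes):
        overlaps = [iou(anchor, box) for anchor in anchors]
        i = max(range(len(anchors)), key=overlaps.__getitem__)
        if overlaps[i] > 0.0:
            enc_boxes[i] = tuple(box)
            enc_labels[i] = labels[j]

    return enc_boxes, enc_labels


# Four anchors tiling the unit square, and one ground-truth box with label 3.
anchors = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.0, 1.0, 0.5),
           (0.0, 0.5, 0.5, 1.0), (0.5, 0.5, 1.0, 1.0)]
encoded_boxes, encoded_labels = encode([(0.0, 0.0, 0.5, 0.4)], [3], anchors)
print(encoded_labels)  # [3, 0, 0, 0]
```

Only the first anchor overlaps the box (IoU of about 0.8), so it takes the box and its label; the other three anchors stay as background.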

Supported backends

‘cpu’
‘gpu’

Parameters:

__input_0 (TensorList) – Input to the operator.
__input_1 (TensorList) – Input to the operator.

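Keyword arguments:

anchors (float or list of float) – Anchors to be used for encoding, as a list of floats in ltrb format.
bytes_per_sample_hint (int or list of int, optional, default = [0]) –
Output size hint, in bytes per sample.

If specified, the operator's outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.
criteria (float, optional, default = 0.5) –
IoU threshold for matching bounding boxes with anchors.

The value must be between 0 and 1.
means (float or list of float, optional, default = [0.0, 0.0, 0.0, 0.0]) – [x y w h] mean values for normalization.
offset (bool, optional, default = False) – Returns normalized offsets ((encoded_bboxes*scale - anchors*scale) - mean) / stds in EncodedBBoxes, computed using the stds, means, and scale arguments.
preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.
scale (float, optional, default = 1.0) – Rescales the box and anchor values before the offset is calculated (for example, to return to absolute values).
stds (float or list of float, optional, default = [1.0, 1.0, 1.0, 1.0]) – [x y w h] standard deviations for offset normalization.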
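
To make the roles of scale, means, and stds concrete, here is a minimal sketch that applies the offset formula from the offset argument literally, element by element. The formula does not state in which box representation the subtraction happens; because means and stds are documented as [x y w h], the sketch assumes that boxes and anchors are already given as (x, y, w, h) tuples. `normalized_offsets` is a hypothetical helper, and the operator's actual computation may differ in detail (for example, in how width and height are treated).

```python
def normalized_offsets(encoded_bboxes, anchors, scale=1.0,
                       means=(0.0, 0.0, 0.0, 0.0),
                       stds=(1.0, 1.0, 1.0, 1.0)):
    """Literal reading of ((encoded_bboxes*scale - anchors*scale) - mean) / stds.

    Assumption: each box and anchor is an (x, y, w, h) tuple, and the
    formula is applied independently to each of the four components.
    """
    offsets = []
    for box, anchor in zip(encoded_bboxes, anchors):
        offsets.append(tuple(
            ((b * scale - a * scale) - m) / s
            for b, a, m, s in zip(box, anchor, means, stds)
        ))
    return offsets


# One matched box and its anchor; scale=2.0 rescales both before subtracting.
offsets = normalized_offsets(
    encoded_bboxes=[(0.5, 0.5, 0.25, 0.25)],
    anchors=[(0.25, 0.25, 0.25, 0.25)],
    scale=2.0,
    stds=(0.5, 0.5, 1.0, 1.0),
)
print(offsets)  # [(1.0, 1.0, 0.0, 0.0)]
```

With the default scale, means, and stds, the result reduces to the plain difference between each encoded box and its anchor.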