nvidia.dali.fn.experimental.decoders.image_slice

nvidia.dali.fn.experimental.decoders.image_slice(__data, __anchor=None, __shape=None, /, *, adjust_orientation=True, affine=True, axes=[1, 0], axis_names='WH', bytes_per_sample_hint=[0], device_memory_padding=16777216, device_memory_padding_jpeg2k=0, dtype=DALIDataType.UINT8, end=None, host_memory_padding=8388608, host_memory_padding_jpeg2k=0, hw_decoder_load=0.9, hybrid_huffman_threshold=1000000, jpeg_fancy_upsampling=False, normalized_anchor=True, normalized_shape=True, output_type=DALIImageType.RGB, preallocate_height_hint=0, preallocate_width_hint=0, preserve=False, rel_end=None, rel_shape=None, rel_start=None, shape=None, start=None, use_fast_idct=False, device=None, name=None)

Decodes images and extracts regions of interest.

Supported formats: JPEG, JPEG 2000, TIFF, PNG, BMP, PNM, PPM, PGM, PBM, WebP.

The output of the decoder is in HWC layout.

The implementation uses NVIDIA nvImageCodec to decode images.

The slice can be specified by providing the start and end coordinates, or the start coordinates and the shape of the slice. Both coordinates and shapes can be provided in absolute or relative terms.

The slice arguments can be specified by the following named arguments:

  1. start: Slice start coordinates (absolute)

  2. rel_start: Slice start coordinates (relative)

  3. end: Slice end coordinates (absolute)

  4. rel_end: Slice end coordinates (relative)

  5. shape: Slice shape (absolute)

  6. rel_shape: Slice shape (relative)

The slice can be configured by providing start and end coordinates or start and shape. Relative and absolute arguments can be mixed (for example, rel_start can be used with shape) as long as start and shape or end are uniquely defined.
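As an illustration of how the named arguments combine, the following sketch resolves one axis of a slice to absolute pixel bounds. It is not DALI's implementation: the helper name `resolve_axis` is hypothetical, and rounding relative values with `round()` is an assumption made for the example.

```python
def resolve_axis(extent, *, start=None, rel_start=None, end=None,
                 rel_end=None, shape=None, rel_shape=None):
    """Return absolute (start, end) bounds along one axis of size `extent`.

    Hypothetical helper for illustration; the rounding rule is an assumption.
    """
    starts = [v for v in (start, rel_start) if v is not None]
    stops = [v for v in (end, rel_end, shape, rel_shape) if v is not None]
    # The start and the end-or-shape must each be defined exactly once.
    if len(starts) != 1 or len(stops) != 1:
        raise ValueError("start and end/shape must each be uniquely defined")

    lo = start if start is not None else round(rel_start * extent)
    if end is not None:
        hi = end
    elif rel_end is not None:
        hi = round(rel_end * extent)
    elif shape is not None:
        hi = lo + shape
    else:
        hi = lo + round(rel_shape * extent)
    return lo, hi


# Mixing relative and absolute arguments on a 640-pixel-wide axis.
print(resolve_axis(640, rel_start=0.25, shape=320))
print(resolve_axis(640, start=100, rel_end=0.5))
```

Passing both `end` and `shape` in this sketch raises a `ValueError`, mirroring the requirement that the slice be uniquely defined.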

Alternatively, two extra positional inputs can be provided, specifying __anchor and __shape. When using positional inputs, two extra boolean arguments normalized_anchor/normalized_shape can be used to specify the nature of the arguments provided. Using positional inputs for anchor and shape is incompatible with the named arguments specified above.

The slice arguments should provide as many dimensions as specified by the axis_names or axes arguments.

By default, the nvidia.dali.fn.experimental.decoders.image_slice() operator uses normalized coordinates and “WH” order for the slice arguments.
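To make the default “WH” order concrete, the sketch below maps an anchor and a shape given in `axis_names` order onto row and column bounds of an HWC image. The helper `crop_window` is hypothetical and the use of `round()` for normalized values is an assumption; the operator's exact rounding may differ.

```python
def crop_window(image_hw, anchor, shape, axis_names="WH",
                normalized_anchor=True, normalized_shape=True):
    """Map anchor/shape given in `axis_names` order onto (H, W) bounds.

    Hypothetical helper for illustration; rounding is an assumption.
    """
    extent = {"H": image_hw[0], "W": image_hw[1]}
    bounds = {}
    for name, a, s in zip(axis_names, anchor, shape):
        size = extent[name]
        # Integers are always absolute; floats follow the normalized_* flags.
        if isinstance(a, float) and normalized_anchor:
            lo = round(a * size)
        else:
            lo = int(a)
        if isinstance(s, float) and normalized_shape:
            length = round(s * size)
        else:
            length = int(s)
        bounds[name] = (lo, lo + length)
    return bounds["H"], bounds["W"]


# A 480x640 (H x W) image: the first anchor/shape entry refers to width.
rows, cols = crop_window((480, 640), anchor=(0.25, 0.5), shape=(0.5, 0.25))
print(rows, cols)
```

With “WH” order the first element of each argument applies to the width, so the values above select columns 160 to 480 and rows 240 to 360 under the stated rounding assumption.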

When possible, the operator uses ROI decoding, which reduces decoding time and memory consumption.

Note

GPU-accelerated decoding is only available for a subset of the image formats (JPEG and JPEG 2000). For other formats, a CPU-based decoder is used. For JPEG, a dedicated hardware decoder is used when available.

Note

WebP decoding currently only supports the simple file format (lossy and lossless compression). For details on the different WebP file formats, see https://developers.google.com/speed/webp/docs/riff_container

Supported backends
  • ‘cpu’

  • ‘mixed’

Parameters:
  • __data (TensorList) – Batch that contains the input data.

  • __anchor (1D TensorList of float or int, optional) –

    Input that contains normalized or absolute coordinates for the starting point of the slice (x0, x1, x2, …).

    Integer coordinates are interpreted as absolute coordinates, while float coordinates can be interpreted as absolute or relative coordinates, depending on the value of normalized_anchor.

  • __shape (1D TensorList of float or int, optional) –

    Input that contains normalized or absolute coordinates for the dimensions of the slice (s0, s1, s2, …).

    Integer coordinates are interpreted as absolute coordinates, while float coordinates can be interpreted as absolute or relative coordinates, depending on the value of normalized_shape.
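The positional inputs can be sketched in a pipeline as follows. This is a minimal example under stated assumptions, not a reference implementation: the function name `build_random_crop_pipeline` and the `image_dir` argument are hypothetical, the anchor and shape ranges are arbitrary, and DALI is imported inside the function only so that the snippet can be loaded on machines without DALI installed.

```python
def build_random_crop_pipeline(image_dir, batch_size=8):
    """Build a DALI pipeline that decodes a random window of each image.

    `image_dir` is a hypothetical directory of images readable by
    fn.readers.file.
    """
    # Imported lazily so this module loads even without DALI installed.
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=2, device_id=0)
    def pipeline():
        encoded, labels = fn.readers.file(file_root=image_dir)
        # Float inputs in the default "WH" order; with the default
        # normalized_anchor/normalized_shape they are read as fractions
        # of the image size. anchor + shape never exceeds 1.0 here.
        anchor = fn.random.uniform(range=[0.0, 0.5], shape=[2])
        shape = fn.random.uniform(range=[0.25, 0.5], shape=[2])
        crops = fn.experimental.decoders.image_slice(
            encoded, anchor, shape, device="mixed"
        )
        return crops, labels

    return pipeline()
```

Calling the function returns a pipeline object; running it requires a DALI installation, a GPU for the mixed backend, and a real image directory.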

Keyword Arguments:
  • adjust_orientation (bool, optional, default = True) – Use EXIF orientation metadata to rectify the images

  • affine (bool, optional, default = True) –

    Applies only to the mixed backend type.

    If set to True, each thread in the internal thread pool will be tied to a specific CPU core. Otherwise, the threads can be reassigned to any CPU core by the operating system.

  • axes (int or list of int or TensorList of int, optional, default = [1, 0]) –

    Order of dimensions used for the anchor and shape slice inputs as dimension indices.

    Negative values are interpreted as counting dimensions from the back. Valid range: [-ndim, ndim-1], where ndim is the number of dimensions in the input data.

  • axis_names (layout str, optional, default = ‘WH’) –

    Order of the dimensions used for the anchor and shape slice inputs, as described in layout.

    If a value is provided, axis_names will have a higher priority than axes.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • device_memory_padding (int, optional, default = 16777216) –

    Applies only to the mixed backend type.

    The padding for nvJPEG’s device memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG when a larger image is encountered, and the internal buffer needs to be reallocated to decode the image.

    If a value greater than 0 is provided, the operator preallocates one device buffer of the requested size per thread. If the value is correctly selected, no additional allocations will occur during the pipeline execution.

  • device_memory_padding_jpeg2k (int, optional, default = 0) –

    Applies only to the mixed backend type.

    The padding for nvJPEG2k’s device memory allocations, in bytes. This parameter helps to avoid reallocation in nvJPEG2k when a larger image is encountered, and the internal buffer needs to be reallocated to decode the image.

    If a value greater than 0 is provided, the operator preallocates the necessary number of buffers according to the hint provided. If the value is correctly selected, no additional allocations will occur during the pipeline execution.

  • dtype (nvidia.dali.types.DALIDataType, optional, default = DALIDataType.UINT8) –

    Output data type of the image.

    Values will be converted to the dynamic range of the requested type.

  • end (int or list of int or TensorList of int, optional) –

    End coordinates of the slice.

    Note: Providing named arguments start, end, shape, rel_start, rel_end, rel_shape is incompatible with providing positional inputs anchor and shape.

  • host_memory_padding (int, optional, default = 8388608) –

    Applies only to the mixed backend type.

    The padding for nvJPEG’s host memory allocations, in bytes. This parameter helps to prevent the reallocation in nvJPEG when a larger image is encountered, and the internal buffer needs to be reallocated to decode the image.

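    If a value greater than 0 is provided, the operator preallocates two (because of double-buffering) host-pinned buffers of the requested size per thread. If the value is correctly selected, no additional allocations will occur during the pipeline execution.

  • host_memory_padding_jpeg2k (int, optional, default = 0) –

    Applies only to the mixed backend type.

    The padding for nvJPEG2k’s host memory allocations, in bytes. This parameter helps to prevent the reallocation in nvJPEG2k when a larger image is encountered, and the internal buffer needs to be reallocated to decode the image.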

    If a value greater than 0 is provided, the operator preallocates the necessary number of buffers according to the hint provided. If the value is correctly selected, no additional allocations will occur during the pipeline execution.

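  • hw_decoder_load (float, optional, default = 0.9) –

    The percentage of the image data to be processed by the hardware JPEG decoder.

    Applies to the mixed backend type on the NVIDIA Ampere GPU architecture.

    Determines the percentage of the workload that will be offloaded to the hardware decoder, if available. The optimal workload depends on the number of threads provided to the DALI pipeline and should be found empirically. More details can be found at https://developer.nvidia.com/blog/loading-data-fast-with-dali-and-new-jpeg-decoder-in-a100

  • hybrid_huffman_threshold (int, optional, default = 1000000) –

    Applies only to the mixed backend type.

    Images with a total number of pixels (height * width) above this threshold will use the nvJPEG hybrid Huffman decoder. Images with fewer pixels will use the nvJPEG host-side Huffman decoder.

    Note

    The hybrid Huffman decoder still largely uses the CPU.

  • jpeg_fancy_upsampling (bool, optional, default = False) –

    Makes the mixed backend use the same chroma upsampling approach as the cpu backend.

    This option corresponds to the JPEG fancy upsampling available in libjpegturbo or ImageMagick.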

  • memory_stats

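  • normalized_anchor (bool, optional, default = True) –

    Determines whether the anchor positional input should be interpreted as normalized (range [0.0, 1.0]) or absolute coordinates.

    Note

    This argument is only relevant when the anchor data type is float. For integer types, the coordinates are always absolute.

  • normalized_shape (bool, optional, default = True) –

    Determines whether the shape positional input should be interpreted as normalized (range [0.0, 1.0]) or absolute coordinates.

    Note

    This argument is only relevant when the shape data type is float. For integer types, the coordinates are always absolute.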

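  • output_type (nvidia.dali.types.DALIImageType, optional, default = DALIImageType.RGB) –

    The color space of the output image.

    Note: When decoding to YCbCr, the image is first decoded to RGB and then converted to YCbCr, following the YCbCr definition from ITU-R BT.601.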

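  • preallocate_height_hint (int, optional, default = 0) –

    Image height hint.

    Applies to the mixed backend type on the NVIDIA Ampere GPU architecture.

    The hint is used to preallocate memory for the hardware JPEG decoder.

  • preallocate_width_hint (int, optional, default = 0) –

    Image width hint.

    Applies to the mixed backend type on the NVIDIA Ampere GPU architecture.

    The hint is used to preallocate memory for the hardware JPEG decoder.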

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • rel_end (float or list of float or TensorList of float, optional) –

    End relative coordinates of the slice (range [0.0 - 1.0]).

    Note: Providing named arguments start, end, shape, rel_start, rel_end, rel_shape is incompatible with providing positional inputs anchor and shape.

  • rel_shape (float or list of float or TensorList of float, optional) –

    Relative shape of the slice (range [0.0 - 1.0]).

    Note: Providing named arguments start, end, shape, rel_start, rel_end, rel_shape is incompatible with providing positional inputs anchor and shape.

  • rel_start (float or list of float or TensorList of float, optional) –

    Start relative coordinates of the slice (range [0.0 - 1.0]).

    Note: Providing named arguments start, end, shape, rel_start, rel_end, rel_shape is incompatible with providing positional inputs anchor and shape.

  • shape (int or list of int or TensorList of int, optional) –

    Shape of the slice.

    Note: Providing named arguments start, end, shape, rel_start, rel_end, rel_shape is incompatible with providing positional inputs anchor and shape.

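  • split_stages (bool, optional, default = False) –

    Warning

    The argument split_stages is now deprecated and its usage is discouraged.

  • start (int or list of int or TensorList of int, optional) –

    Start coordinates of the slice.

    Note: Providing named arguments start/end or start/shape is incompatible with providing positional inputs anchor and shape.

  • use_chunk_allocator (bool, optional, default = False) –

    Warning

    The argument use_chunk_allocator is now deprecated and its usage is discouraged.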

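  • use_fast_idct (bool, optional, default = False) –

    Enables fast IDCT in the libjpeg-turbo based CPU decoder, which is used when device is set to “cpu”, or when it is set to “mixed” but the particular image cannot be handled by the GPU implementation.

    According to the libjpeg-turbo documentation, decompression performance is improved by up to 14% with little reduction in quality.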
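As a rough aid for choosing device_memory_padding and host_memory_padding, the sketch below estimates the memory that the nvJPEG padding settings would preallocate, based on the per-thread rules stated for those arguments (one device buffer per thread, two pinned host buffers per thread). The helper name `preallocated_bytes` is hypothetical, and treating "thread" as the pipeline's num_threads is an assumption.

```python
def preallocated_bytes(num_threads,
                       device_memory_padding=16777216,
                       host_memory_padding=8388608):
    """Estimate the nvJPEG padding preallocation as (device, host) bytes.

    Hypothetical helper; assumes one decoding thread per pipeline thread.
    """
    # One device buffer of the requested size per thread.
    device_total = num_threads * device_memory_padding
    # Two pinned host buffers per thread because of double-buffering.
    host_total = 2 * num_threads * host_memory_padding
    return device_total, host_total


device_total, host_total = preallocated_bytes(num_threads=4)
print(device_total // 2**20, "MiB device,", host_total // 2**20, "MiB host")
```

With the default padding values and four threads, both totals come to 64 MiB under these assumptions.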