Reading Video Frames Stored as Images

The goal of this example is to show how to use the readers.sequence operator, which reads sequences of frames (videos) stored as separate images.

Preparing the Data

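In this example, we extract some frames from a video file and store them as PNG images. We can use DALI's readers.video to extract the frames and write them into the directory structure that readers.sequence expects.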

Let's start by importing the necessary modules and defining the paths and constants that we will use later.

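Note: The DALI_EXTRA_PATH environment variable should point to the location where the data from the DALI extra repository has been downloaded. Please make sure that the proper release tag is checked out.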

[7]:
import os
import numpy as np
import shutil
from PIL import Image
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types

dali_extra_path = os.environ["DALI_EXTRA_PATH"]
video_filename = os.path.join(
    dali_extra_path, "db/optical_flow/sintel_trailer/sintel_trailer_short.mp4"
)
data_dir = "sequence_reader/samples"
batch_size = 1
sequence_length = 10
initial_prefetch_size = 16
n_iter = 10
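
The cell above raises a bare KeyError when DALI_EXTRA_PATH is not set. As an optional, more defensive variant, the sketch below checks the variable first and fails with a readable message. The helper name `require_dir_from_env` and its `environ` parameter are hypothetical and are not part of DALI.

```python
import os


def require_dir_from_env(var_name, environ=os.environ):
    """Return the directory named by an environment variable.

    Hypothetical helper for illustration only; it is not part of DALI.
    Raises RuntimeError with a readable message if the variable is
    missing, empty, or does not point at an existing directory.
    """
    value = environ.get(var_name)
    if not value:
        raise RuntimeError(
            f"{var_name} is not set; point it at your DALI extra checkout."
        )
    if not os.path.isdir(value):
        raise RuntimeError(
            f"{var_name}={value!r} is not an existing directory."
        )
    return value
```

With this helper, the lookup could be written as `dali_extra_path = require_dir_from_env("DALI_EXTRA_PATH")`.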

Now we are ready to define the DALI pipeline that reads the video file and extracts sequences of frames.

[8]:
@pipeline_def
def video_pipe(filenames):
    video = fn.readers.video(
        device="gpu",
        filenames=filenames,
        sequence_length=sequence_length,
        initial_fill=initial_prefetch_size,
    )
    return video


pipe = video_pipe(
    filenames=video_filename, batch_size=batch_size, num_threads=2, device_id=0
)
pipe.build()

The last step is to run the pipeline for a few iterations to generate several frame sequences and store them as PNG images.

[9]:
def save_images(frames, seq_len, directory):
    for j in range(seq_len):
        im = Image.fromarray(frames[j])
        im.save(os.path.join(directory, str(j)) + ".png")


# Start from an empty output directory.
if os.path.exists(data_dir):
    shutil.rmtree(data_dir)
os.makedirs(data_dir)

for i in range(n_iter):
    pipe_out = pipe.run()
    frames = np.array(pipe_out[0][0].as_cpu())
    label_dir = os.path.join(data_dir, str(i))
    os.makedirs(label_dir)
    save_images(frames, sequence_length, label_dir)
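
The file names 0.png to 9.png used above sort correctly by name, but that stops being true once a sequence has more than 10 frames. Assuming the reader orders frames by file name (lexicographically), zero-padding keeps name order and frame order identical. The helper `frame_filename` below is a hypothetical illustration, not part of DALI.

```python
def frame_filename(index, total_frames, ext=".png"):
    """Build a zero-padded frame name so that name order equals frame order."""
    # Pad to the number of digits of the largest frame index.
    width = len(str(max(total_frames - 1, 0)))
    return f"{index:0{width}d}{ext}"


# Unpadded names sort incorrectly once there are more than 10 frames.
unpadded = sorted(f"{j}.png" for j in range(12))
padded = sorted(frame_filename(j, 12) for j in range(12))
print(unpadded[:4])  # ['0.png', '1.png', '10.png', '11.png']
print(padded[:4])    # ['00.png', '01.png', '02.png', '03.png']
```

For the 10-frame sequences in this example the helper produces the same names as `str(j) + ".png"`, so it only matters for longer sequences.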

Frame Sequence Reader

Now we can use readers.sequence to load the frame sequences from the directory we generated earlier.

[10]:
@pipeline_def
def frame_seq_pipe(path):
    video = fn.readers.sequence(file_root=path, sequence_length=sequence_length)
    return video
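
To make the expected input more concrete, here is a rough pure-Python model of how sequences could be enumerated from such a directory. It is a sketch under stated assumptions, not DALI's implementation: each subdirectory of the root is treated as one stream, frames are ordered by file name, and a sequence never spans two streams. The function `list_sequences` and its `step` and `stride` parameters are hypothetical names chosen for this illustration.

```python
import os
import tempfile


def list_sequences(file_root, sequence_length, step=1, stride=1):
    """Enumerate frame-path windows, one list of paths per sequence.

    Illustrative model only: each subdirectory of file_root is one stream,
    frames are ordered by file name, and windows never cross streams.
    """
    sequences = []
    for stream in sorted(os.listdir(file_root)):
        stream_dir = os.path.join(file_root, stream)
        if not os.path.isdir(stream_dir):
            continue
        frames = sorted(os.listdir(stream_dir))
        # Number of consecutive frames covered by one window.
        span = (sequence_length - 1) * stride + 1
        for start in range(0, len(frames) - span + 1, step):
            window = frames[start : start + span : stride]
            sequences.append([os.path.join(stream_dir, f) for f in window])
    return sequences


# Recreate the layout written earlier, using empty placeholder files:
# 10 stream directories, each holding frames 0.png .. 9.png.
with tempfile.TemporaryDirectory() as root:
    for i in range(10):
        os.makedirs(os.path.join(root, str(i)))
        for j in range(10):
            open(os.path.join(root, str(i), f"{j}.png"), "w").close()
    seqs = list_sequences(root, sequence_length=10)
    print(len(seqs), len(seqs[0]))  # 10 10
```

Under this model, each directory with exactly 10 frames yields a single 10-frame sequence, which matches the 10 sequences generated in this example.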

We also define a function that visualizes the results with matplotlib.

[11]:
%matplotlib inline
from matplotlib import pyplot as plt
import matplotlib.gridspec as gridspec


def show_sequence(sequence):
    columns = 5
    # Round up so that every frame gets a grid cell.
    rows = (len(sequence) + columns - 1) // columns
    fig = plt.figure(figsize=(32, (16 // columns) * rows))
    gs = gridspec.GridSpec(rows, columns)
    # Plot only as many cells as there are frames.
    for j in range(len(sequence)):
        plt.subplot(gs[j])
        plt.axis("off")
        plt.imshow(sequence[j])

Finally, we can build and run the pipeline and display some of the generated frame sequences.

[12]:
pipe = frame_seq_pipe(
    path=data_dir, batch_size=batch_size, num_threads=1, device_id=0
)
pipe.build()

for i in range(n_iter):
    pipe_out = pipe.run()
    sequences_out = np.array(pipe_out[0][0])
    print(f"Iteration {i} shape: {sequences_out.shape}")
    show_sequence(sequences_out)
Iteration 0 shape: (10, 720, 1280, 3)
Iteration 1 shape: (10, 720, 1280, 3)
Iteration 2 shape: (10, 720, 1280, 3)
Iteration 3 shape: (10, 720, 1280, 3)
Iteration 4 shape: (10, 720, 1280, 3)
Iteration 5 shape: (10, 720, 1280, 3)
Iteration 6 shape: (10, 720, 1280, 3)
Iteration 7 shape: (10, 720, 1280, 3)
Iteration 8 shape: (10, 720, 1280, 3)
Iteration 9 shape: (10, 720, 1280, 3)
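
The shapes printed above can be read as (frames, height, width, channels). As a small worked example, the snippet below estimates the memory footprint of one such sequence. It assumes 8-bit unsigned pixel values, which is an assumption about the output type rather than something the printed output shows.

```python
# Shape printed above, read as (frames, height, width, channels).
shape = (10, 720, 1280, 3)
bytes_per_element = 1  # assumption: 8-bit unsigned pixel values

# Multiply all dimensions together to get the element count, then bytes.
n_bytes = bytes_per_element
for dim in shape:
    n_bytes *= dim

print(n_bytes)                     # 27648000
print(round(n_bytes / 2**20, 1))   # 26.4 (MiB)
```

At roughly 26 MiB per sequence under this assumption, the batch size and prefetch depth directly affect how much memory the pipeline needs.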