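Application structure - NVIDIA Docs

Holoscan applications are built by specifying a sequence of operators. Connecting the output of one operator to the input of another (via the add_flow API) configures Holoscan's pipeline and determines when each operator can run.

Holoscan sensor bridge builds on this framework by providing operators and objects that send and receive data within Holoscan applications. Additional operators convert application-specific data (for example, CSI-2 formatted video data) into formats accepted by other standard Holoscan operators. To show how a sensor bridge application works, we'll walk through the IMX274 player example.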

imx274_player

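The application in examples/imx274_player.py configures the following pipeline. When a pass through the pipeline finishes, execution restarts at the top, where new data is acquired and processed.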

%%{init: {"theme": "base", "themeVariables": { }} }%%
graph
    r[RoceReceiverOp] --> c[CsiToBayerOp]
    c --> i[ImageProcessorOp]
    i --> d[BayerDemosaicOp]
    d --> v[HolovizOp]

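Figure 1 IMX274 player

RoceReceiverOp wakes up when an end-of-frame UDP message is received. When it completes, the received frame data is available in GPU memory, along with metadata published to the application layer. Holoscan sensor bridge uses RoCE v2 over UDP to carry data plane traffic, which is why the receiver is called RoceReceiverOp.
CsiToBayerOp knows that the received data is a CSI-2 RAW10 image and converts it to a Bayer video frame. Each pixel color component in this image is decoded and stored as a uint16 value. For more information about RAW10, see the MIPI CSI-2 specification.
ImageProcessorOp adjusts the color and brightness of the received Bayer image to make it suitable for display.
BayerDemosaicOp converts the Bayer image data to RGBA.
HolovizOp displays the RGBA image in a GUI.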
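
To make the RAW10 decoding step concrete, here is a minimal CPU-side sketch of unpacking CSI-2 RAW10 pixel data into 16-bit values. It is an illustration only: the real CsiToBayerOp runs on the GPU, and the function name `unpack_raw10` is hypothetical. The sketch assumes the standard RAW10 packing (four pixels in five bytes, with the fifth byte carrying the two low bits of each pixel) and right-aligns each 10-bit value; how the actual operator aligns values within the uint16 is not stated here.

```python
def unpack_raw10(packed: bytes) -> list[int]:
    """Unpack CSI-2 RAW10 pixel data: every 5 bytes carry 4 pixels."""
    if len(packed) % 5 != 0:
        raise ValueError("RAW10 pixel data must be a multiple of 5 bytes")
    pixels = []
    for i in range(0, len(packed), 5):
        lsbs = packed[i + 4]  # two low bits for each of the 4 pixels
        for n in range(4):
            msb8 = packed[i + n]            # bits 9..2 of pixel n
            low2 = (lsbs >> (2 * n)) & 0x3  # bits 1..0 of pixel n
            pixels.append((msb8 << 2) | low2)
    return pixels


# Four pixels with the values 0, 1, 512 and 1023.
packed = bytes([0x00, 0x00, 0x80, 0xFF, 0b11_00_01_00])
print(unpack_raw10(packed))  # [0, 1, 512, 1023]
```

Each decoded value fits in 10 bits, so storing it in a uint16 leaves six spare bits per pixel component.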

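At each step in the pipeline, the image data is stored in a buffer in GPU memory. Pointers to that data are passed between the elements of the pipeline, avoiding expensive memory copies between host and GPU memory. GPU acceleration is used to perform each operator's function, which results in very low latency operation.

The Python imx274_player.py and C++ imx274_player.cpp files initialize the sensor bridge device, camera, and pipeline in this way. Some details are skipped for readability, so be sure to check the actual example code for more details.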

Python

from cuda import cuda

import hololink as hololink_module

def main():
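    # Get handles to GPU. cuda-python driver calls return a tuple whose first
    # element is a CUresult status; error checking is omitted here.
    (cu_result,) = cuda.cuInit(0)
    cu_device_ordinal = 0
    cu_result, cu_device = cuda.cuDeviceGet(cu_device_ordinal)
    cu_result, cu_context = cuda.cuDevicePrimaryCtxRetain(cu_device)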

    # Look for sensor bridge enumeration messages; return only the one we're looking for
    channel_metadata = hololink_module.Enumerator.find_channel(channel_ip="192.168.0.2")
    # Use that enumeration data to instantiate a data receiver object
    hololink_channel = hololink_module.DataChannel(channel_metadata)

    # Now that we can communicate, create the camera controller
    camera = hololink_module.sensors.imx274.dual_imx274.Imx274Cam(hololink_channel, ...)

    # Set up our Holoscan pipeline
    application = HoloscanApplication(cu_context, cu_device_ordinal, camera, hololink_channel, ...)
    application.config(...)

    # Connect and initialize the sensor bridge device
    hololink = hololink_channel.hololink()
    hololink.start()  # Establish a connection to the sensor bridge device
    hololink.reset()  # Drive the sensor bridge to a known state

    # Configure the camera for 4k at 60 frames per second
    camera_mode = imx274_mode.Imx274_Mode.IMX274_MODE_3840X2160_60FPS
    camera.setup_clock()
    camera.configure(camera_mode)

    # Run our Holoscan pipeline
    application.run()  # we don't usually return from this call.
    hololink.stop()

C++

#include <hololink/data_channel.hpp>
#include <hololink/enumerator.hpp>
#include <hololink/hololink.hpp>

int main(int argc, char** argv)
{
  // Get handles to GPU
  cuInit(0);
  int cu_device_ordinal = 0;
  CUdevice cu_device;
  cuDeviceGet(&cu_device, cu_device_ordinal);
  CUcontext cu_context;
  cuCtxCreate(&cu_context, 0, cu_device);

  // Look for sensor bridge enumeration messages; return only the one we're looking for
  hololink::Metadata channel_metadata = hololink::Enumerator::find_channel(hololink_ip);
  // Use that enumeration data to instantiate a data receiver object
  hololink::DataChannel hololink_channel(channel_metadata);

  // Import the IMX274 sensor module and the IMX274 mode
  py::module_ imx274 = py::module_::import("hololink.sensors.imx274");
  py::object Imx274Cam = imx274.attr("dual_imx274").attr("Imx274Cam");

  // Now that we can communicate, create the camera controller
  py::object camera = Imx274Cam("hololink_channel"_a = hololink_channel, ...);

  // Set up our Holoscan pipeline
  auto application = holoscan::make_application<HoloscanApplication>(...);
  application->config(...);

  // Connect and initialize the sensor bridge device
  std::shared_ptr<hololink::Hololink> hololink = hololink_channel.hololink();
  hololink->start(); // Establish a connection to the sensor bridge device
  hololink->reset(); // Drive the sensor bridge to a known state

  // Configure the camera for 4k at 60 frames per second
  camera.attr("setup_clock")();
  camera.attr("configure")(Imx274_Mode(0));

  // Run our Holoscan pipeline
  application->run(); // we don't usually return from this call.
  hololink->stop();
}

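Important details

Enumerator.find_channel blocks the caller until an enumeration message matching the given criteria is found. If no matching device is found, this method times out (20 seconds by default) and raises an exception. Holoscan sensor bridge enumeration messages are sent once per second.
Holoscan sensor bridge devices transmit enumeration messages for each data plane controller, which currently corresponds directly to each sensor bridge Ethernet interface. If both interfaces on a device are connected to a host, the host receives a pair of distinct enumeration messages from the same sensor bridge device, one for each data port.
Enumeration messages are sent to the local broadcast address, and routers are not allowed to forward these local broadcast messages to other networks. There must be a local connection between the host and the sensor bridge device for it to be enumerated.
Enumerator.find_channel returns a dictionary of name/value pairs with identifying information about the data port being discovered, including the MAC ID, the IP address, the versions of all programmable components within the device, the device serial number, and which specific instance this data port controller is within the device. While the IP address may change, the MAC ID, serial number, and data plane controller instance are constant. The host does not need to request any of the data in this dictionary; all of it is broadcast by the sensor bridge device.
DataChannel is the local controller for a data plane on the sensor bridge device. It contains the APIs for configuring the target addresses of packets transmitted on that data plane; these are used by the receiver operator described below.
In this example, the camera object provides most of the APIs that the application layer accesses. When the application configures the camera, the camera object knows how to work with the various sensor bridge controller objects to configure the DataChannel properly.
A single Hololink sensor bridge device usually has multiple DataChannel instances, and many APIs on the Hololink device affect all DataChannel objects on the same device. In this example, calling hololink.reset resets all data channels on this device, and in the stereo IMX274 configuration, calling camera.setup_clock sets the clock that is shared between both cameras. For this reason, the application must call camera.setup_clock with care: resetting the clock (for example, for the second image sensor) while the first camera is running can lead to an undefined state.

When application.run is called, Holoscan calls the application's compose method, which includes the following: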

Python

class HoloscanApplication(holoscan.core.Application):
    def __init__(self, ..., camera, hololink_channel, ...):
        ...
        self._camera = camera
        self._hololink_channel = hololink_channel
        ...
    def compose(self):
        ...
        # Create the CSI to bayer converter.
        csi_to_bayer_operator = hololink_module.operators.CsiToBayerOp(...)

        # The call to camera.configure(...) earlier set our image dimensions
        # and bytes per pixel. This call asks the camera to configure the
        # converter accordingly.
        self._camera.configure_converter(csi_to_bayer_operator)

        # csi_to_bayer_operator now knows the image dimensions and bytes per pixel,
        # and can compute the overall size of the received image data.
        frame_size = csi_to_bayer_operator.get_csi_length()

        # Create a receiver object that fills out our frame buffer. The receiver
        # operator knows how to configure hololink_channel to send its data
        # to us and to provide an end-of-frame indication at the right time.
        receiver_operator = hololink_module.operators.RoceReceiverOp(
            self._hololink_channel,
            frame_size, ...)
        ...
        # Use add_flow to connect the operators together:
        ...
        # receiver_operator.compute() will be followed by csi_to_bayer_operator.compute()
        self.add_flow(receiver_operator, csi_to_bayer_operator, {("output", "input")})
        ...

C++

class HoloscanApplication : public holoscan::Application {
public:
    explicit HoloscanApplication(..., py::object camera, hololink::DataChannel& hololink_channel, ...)
        : ...
        , camera_(camera)
        , hololink_channel_(hololink_channel)
        ...
    {
    }

    void compose() override
    {
        ...
        // Create the CSI to bayer converter.
        auto csi_to_bayer_operator = make_operator<hololink::operators::CsiToBayerOp>(...);

        // The call to camera.attr("configure")(...) earlier set our image dimensions
        // and bytes per pixel. This call asks the camera to configure the
        // converter accordingly.
        camera_.attr("configure_converter")(csi_to_bayer_operator);

        // csi_to_bayer_operator now knows the image dimensions and bytes per pixel,
        // and can compute the overall size of the received image data.
        const size_t frame_size = csi_to_bayer_operator->get_csi_length();

        // Create a receiver object that fills out our frame buffer. The receiver
        // operator knows how to configure hololink_channel to send its data
        // to us and to provide an end-of-frame indication at the right time.
        auto receiver_operator = make_operator<hololink::operators::RoceReceiverOp>(
            holoscan::Arg("hololink_channel", &hololink_channel_),
            holoscan::Arg("frame_size", frame_size), ...);
        ...
        // Use add_flow to connect the operators together:
        ...
        // receiver_operator.compute() will be followed by csi_to_bayer_operator.compute()
        add_flow(receiver_operator, csi_to_bayer_operator, { { "output", "input" } });
        ...
    }

private:
    const py::object camera_;
    hololink::DataChannel& hololink_channel_;
};

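Some key points

receiver_operator has no idea that it is handling video data. It is only told which memory region to fill and the size of a block of data. When a complete block of data has been received, the CPU is notified so that pipeline processing can continue.
Given the expected frame size, the receiver buffer allocates a region of GPU memory large enough for the received data plus additional metadata; the memory is allocated in a way that meets the requirements of the hardware and of subsequent operators.
csi_to_bayer_operator knows the memory layout of CSI-2 formatted image data. Our call to camera.configure_converter lets the camera communicate the image dimensions and pixel depth; with that knowledge, the call to csi_to_bayer_operator.get_csi_length can return the size of the memory block needed to hold these images. This size includes not only the image data itself but also CSI-2 metadata and GPU memory alignment requirements. Because CsiToBayerOp is a GPU-accelerated function, it may have special memory requirements that the camera sensor object does not know about.
receiver_operator coordinates with hololink_channel to configure the sensor bridge data plane. Configuration automatically takes care of setting up the sensor bridge device with our host Ethernet and IP addresses, the destination memory address, security keys, and frame size information.
Once the hololink_channel object is configured, the sensor bridge device begins forwarding all received sensor data to the configured receiver. We have not yet instructed the camera to start streaming data, but at this point we are ready to receive it.
receiver_operator keeps track of a device parameter, which in this application is our camera. When receiver_operator.start is called, it calls device.start, which in our IMX274 implementation instructs the camera to begin streaming data.

In this example, receiver_operator is a RoceReceiverOp instance, which takes advantage of the RDMA acceleration present in ConnectX firmware. With RoceReceiverOp, the CPU only sees an interrupt when the last packet of a frame is received; all frame data sent before that is written to GPU memory in the background. On systems without ConnectX devices, LinuxReceiverOperator provides the same functionality but uses the host CPU and the Linux kernel to receive the ingress UDP requests, and the CPU writes the payload data to GPU memory. This provides the same functionality as RoceReceiverOp but at much lower performance.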

tao_peoplenet

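examples/tao_peoplenet.py contains a demonstration application in which inference is used to generate a video overlay. Tao PeopleNet is used to determine the locations of persons, bags, and faces in the live video stream. This example program draws bounding boxes on an overlay showing where those objects were detected in the video frame.

Pipeline structure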

%%{init: {"theme": "base", "themeVariables": { }} }%%
graph
    r[RoceReceiverOp] --> c[CsiToBayerOp]
    c --> i[ImageProcessorOp]
    i --> d[BayerDemosaicOp]
    d --> s[ImageShiftToUint8Operator]
    s --> p[FormatConverterOp]
    p --> fi[FormatInferenceInputOp]
    fi --> in[InferenceOp]
    in --> pf[PostprocessorOp]
    s -- live video --> v[HolovizOp]
    pf -- overlay --> v

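Figure 2 IMX274 player with inference

Adding inference to the video pipeline is easy: just add the appropriate operators and data flows. In our case, we use the video mixer built into HolovizOp to display the overlay generated by inference. RoceReceiverOp is specified to always provide the most recently received video frame, so if a pass through the pipeline takes more than one frame time to complete, the next iteration of the loop always works on the most recently received video frame.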
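
The "most recently received frame" behavior can be pictured as a single-slot mailbox in which a new frame overwrites any frame that has not been consumed yet. The sketch below is a toy model of that idea in plain Python; the class name `LatestFrameSlot` and its methods are hypothetical and do not describe how RoceReceiverOp is implemented.

```python
import threading


class LatestFrameSlot:
    """Hold only the most recently published frame; older ones are overwritten."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self._dropped = 0

    def publish(self, frame):
        """Called by the receiving side whenever a complete frame arrives."""
        with self._lock:
            if self._frame is not None:
                self._dropped += 1  # the previous frame was never consumed
            self._frame = frame

    def take(self):
        """Called by the pipeline; returns the newest frame or None."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame

    @property
    def dropped(self):
        with self._lock:
            return self._dropped


slot = LatestFrameSlot()
for frame_number in (1, 2, 3):  # three frames arrive during one slow iteration
    slot.publish(frame_number)
latest = slot.take()
print(latest, slot.dropped)  # 3 2
```

With this policy a slow inference stage lowers the processed frame rate but does not let a backlog of stale frames build up.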

body_pose_estimation

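The body pose estimation application takes input from a live video, performs inference using the YOLOv8 pose model, and then shows the keypoints overlaid onto the original video. The keypoints are:

[nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle]

The pipeline for this application is the same as for the people detection application: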

%%{init: {"theme": "base", "themeVariables": { }} }%%
graph
    r[RoceReceiverOp] --> c[CsiToBayerOp]
    c --> i[ImageProcessorOp]
    i --> d[BayerDemosaicOp]
    d --> s[ImageShiftToUint8Operator]
    s --> p[FormatConverterOp]
    p --> fi[FormatInferenceInputOp]
    fi --> in[InferenceOp]
    in --> pf[PostprocessorOp]
    s -- live video --> v[HolovizOp]
    pf -- overlay --> v

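Figure 3 Body pose estimation

The difference is that InferenceOp uses the YOLOv8 pose model, and the postprocessor operator has logic specific to post-processing YOLOv8 output. Specifically, it takes the output of inference, filters out detections with low scores, and applies non-maximum suppression (NMS) before sending the output to HolovizOp.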
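
As an illustration of that post-processing step, here is a minimal pure-Python sketch of score filtering followed by greedy non-maximum suppression. It is not the example's postprocessor: the function names, the (x1, y1, x2, y2) box layout, and both threshold values are assumptions chosen for the illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, score_threshold=0.5, iou_threshold=0.5):
    """detections is a list of (box, score); returns the detections to keep."""
    # Drop low-score detections, then visit the rest from best to worst.
    candidates = sorted(
        (d for d in detections if d[1] >= score_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 110, 210), 0.90),    # strong detection
    ((12, 14, 112, 208), 0.80),    # near-duplicate of the first
    ((300, 50, 380, 220), 0.70),   # a second person
    ((500, 500, 520, 520), 0.20),  # low score, filtered out
]
kept = filter_and_nms(detections)
print([score for _, score in kept])  # [0.9, 0.7]
```

The near-duplicate box is suppressed because its overlap with the higher-scoring box exceeds the IoU threshold, and the low-score box never reaches the suppression stage.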

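Stereo IMX274 live video demo

Multiple receiver operators can be instantiated to support data feeds from multiple cameras. examples/stereo_imx274_player.py presents the same live video feed pipeline, except that it is instantiated twice, once for each camera on the IMX274 stereo camera board. In this case, Holoscan cycles between the pipelines, providing two separate windows on the display (one per visualizer). Each receiver_operator instance is independent and runs simultaneously.

For systems with only a single network connection, Holoscan sensor bridge can be configured to transmit data from both cameras over the same network connection. The 10Gbps network port on HSB does not have enough bandwidth to support two 4K 60FPS video streams, so support is limited to cameras in 1080p mode. For an example of how to configure HSB to work this way, see examples/single_network_stereo_imx274_player.py. As before, each receiver_operator is independent, even though they use the same network interface.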
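
A quick back-of-the-envelope calculation shows why the bandwidth runs out. The sketch assumes 10 bits per pixel (RAW10, as used by the IMX274 player) and counts only raw pixel payload, ignoring blanking, CSI-2 metadata, and network headers; the 60 fps rate used for the 1080p case is also an assumption.

```python
def stream_gbps(width, height, fps, bits_per_pixel=10):
    """Raw pixel payload rate in Gbit/s, excluding all protocol overhead."""
    return width * height * bits_per_pixel * fps / 1e9


uhd = stream_gbps(3840, 2160, 60)
fhd = stream_gbps(1920, 1080, 60)
print(f"4K60:    {uhd:.2f} Gbps per camera, {2 * uhd:.2f} Gbps for two")
print(f"1080p60: {fhd:.2f} Gbps per camera, {2 * fhd:.2f} Gbps for two")
```

Under these assumptions a single 4K stream needs about 4.98 Gbps, so two of them need about 9.95 Gbps of payload alone, which leaves essentially no room for packet overhead on a 10Gbps link. Two 1080p streams need roughly 2.49 Gbps in total.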

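GPIO example application

This application demonstrates how to use the hololink GPIO interface; it can be found under the hololink/examples folder.

The hololink GPIO interface supports 16 GPIOs, numbered 0…15. These GPIOs can be set as inputs or outputs and take one of 2 logic values:

High - 3.3V can be measured on the GPIO pin
Low - 0V can be measured on the GPIO pin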

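The following image maps the GPIO and ground pins on the hololink board:

As the image shows:

The 4 pins at the corners of the connector are ground pins (marked "G" in the image above)
The group of pins between the lower two ground pins are GPIO pins, numbered 0 to 7
The group of pins between the upper two ground pins are GPIO pins, numbered 8 to 15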

GPIO example application flow

The GPIO example is a simple application made up of 2 operators, as shown in the following diagram:

%%{init: {"theme": "base", "themeVariables": { }} }%%
graph
    w[GpioSetOp] --> r[GpioReadOp]

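Figure 4 GPIO example flow

GPIO set operator - This operator steps through the 16 GPIO pins, setting them one at a time to the value and direction defined by each of 5 different pin configurations:

ALL_OUT_L - all pins output low
ALL_OUT_H - all pins output high
ALL_IN - all pins input
ODD_OUT_H - odd pins output high, even pins input
EVEN_OUT_H - even pins output high, odd pins input

Each cycle of this operator configures one pin's direction and value and sends the number of the last changed pin, along with the currently running configuration, to the GPIO read operator.
Once all 16 pins have been set according to the current configuration, the operator moves to the next configuration on the following cycle.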

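GPIO read operator - This operator reads and displays the current value of the last configured pin. It delays for 10 seconds to allow the user to verify the pin level and direction with an external measurement device such as a multimeter or oscilloscope.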
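
The sequence the set operator walks through can be modeled with a small generator that yields one step per operator cycle. This is a plain-Python sketch of the sequencing only, without any hardware access; the names `pin_setting` and `gpio_cycles` are hypothetical, pin 0 is treated as an even pin, and the direction and value constants follow the enumerated values listed for the GPIO software interface (IN = 1, OUT = 0, HIGH = 1, LOW = 0).

```python
IN, OUT = 1, 0    # pin direction values
HIGH, LOW = 1, 0  # pin logic values
PIN_COUNT = 16
CONFIGS = ("ALL_OUT_L", "ALL_OUT_H", "ALL_IN", "ODD_OUT_H", "EVEN_OUT_H")


def pin_setting(config, pin):
    """Return (direction, value) for one pin; value is None for inputs."""
    if config == "ALL_OUT_L":
        return OUT, LOW
    if config == "ALL_OUT_H":
        return OUT, HIGH
    if config == "ALL_IN":
        return IN, None
    if config == "ODD_OUT_H":
        return (OUT, HIGH) if pin % 2 == 1 else (IN, None)
    if config == "EVEN_OUT_H":
        return (OUT, HIGH) if pin % 2 == 0 else (IN, None)
    raise ValueError(f"unknown configuration {config!r}")


def gpio_cycles():
    """Yield one (config, pin, direction, value) step per operator cycle."""
    for config in CONFIGS:
        for pin in range(PIN_COUNT):
            yield (config, pin, *pin_setting(config, pin))


steps = list(gpio_cycles())
print(len(steps))  # 80 cycles: 5 configurations x 16 pins
print(steps[49])   # ('ODD_OUT_H', 1, 0, 1)
```

Each yielded tuple carries the information the set operator is described as passing downstream: the pin that was just changed and the configuration currently in effect.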

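GPIO software interface

The GPIO interface is a class defined in the hololink module. It exports the following GPIO interfaces:

get_gpio() - get the GPIO interface instance from the hololink module
set_direction( pin, direction ) - set the pin direction to input or output
get_direction( pin ) - get the direction set for the pin
set_value( pin, value ) - for pins set as outputs, set the pin value to high or low
get_value( pin ) - for pins set as inputs, read the pin value (high or low)

Pin number - ranges from 0 to 15.
Pin direction - enumerated values: IN = 1, OUT = 0
Pin value - enumerated values: HIGH = 1, LOW = 0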

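NVIDIA ISP for live capture

Jetson platforms have a built-in ISP (Image Signal Processing) unit that processes Bayer images and outputs images in a standard color space. The ISP is operational on Jetson Orin AGX and Orin IGX in the iGPU configuration.

The example ISP application in examples/linux_hwisp_player.py configures the following pipeline. When a pass through the pipeline finishes, execution restarts at the top, where new data is acquired and processed.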

%%{init: {"theme": "base", "themeVariables": { }} }%%
graph
    r[LinuxReceiverOperator] --> c[CsiToBayerOp]
    c --> i[ArgusIspOp]
    i --> v[HolovizOp]

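Figure 5 Linux ISP player

ArgusIspOp gives the user access to the ISP through the Argus API. This operator takes an uncompressed Bayer image with uint16 per pixel (MSB aligned) and outputs an RGB888 image. It is provided as a C++ operator with Python bindings.

ArgusIspOp can be configured at the application level with the following required parameters. Below is a snippet from the existing Python-based example.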

Python

argus_isp = hololink_module.operators.ArgusIspOp(
    self,
    name="argus_isp",
    bayer_format=bayer_format.value,  # RGGB or other Bayer format
    exposure_time_ms=16.67,           # Exposure time in milliseconds. 60fps is 16.67ms
    analog_gain=10.0,                 # Minimum Analog Gain
    pixel_bit_depth=10,               # Effective bit depth of input per pixel
    pool=isp_pool,
)

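The input to ArgusIspOp is an uncompressed Bayer image with uint16 per pixel. The values in each uint16 should be MSB aligned. For example, if the camera sensor produces RAW10, the 10 bits should be MSB aligned within the 16 bits. The output currently supported by ArgusIspOp is RGB888 in the Rec. 709 standard color space with gamma correction.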
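
MSB alignment simply means shifting each sample left so that its most significant bit lands in bit 15 of the 16-bit container. A minimal sketch, with `msb_align` as a hypothetical helper name:

```python
def msb_align(value, bit_depth=10, container_bits=16):
    """Shift a right-aligned sample so its top bit sits in the container's MSB."""
    if not 0 <= value < (1 << bit_depth):
        raise ValueError(f"value does not fit in {bit_depth} bits")
    return value << (container_bits - bit_depth)


print(hex(msb_align(0x3FF)))  # 0xffc0: a full-scale 10-bit sample
print(hex(msb_align(0x001)))  # 0x40: the smallest nonzero sample
```

For a 10-bit sample the shift is 6 bits, so the six least significant bits of every uint16 are zero.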

On Jetson Orin AGX, the end-to-end display latency of the above pipeline at 1920x1080 resolution and 60 fps is 37 ms.

For more questions about ISP functionality and usage, contact NVIDIA.