Getting Started with Morpheus - NVIDIA Docs

There are four ways to get started with Morpheus:

Using pre-built Docker containers
Using the Morpheus Conda packages
Building the Morpheus Docker container
Building Morpheus from source

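The pre-built Docker containers are the easiest way to get started with the latest release of Morpheus. Released versions of the Morpheus containers can be found on NGC.

More advanced users, or those interested in the latest pre-release features, will need to build the Morpheus container or build Morpheus from source.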

Requirements

Volta architecture GPU or better
CUDA 12.5
Docker
NVIDIA Container Toolkit
NVIDIA Triton Inference Server 24.09 or higher

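Note about Docker

The Morpheus documentation and examples assume that the Docker post-installation step for managing Docker as a non-root user has been performed, so that non-root users can run Docker commands. This is not strictly necessary as long as the current user has sudo privileges to run Docker commands.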

Using Pre-Built Docker Containers

Pull the Morpheus Image

Go to https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/containers/morpheus/tags
Choose a version.

Download the selected version, for example for 24.10:

docker pull nvcr.io/nvidia/morpheus/morpheus:24.10-runtime

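Optional: Many of the examples require NVIDIA Triton Inference Server to be running with the included models. To download the Morpheus Triton Server Models container, ensure that its version number matches that of the Morpheus container downloaded in the previous step, then run:
```
docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10
```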

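Note about Morpheus versions

Morpheus uses Calendar Versioning (CalVer). For each Morpheus release there is an image tagged in the form YY.MM-runtime. This tag always points to the latest point release of that version. In addition, there is at least one point release tagged in the form vYY.MM.00-runtime, which is the initial point release of that version (for example, v24.10.00-runtime). If a major bug is found, additional point releases may be published (for example, v24.10.01-runtime, v24.10.02-runtime, and so on), and the YY.MM-runtime tag is updated to reference the newest one.

Users who want to ensure they are running with the latest bug fixes should use a release image tag (YY.MM-runtime). Users who need to deploy a specific version into production should use a point release image tag (for example, vYY.MM.00-runtime).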
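
The two tag shapes described above can be told apart mechanically, which is handy when a deployment script must refuse floating tags. The following is a minimal sketch, assuming only the two shapes named in this section; the function name `classify_tag` is hypothetical and not part of Morpheus.

```python
import re

# Release tag, e.g. "24.10-runtime": floats to the latest point release.
RELEASE_TAG = re.compile(r"(\d{2})\.(\d{2})-runtime")
# Point release tag, e.g. "v24.10.01-runtime": pinned to one specific build.
POINT_RELEASE_TAG = re.compile(r"v(\d{2})\.(\d{2})\.(\d{2})-runtime")


def classify_tag(tag):
    """Return ("release", (yy, mm)) or ("point-release", (yy, mm, nn)).

    Raises ValueError for tags that match neither shape.
    """
    match = RELEASE_TAG.fullmatch(tag)
    if match:
        return "release", tuple(int(part) for part in match.groups())
    match = POINT_RELEASE_TAG.fullmatch(tag)
    if match:
        return "point-release", tuple(int(part) for part in match.groups())
    raise ValueError(f"not a Morpheus runtime tag: {tag!r}")


if __name__ == "__main__":
    for tag in ("24.10-runtime", "v24.10.00-runtime", "v24.10.01-runtime"):
        print(tag, classify_tag(tag))
```

Because the parsed parts are integer tuples, point releases of the same version can be compared directly to find the newest one.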

Starting the Morpheus Container

Ensure that the NVIDIA Container Toolkit is installed.
Start the container downloaded in the previous section:

docker run --rm -ti --runtime=nvidia --gpus=all --net=host -v /var/run/docker.sock:/var/run/docker.sock nvcr.io/nvidia/morpheus/morpheus:24.10-runtime bash

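Notes on some of the flags above:

Flag	Description
`--runtime=nvidia`	Selects the NVIDIA Docker runtime, which enables access to the GPUs inside the container. This flag is not needed if the `nvidia` runtime is already set as the default runtime for Docker.
`--gpus=all`	Specifies which GPUs the container has access to. Alternatively, a specific GPU can be selected with `--gpus=<gpu-id>`.
`--net=host`	Most Morpheus pipelines use NVIDIA Triton Inference Server, which runs in another container. For simplicity, the container is given access to the host system's network; production deployments may opt for an explicit network configuration.
`-v /var/run/docker.sock:/var/run/docker.sock`	Enables access to the Docker socket file from within the running container, which allows other Docker containers to be launched from within the Morpheus container. This flag is required for launching Triton with access to the included Morpheus models. Users with their own models can omit it.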
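
To make the optional nature of these flags concrete, the sketch below assembles the same `docker run` argument list and lets each flag from the table be switched off. It is an illustration only: the function `morpheus_run_command` and its parameter names are hypothetical, and it builds the command without executing it.

```python
import shlex


def morpheus_run_command(image, use_nvidia_runtime=True, gpus="all",
                         host_network=True, mount_docker_socket=True):
    """Build the `docker run` argument list using the flags from the table."""
    argv = ["docker", "run", "--rm", "-ti"]
    if use_nvidia_runtime:
        # Can be dropped when `nvidia` is already Docker's default runtime.
        argv.append("--runtime=nvidia")
    argv.append(f"--gpus={gpus}")
    if host_network:
        # Simplest way to reach a Triton container on the same host.
        argv.append("--net=host")
    if mount_docker_socket:
        # Only needed to launch other containers from inside this one.
        argv += ["-v", "/var/run/docker.sock:/var/run/docker.sock"]
    argv += [image, "bash"]
    return argv


if __name__ == "__main__":
    image = "nvcr.io/nvidia/morpheus/morpheus:24.10-runtime"
    print(shlex.join(morpheus_run_command(image)))
    # A user with their own models might omit the Docker socket mount:
    print(shlex.join(morpheus_run_command(image, mount_docker_socket=False)))
```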

Once launched, users who want to launch Triton using the included Morpheus models will need to install the Docker tools in the Morpheus container by running:

./external/utilities/docker/install_docker.sh

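Skip ahead to the Acquiring the Morpheus Models Container section.

Using Morpheus Conda Packages

The Morpheus stages are available as libraries hosted on the nvidia Conda channel. The Morpheus Conda packages are morpheus-core, morpheus-dfp, and morpheus-llm.

For details on these libraries and how to use them, refer to the Morpheus Conda Packages guide.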

Building the Morpheus Container

Clone the Repository

MORPHEUS_ROOT=$(pwd)/morpheus
git clone https://github.com/nv-morpheus/Morpheus.git $MORPHEUS_ROOT
cd $MORPHEUS_ROOT

Git LFS

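The large model and data files in this repository are stored using Git Large File Storage (LFS). By default, only the files strictly needed to run Morpheus are downloaded when the repository is cloned.

The scripts/fetch_data.py script can be used to fetch the Morpheus pre-trained models and the other files required for running the training/validation scripts and example pipelines.

If any data-related issues occur when running a pipeline, rerun the script outside the container.

Usage of the script is as follows: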

scripts/fetch_data.py fetch <dataset> [<dataset>...]

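At the time of writing, the defined datasets are:

all - metaset that includes all other datasets
datasets - input files needed for many of the examples
docs - graphics needed for the documentation
examples - data needed by the scripts in the examples directory
models - Morpheus models (the largest dataset)
tests - data used by the unit tests
validation - subset of the models dataset needed by some unit tests

To download only the examples and models: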

scripts/fetch_data.py fetch examples models

To download the data needed for the unit tests:

scripts/fetch_data.py fetch tests validation
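
When the fetch step is driven from automation, it can help to validate dataset names before invoking the script. The sketch below is one way to do that, assuming the dataset names listed in this guide are current; the helper `fetch_command` is hypothetical, and scripts/fetch_data.py itself remains the authority on which names are valid.

```python
import shlex

# Dataset names as listed in this guide at the time of writing.
KNOWN_DATASETS = ("all", "datasets", "docs", "examples",
                  "models", "tests", "validation")


def fetch_command(*datasets):
    """Return the argument list for fetching one or more named datasets."""
    if not datasets:
        raise ValueError("at least one dataset name is required")
    unknown = [name for name in datasets if name not in KNOWN_DATASETS]
    if unknown:
        raise ValueError(f"unknown dataset(s): {', '.join(unknown)}")
    return ["scripts/fetch_data.py", "fetch", *datasets]


if __name__ == "__main__":
    # Prints the same command shown above for the unit-test data.
    print(shlex.join(fetch_command("tests", "validation")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, outside the container.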

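If Git LFS was not installed before the repository was cloned, the scripts/fetch_data.py script will fail. If this happens, install Git LFS by following the Git LFS installation instructions, then run the following command: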

git lfs install

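Build the Container

To assist in building the Morpheus container, several scripts are provided in the ./docker directory. To build the "release" container, run the following: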

./docker/build_container_release.sh

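By default, this creates an image named nvcr.io/nvidia/morpheus/morpheus:${MORPHEUS_VERSION}-runtime, where $MORPHEUS_VERSION is replaced by the output of git describe --tags --abbrev=0. A different Docker image name and tag can be specified by passing the DOCKER_IMAGE_NAME and DOCKER_IMAGE_TAG environment variables to the script, respectively.

To run the built "release" container, use the following: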

./docker/run_container_release.sh

The ./docker/run_container_release.sh script accepts the same DOCKER_IMAGE_NAME and DOCKER_IMAGE_TAG environment variables as the ./docker/build_container_release.sh script. For example, to run version v24.10.00, use the following:

DOCKER_IMAGE_TAG="v24.10.00-runtime" ./docker/run_container_release.sh
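
The defaulting behavior described in this section can be summarized as a small function. This is a sketch of the documented behavior only, not the scripts' actual code: `resolve_image` is a hypothetical name, and `git_version` stands in for the output of git describe --tags --abbrev=0.

```python
DEFAULT_IMAGE_NAME = "nvcr.io/nvidia/morpheus/morpheus"


def resolve_image(env, git_version):
    """Return the image reference the release scripts are described as using.

    env         -- mapping of environment variables (for example os.environ)
    git_version -- assumed to be the output of `git describe --tags --abbrev=0`
    """
    name = env.get("DOCKER_IMAGE_NAME", DEFAULT_IMAGE_NAME)
    # Without an override, the tag is the repository version plus "-runtime".
    tag = env.get("DOCKER_IMAGE_TAG", f"{git_version}-runtime")
    return f"{name}:{tag}"


if __name__ == "__main__":
    print(resolve_image({}, "v24.10.00"))
    print(resolve_image({"DOCKER_IMAGE_TAG": "v24.10.01-runtime"}, "v24.10.00"))
```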

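Acquiring the Morpheus Models Container

Many of the validation tests and example workflows require a Triton server to function. For simplicity, Morpheus provides a pre-built models container that contains both Triton and the Morpheus models. Users running a release version of Morpheus can download the corresponding Triton models container from NGC with the following command: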

docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10

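Users working with an unreleased development version of Morpheus can build the Triton models container from the Morpheus repository. To build the Triton models container, run the following command from the root of the Morpheus repository: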

models/docker/build_container.sh

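Starting the Triton Server

In a new terminal, use the following command to launch a Docker container for Triton that loads all of the included pre-trained models: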

docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
  nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10 \
  tritonserver --model-repository=/models/triton-model-repo \
    --exit-on-error=false \
    --log-info=true \
    --strict-readiness=false \
    --disable-auto-complete-config

This launches Triton using the default network ports (8000 for HTTP, 8001 for gRPC, and 8002 for metrics) and loads all of the example models in the Morpheus repository.
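
Before starting a pipeline, it can be useful to confirm that Triton is answering on its HTTP port. The sketch below polls the standard readiness route of the KServe v2 HTTP protocol that Triton implements (`/v2/health/ready`); the helper names `ready_url` and `is_ready` are hypothetical, and the host and port defaults assume the command above was run on the local machine.

```python
import urllib.request


def ready_url(host="localhost", http_port=8000):
    """Return the URL of Triton's HTTP readiness endpoint."""
    return f"http://{host}:{http_port}/v2/health/ready"


def is_ready(host="localhost", http_port=8000, timeout=2.0):
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(host, http_port),
                                    timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and non-2xx HTTP responses.
        return False


if __name__ == "__main__":
    print("Triton ready:", is_ready())
```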

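Note: The above command is useful for testing Morpheus; however, it loads several models into GPU memory, which at the time of writing consumes roughly 2 GB of GPU memory. Production users should consider loading only the specific models they plan to use, with the --model-control-mode=explicit and --load-model flags. For example, to launch Triton loading only the abp-nvsmi-xgb model: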

docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
  nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10  \
  tritonserver --model-repository=/models/triton-model-repo \
    --exit-on-error=false \
    --log-info=true \
    --strict-readiness=false \
    --disable-auto-complete-config \
    --model-control-mode=explicit \
    --load-model abp-nvsmi-xgb
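
The difference between the two launch commands is only in the trailing tritonserver arguments. As a rough sketch, the helper below (hypothetical name `tritonserver_args`) produces either form: with no model list it loads everything in the repository, and with a list it switches to explicit model control and repeats --load-model once per model, which is how Triton accepts several models.

```python
import shlex


def tritonserver_args(model_repository, models=None):
    """Build tritonserver arguments, optionally restricted to named models."""
    argv = [
        "tritonserver",
        f"--model-repository={model_repository}",
        "--exit-on-error=false",
        "--log-info=true",
        "--strict-readiness=false",
        "--disable-auto-complete-config",
    ]
    if models:
        # Explicit mode loads nothing by default; each model is named in turn.
        argv.append("--model-control-mode=explicit")
        for name in models:
            argv += ["--load-model", name]
    return argv


if __name__ == "__main__":
    repo = "/models/triton-model-repo"
    print(shlex.join(tritonserver_args(repo)))
    print(shlex.join(tritonserver_args(repo, ["abp-nvsmi-xgb"])))
```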

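Alternatively, for users who have checked out the Morpheus git repository, the Triton server container can be launched directly, mounting the models from the repository. This method is most useful for users training their own models. From the root of the Morpheus repository, use the following command to launch a Docker container for Triton with the repository's models mounted; as written, the command loads only the abp-nvsmi-xgb model: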

docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
  -v $PWD/models:/models \
  nvcr.io/nvidia/tritonserver:24.09-py3 \
  tritonserver --model-repository=/models/triton-model-repo \
    --exit-on-error=false \
    --log-info=true \
    --strict-readiness=false \
    --disable-auto-complete-config \
    --model-control-mode=explicit \
    --load-model abp-nvsmi-xgb

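Running Morpheus

To run Morpheus, users need to choose between the Morpheus command line interface (CLI) and the Python interface. Which interface to use depends on the user's needs, the amount of customization required, and the operating environment. More information on each interface is provided below.

For full example pipelines using both the Python API and the command line interface, refer to the Morpheus Examples.

Morpheus Python Interface

The Morpheus Python interface allows users to configure their pipelines using a Python script file. This is ideal for users who work in a Jupyter Notebook and for users who need complex initialization logic. Documentation on using the Morpheus Python and C++ APIs can be found in the Morpheus Developer Guide.

Morpheus Command Line Interface (CLI)

The CLI allows users to completely configure a Morpheus pipeline directly from a terminal. This is ideal for users configuring a pipeline in Kubernetes. The Morpheus CLI is invoked with the morpheus command and can run linear pipelines as well as additional tools. Instructions for using the CLI can be queried directly in the terminal with morpheus --help: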

$ morpheus --help
Usage: morpheus [OPTIONS] COMMAND [ARGS]...

Options:
  --debug / --no-debug            [default: no-debug]
  --log_level [CRITICAL|FATAL|ERROR|WARN|WARNING|INFO|DEBUG]
                                  Specify the logging level to use.  [default:
                                  WARNING]
  --log_config_file FILE          Config file to use to configure logging. Use
                                  only for advanced situations. Can accept
                                  both JSON and ini style configurations
  --plugin TEXT                   Adds a Morpheus CLI plugin. Can either be a
                                  module name or path to a python module
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Commands:
  run    Run one of the available pipelines
  tools  Run a utility tool

Each command in the CLI has its own help information. Use morpheus [command] [...sub-command] --help to get instructions for each command and sub-command. For example:

$ morpheus run pipeline-nlp inf-triton --help
Configuring Pipeline via CLI
Usage: morpheus run pipeline-nlp inf-triton [OPTIONS]

Options:
  --model_name TEXT               Model name in Triton to send messages to
                                  [required]
  --server_url TEXT               Triton server URL (IP:Port)  [required]
  --force_convert_inputs BOOLEAN  Instructs this stage to forcibly convert all
                                  input types to match what Triton is
                                  expecting. Even if this is set to `False`,
                                  automatic conversion will be done only if
                                  there would be no data loss (i.e. int32 ->
                                  int64).  [default: False]
  --use_shared_memory BOOLEAN     Whether or not to use CUDA Shared IPC Memory
                                  for transferring data to Triton. Using CUDA
                                  IPC reduces network transfer time but
                                  requires that Morpheus and Triton are
                                  located on the same machine  [default:
                                  False]
  --help                          Show this message and exit.  [default:
                                  False]

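Several examples of using the Morpheus CLI can be found in the Basic Usage guide and in the other Morpheus Examples.

CLI Stage Configuration

When configuring a pipeline via the CLI, start with a morpheus run pipeline-* command (pipeline-nlp in the examples here) and then list the stages in order from start to finish. The order in which the stage commands are placed is the order in which data flows through the pipeline, and the output of each stage is linked to the input of the next. For example, to build a simple pipeline that reads from Kafka, deserializes the messages, serializes them, and then writes to a file, use the following: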

morpheus --log_level=INFO run pipeline-nlp from-kafka --bootstrap_servers localhost:9092 --input_topic test_pcap deserialize serialize to-file --filename .tmp/temp_out.json --overwrite
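
Since a CLI pipeline is just an ordered list of stages with their options, the command can also be generated from data. The following sketch shows one way to do that; `build_pipeline_command` and its argument layout are hypothetical conveniences, and only the stage names and options already used in the command above appear in it.

```python
import shlex


def build_pipeline_command(pipeline, stages, log_level=None):
    """Assemble a morpheus CLI command from an ordered list of stages.

    stages -- list of (stage_name, options), where options is a list of
              (flag, value) pairs and a value of None marks a bare flag.
    """
    argv = ["morpheus"]
    if log_level is not None:
        argv.append(f"--log_level={log_level}")
    argv += ["run", pipeline]
    # Stage order on the command line is the order data flows through.
    for name, options in stages:
        argv.append(name)
        for flag, value in options:
            argv.append(f"--{flag}")
            if value is not None:
                argv.append(str(value))
    return argv


STAGES = [
    ("from-kafka", [("bootstrap_servers", "localhost:9092"),
                    ("input_topic", "test_pcap")]),
    ("deserialize", []),
    ("serialize", []),
    ("to-file", [("filename", ".tmp/temp_out.json"), ("overwrite", None)]),
]

if __name__ == "__main__":
    print(shlex.join(build_pipeline_command("pipeline-nlp", STAGES, "INFO")))
```

Keeping the stages in a list makes it straightforward to insert or remove a stage programmatically while preserving the data-flow order.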

The output should contain lines similar to:

====Building Segment: linear_segment_0====
Added source: <from-kafka-0; KafkaSourceStage(bootstrap_servers=localhost:9092, input_topic=('test_pcap',), group_id=morpheus, client_id=None, poll_interval=10millis, disable_commit=False, disable_pre_filtering=False, auto_offset_reset=AutoOffsetReset.LATEST, stop_after=0, async_commits=True)>
  └─> morpheus.MessageMeta
Added stage: <deserialize-1; DeserializeStage(ensure_sliceable_index=True)>
  └─ morpheus.MessageMeta -> morpheus.ControlMessage
Added stage: <serialize-2; SerializeStage(include=(), exclude=('^ID$', '^_ts_'), fixed_columns=True)>
  └─ morpheus.ControlMessage -> morpheus.MessageMeta
Added stage: <to-file-3; WriteToFileStage(filename=.tmp/temp_out.json, overwrite=True, file_type=FileTypes.Auto, include_index_col=True, flush=False)>
  └─ morpheus.MessageMeta -> morpheus.MessageMeta
====Building Segment Complete!====
====Pipeline Started====

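This is important because, when the log level is set to INFO or a more verbose level such as DEBUG, the output shows the order of the stages and the output type of each one. Because some stages cannot accept all types of inputs, Morpheus reports an error if the pipeline has been configured incorrectly. For example, if we run the same command as above but forget the serialize stage, Morpheus should output an error similar to: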

$ morpheus run pipeline-nlp from-kafka --bootstrap_servers localhost:9092 --input_topic test_pcap deserialize to-file --filename .tmp/temp_out.json --overwrite
Configuring Pipeline via CLI
Starting pipeline via CLI... Ctrl+C to Quit
E20221214 14:53:17.425515 452045 controller.cpp:62] exception caught while performing update - this is fatal - issuing kill
E20221214 14:53:17.425714 452045 context.cpp:125] rank: 0; size: 1; tid: 140065439217216; fid: 0x7f6144041000: set_exception issued; issuing kill to current runnable. Exception msg: RuntimeError: The to-file stage cannot handle input of <class 'morpheus.messages.ControlMessage'>. Accepted input types: (<class 'morpheus.messages.message_meta.MessageMeta'>,)

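This indicates that the to-file stage cannot accept the input type morpheus.messages.ControlMessage. This is because the to-file stage has no logic for writing that class to a file; it can only write messages of type morpheus.messages.message_meta.MessageMeta. To ensure the pipeline is valid, check the Accepted input types: (<class 'morpheus.messages.message_meta.MessageMeta'>,) portion of the error message. It shows that a stage is needed to convert the output type of the deserialize stage, morpheus.messages.ControlMessage, into morpheus.messages.message_meta.MessageMeta, which is exactly what the serialize stage does.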
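
The type check behind this error can be pictured as walking the stage list and comparing each stage's accepted input types with the previous stage's output type. The sketch below is a toy model of that idea, not the Morpheus implementation: the `STAGE_TYPES` table is copied by hand from the log output in this section, and `check_pipeline` is a hypothetical name.

```python
# (accepted input types, output type) per stage, taken from the log output
# shown in this section. A source stage has no input, marked here with None.
STAGE_TYPES = {
    "from-kafka": (None, "MessageMeta"),
    "deserialize": (("MessageMeta",), "ControlMessage"),
    "serialize": (("ControlMessage",), "MessageMeta"),
    "to-file": (("MessageMeta",), "MessageMeta"),
}


def check_pipeline(stages):
    """Return the final output type, or raise TypeError at the first mismatch."""
    current = None
    for name in stages:
        accepted, output = STAGE_TYPES[name]
        if accepted is not None and current not in accepted:
            raise TypeError(
                f"The {name} stage cannot handle input of {current}. "
                f"Accepted input types: {accepted}"
            )
        current = output
    return current


if __name__ == "__main__":
    print(check_pipeline(["from-kafka", "deserialize", "serialize", "to-file"]))
    try:
        check_pipeline(["from-kafka", "deserialize", "to-file"])
    except TypeError as error:
        print(error)
```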

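Pipeline Stages

The CLI makes it easy to query the available stages for each pipeline type. For more extensive documentation of the stages included in Morpheus, refer to Morpheus Stages.

Note: While most stages are available from the CLI, a small subset of stages, and some configuration options for stages, are unavailable from the CLI and can only be used via the Python API.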

$ morpheus run pipeline-nlp --help
Usage: morpheus run pipeline-nlp [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

<Help Paragraph Omitted>

Commands:
  add-class     Add detected classifications to each message.
  add-scores    Add probability scores to each message.
  buffer        (Deprecated) Buffer results.
  delay         (Deprecated) Delay results for a certain duration.
  deserialize   Messages are logically partitioned based on the pipeline config's `pipeline_batch_size` parameter.
  dropna        Drop null data entries from a DataFrame.
  filter        Filter message by a classification threshold.
  from-doca     A source stage used to receive raw packet data from a ConnectX-6 Dx NIC.
  from-file     Load messages from a file.
  from-kafka    Load messages from a Kafka cluster.
  gen-viz       (Deprecated) Write out visualization DataFrames.
  inf-identity  Perform inference for testing that performs a no-op.
  inf-pytorch   Perform inference with PyTorch.
  inf-triton    Perform inference with Triton Inference Server.
  mlflow-drift  Report model drift statistics to ML Flow.
  monitor       Display throughput numbers at a specific point in the pipeline.
  preprocess    Prepare NLP input DataFrames for inference.
  serialize     Includes & excludes columns from messages.
  to-file       Write all messages to a file.
  to-kafka      Write all messages to a Kafka cluster.
  trigger       Buffer data until the previous stage has completed.
  validate      Validate pipeline output for testing.

And for the FIL pipeline:

$ morpheus run pipeline-fil --help
Usage: morpheus run pipeline-fil [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

<Help Paragraph Omitted>

Commands:
  add-class       Add detected classifications to each message.
  add-scores      Add probability scores to each message.
  buffer          (Deprecated) Buffer results.
  delay           (Deprecated) Delay results for a certain duration.
  deserialize     Messages are logically partitioned based on the pipeline config's `pipeline_batch_size` parameter.
  dropna          Drop null data entries from a DataFrame.
  filter          Filter message by a classification threshold.
  from-appshield  Source stage is used to load Appshield messages from one or more plugins into a DataFrame. It normalizes nested json messages and arranges them
                  into a DataFrame by snapshot and source.
  from-file       Load messages from a file.
  from-kafka      Load messages from a Kafka cluster.
  inf-identity    Perform inference for testing that performs a no-op.
  inf-pytorch     Perform inference with PyTorch.
  inf-triton      Perform inference with Triton Inference Server.
  mlflow-drift    Report model drift statistics to ML Flow.
  monitor         Display throughput numbers at a specific point in the pipeline.
  preprocess      Prepare FIL input DataFrames for inference.
  serialize       Includes & excludes columns from messages.
  to-file         Write all messages to a file.
  to-kafka        Write all messages to a Kafka cluster.
  trigger         Buffer data until the previous stage has completed.
  validate        Validate pipeline output for testing.

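Note: The available commands differ between pipeline types, and the same stage may have different options when used in different pipelines. Check the CLI help for the most up-to-date information during development.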
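
To see exactly which stages differ between two pipeline types, the Commands section of each help output can be parsed and compared as sets. The sketch below assumes the help layout shown above (command rows indented by two spaces, wrapped descriptions indented further); `parse_commands` is a hypothetical helper, and the sample texts are shortened excerpts of the listings in this section.

```python
import re


def parse_commands(help_text):
    """Extract the command names listed under the `Commands:` heading."""
    names = set()
    in_commands = False
    for line in help_text.splitlines():
        if line.strip() == "Commands:":
            in_commands = True
            continue
        if in_commands:
            # Command rows start with exactly two spaces; wrapped description
            # lines are indented further and therefore do not match.
            match = re.match(r"^  (\S+)", line)
            if match:
                names.add(match.group(1))
    return names


NLP_HELP = """Usage: morpheus run pipeline-nlp [OPTIONS] COMMAND1 [ARGS]...

Commands:
  from-doca     A source stage used to receive raw packet data from a ConnectX-6 Dx NIC.
  from-file     Load messages from a file.
  gen-viz       (Deprecated) Write out visualization DataFrames.
"""

FIL_HELP = """Usage: morpheus run pipeline-fil [OPTIONS] COMMAND1 [ARGS]...

Commands:
  from-appshield  Source stage is used to load Appshield messages from one or more plugins
                  into a DataFrame by snapshot and source.
  from-file       Load messages from a file.
"""

if __name__ == "__main__":
    nlp, fil = parse_commands(NLP_HELP), parse_commands(FIL_HELP)
    print("NLP only:", sorted(nlp - fil))
    print("FIL only:", sorted(fil - nlp))
```

In practice the help text would come from running morpheus run pipeline-nlp --help and morpheus run pipeline-fil --help, for example with `subprocess.run(..., capture_output=True, text=True)`.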

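Next Steps

Morpheus Examples - Example pipelines using both the Python API and the command line interface
Morpheus Pre-trained Models - Pre-trained models with the corresponding training and validation scripts and datasets
Morpheus Developer Guide - Documentation on using the Morpheus Python and C++ APIs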