Recommended Installation Method

Triton SDK Container

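The recommended way to "install" Perf Analyzer is to run the pre-built executable from within the Triton SDK docker container, which is available on the NVIDIA GPU Cloud Catalog. Perf Analyzer can run as long as the SDK container's network can reach the address and port of the inference server.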

export RELEASE=<yy.mm> # e.g. to use the release from the end of February of 2023, do `export RELEASE=23.02`

docker pull nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

# inside container
perf_analyzer -m <model>
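The image name in the commands above depends only on the `RELEASE` value, so a typo in that variable shows up late, as a failed `docker pull`. As a minimal sketch, the helper below checks that the value has the `yy.mm` shape before building the image name. The function name `sdk_image_for_release` is hypothetical and not part of Triton; it only reproduces the naming pattern already used above.

```shell
# Hypothetical helper (not part of Triton): print the SDK image name for a release.
sdk_image_for_release() {
  # $1 is the release tag in yy.mm form, for example 23.02.
  case "$1" in
    [0-9][0-9].[0-9][0-9]) ;;
    *)
      echo "RELEASE must look like yy.mm (e.g. 23.02), got: '$1'" >&2
      return 1
      ;;
  esac
  echo "nvcr.io/nvidia/tritonserver:$1-py3-sdk"
}
```

It could then be used as `docker pull "$(sdk_image_for_release "$RELEASE")"`. The check only validates the format; it does not confirm that the release actually exists in the catalog.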

Alternative Installation Methods

Pip
Build from Source

Pip

pip install tritonclient

perf_analyzer -m <model>

Warning: If any runtime dependencies are missing, Perf Analyzer will produce errors showing which ones are missing. You will need to install them manually.
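On Linux, one way to see missing shared libraries before running the tool is to inspect the binary with `ldd`. The sketch below is an assumption-laden illustration: it assumes the `perf_analyzer` found on `PATH` is a native, dynamically linked binary, and the helper name `list_missing_libs` is hypothetical. It only filters `ldd` output; it does not install anything.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries that the dynamic linker reports as "not found".
list_missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage (Linux only, assuming perf_analyzer is a native binary on PATH):
#   ldd "$(command -v perf_analyzer)" | list_missing_libs
```

An empty result means `ldd` resolved every shared library. The names it prints are library file names, not package names, so mapping them to installable packages is still a manual step.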

Build from Source

The build is done inside the Triton SDK container, so some of the build and runtime dependencies are already installed.

export RELEASE=<yy.mm> # e.g. to use the release from the end of February of 2023, do `export RELEASE=23.02`

docker pull nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

# inside container
# prep installing newer version of cmake
apt update && apt install -y gpg wget && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && . /etc/os-release && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | tee /etc/apt/sources.list.d/kitware.list >/dev/null

# install build/runtime dependencies
apt update && apt install -y cmake-data=3.27.7* cmake=3.27.7* libcurl4-openssl-dev rapidjson-dev

rm -rf perf_analyzer ; git clone --depth 1 https://github.com/triton-inference-server/perf_analyzer

mkdir perf_analyzer/build ; cd perf_analyzer/build

cmake ..

make -j8 perf-analyzer

perf_analyzer/src/perf-analyzer-build/perf_analyzer -m <model>

To enable CUDA shared memory, add -DTRITON_ENABLE_GPU=ON to the cmake command.
To enable C API mode, add -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON to the cmake command.
To enable the TorchServe backend, add -DTRITON_ENABLE_PERF_ANALYZER_TS=ON to the cmake command.
To enable the TensorFlow Serving backend, add -DTRITON_ENABLE_PERF_ANALYZER_TFS=ON to the cmake command.
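When several of these options are wanted at once, they can be passed together on a single cmake command line. The sketch below maps short feature names to the four options listed in this section. The function name `perf_analyzer_cmake_flags` and the short names (`gpu`, `c_api`, `ts`, `tfs`) are hypothetical conveniences, and whether every combination builds cleanly is an assumption that this page does not confirm.

```shell
# Hypothetical helper: translate short feature names into the CMake options
# documented in this section and print them on one line.
perf_analyzer_cmake_flags() {
  flags=""
  for feature in "$@"; do
    case "$feature" in
      gpu)   flag="-DTRITON_ENABLE_GPU=ON" ;;
      c_api) flag="-DTRITON_ENABLE_PERF_ANALYZER_C_API=ON" ;;
      ts)    flag="-DTRITON_ENABLE_PERF_ANALYZER_TS=ON" ;;
      tfs)   flag="-DTRITON_ENABLE_PERF_ANALYZER_TFS=ON" ;;
      *)
        echo "unknown feature: $feature" >&2
        return 1
        ;;
    esac
    # Join flags with single spaces, with no leading space on the first one.
    flags="${flags:+$flags }$flag"
  done
  echo "$flags"
}
```

From the build directory it could be used as `cmake $(perf_analyzer_cmake_flags gpu c_api) ..`, where the command substitution is left unquoted on purpose so that each option becomes a separate argument.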