Other Installation Methods

Pip

pip install tritonclient

perf_analyzer -m <model>

Warning: If any runtime dependencies are missing, Perf Analyzer will produce an error stating which dependency is missing. You will need to install it manually.
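One way to see which shared libraries a binary fails to resolve is `ldd`; the sketch below uses `/bin/ls` as a stand-in for the `perf_analyzer` binary, since the exact install path depends on your environment:

```shell
# Illustrative: count shared-library dependencies that the dynamic
# linker cannot resolve ("not found" lines in ldd output).
# Replace /bin/ls with the path to your perf_analyzer binary.
missing=$(ldd /bin/ls | grep -c "not found" || true)
echo "unresolved libraries: $missing"
```

A count of zero means all runtime libraries were found; any "not found" entries name the dependencies to install.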

Build from Source

The Triton SDK container is used for building, so some build and runtime dependencies are already installed.

export RELEASE=<yy.mm> # e.g. to use the release from the end of February of 2023, do `export RELEASE=23.02`

docker pull nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk

# inside container
# prep installing newer version of cmake
apt update && apt install -y gpg wget && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && . /etc/os-release && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | tee /etc/apt/sources.list.d/kitware.list >/dev/null

# install build/runtime dependencies
apt update && apt install -y cmake-data=3.27.7* cmake=3.27.7* libcurl4-openssl-dev rapidjson-dev

rm -rf perf_analyzer ; git clone --depth 1 https://github.com/triton-inference-server/perf_analyzer

mkdir perf_analyzer/build ; cd perf_analyzer/build

cmake ..

make -j8 perf-analyzer

perf_analyzer/src/perf-analyzer-build/perf_analyzer -m <model>
  • To enable CUDA shared memory, add `-DTRITON_ENABLE_GPU=ON` to the cmake command.

  • To enable C API mode, add `-DTRITON_ENABLE_PERF_ANALYZER_C_API=ON` to the cmake command.

  • To enable the TorchServe backend, add `-DTRITON_ENABLE_PERF_ANALYZER_TS=ON` to the cmake command.

  • To enable the TensorFlow Serving backend, add `-DTRITON_ENABLE_PERF_ANALYZER_TFS=ON` to the cmake command.
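As a sketch, these options can be combined in a single configure step; the particular pair of flags chosen here is illustrative, and the command assumes you are in `perf_analyzer/build` as in the steps above:

```shell
# Illustrative: configure with CUDA shared memory and C API mode enabled,
# then build the perf-analyzer target as before.
cmake -DTRITON_ENABLE_GPU=ON -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON ..
make -j8 perf-analyzer
```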