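Recommended Installation Method#
Triton SDK Container#
The recommended way to "install" Perf Analyzer is to run the pre-built executable from within the Triton SDK Docker container available on the NVIDIA GPU Cloud Catalog. As long as the SDK container has network access to the address and port of the inference server, Perf Analyzer will be able to run.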
export RELEASE=<yy.mm> # e.g. to use the release from the end of February of 2023, do `export RELEASE=23.02`
docker pull nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk
docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk
# inside container
perf_analyzer -m <model>
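The RELEASE value used above follows a yy.mm pattern. As a minimal sketch (the helper name sdk_image is hypothetical and not part of any Triton tooling), a small shell function can reject a malformed value before docker pull is attempted:

```shell
# Hypothetical helper: build the SDK image name from a yy.mm release string.
sdk_image() {
  case "$1" in
    [0-9][0-9].[0-9][0-9])
      printf 'nvcr.io/nvidia/tritonserver:%s-py3-sdk\n' "$1"
      ;;
    *)
      echo "RELEASE must look like yy.mm, e.g. 23.02" >&2
      return 1
      ;;
  esac
}

# Usage: pull only if the release string is well formed.
# image=$(sdk_image "$RELEASE") && docker pull "$image"
```

The check only validates the shape of the string; it does not confirm that a release with that tag actually exists in the registry.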
Alternative Installation Methods#
Pip#
pip install tritonclient
perf_analyzer -m <model>
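Warning: If any runtime dependencies are missing, Perf Analyzer will produce errors showing which ones are missing. You will need to install them manually.
Build from Source#
The Triton SDK container is used for building, so some build and runtime dependencies are already installed.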
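To list unresolved shared libraries up front, the output of ldd can be filtered. This is only a sketch: it assumes the installed perf_analyzer is a dynamically linked native executable on a Linux system where ldd is available, and the helper name missing_libs is hypothetical.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries the dynamic linker could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage (assumes perf_analyzer is on PATH and is a native executable):
# ldd "$(command -v perf_analyzer)" | missing_libs
```

Each printed name is a shared library to install manually, for example through the system package manager.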
export RELEASE=<yy.mm> # e.g. to use the release from the end of February of 2023, do `export RELEASE=23.02`
docker pull nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk
docker run --gpus all --rm -it --net host nvcr.io/nvidia/tritonserver:${RELEASE}-py3-sdk
# inside container
# prep installing newer version of cmake
apt update && apt install -y gpg wget && wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && . /etc/os-release && echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | tee /etc/apt/sources.list.d/kitware.list >/dev/null
# install build/runtime dependencies
apt update && apt install -y cmake-data=3.27.7* cmake=3.27.7* libcurl4-openssl-dev rapidjson-dev
rm -rf perf_analyzer ; git clone --depth 1 https://github.com/triton-inference-server/perf_analyzer
mkdir perf_analyzer/build ; cd perf_analyzer/build
cmake ..
make -j8 perf-analyzer
perf_analyzer/src/perf-analyzer-build/perf_analyzer -m <model>
To enable CUDA shared memory, add -DTRITON_ENABLE_GPU=ON to the cmake command.
To enable C API mode, add -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON to the cmake command.
To enable the TorchServe backend, add -DTRITON_ENABLE_PERF_ANALYZER_TS=ON to the cmake command.
To enable the TensorFlow Serving backend, add -DTRITON_ENABLE_PERF_ANALYZER_TFS=ON to the cmake command.
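These options can be combined in a single configure step. The sketch below collects two of them in a shell variable and prints the resulting command as a dry run; the choice of options is arbitrary and for illustration only, and the variable name CMAKE_OPTS is hypothetical.

```shell
# Illustrative selection: CUDA shared memory plus C API mode.
CMAKE_OPTS="-DTRITON_ENABLE_GPU=ON -DTRITON_ENABLE_PERF_ANALYZER_C_API=ON"

# Dry run: print the configure command instead of executing it.
echo "cmake $CMAKE_OPTS .."

# To actually configure and build, run from perf_analyzer/build:
# cmake $CMAKE_OPTS .. && make -j8 perf-analyzer
```

CMAKE_OPTS is deliberately left unquoted in the real cmake call so the shell splits it into separate arguments.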