Advanced Usage

This section walks advanced users through the inference script in detail.

The Studio Voice NIM uses a gRPC endpoint. Import the compiled gRPC protos to invoke the NIM.
import os
import sys
import grpc
sys.path.append(os.path.join(os.getcwd(), "../interfaces/studio_voice"))
# Importing gRPC compiler auto-generated maxine studiovoice library
import studiovoice_pb2, studiovoice_pb2_grpc # noqa: E402
The NIM invocation uses bidirectional gRPC streaming. To produce the request data stream, define a Python generator function. This is also known as a Python iterator: a simple function that produces values after it is invoked, where each yield returns the next chunk to be streamed.
from typing import Iterator

def generate_request_for_inference(
    input_filepath: os.PathLike,
) -> Iterator[studiovoice_pb2.EnhanceAudioRequest]:
    """Generator to produce the request data stream.

    Args:
      input_filepath: Path to input file
    """
    DATA_CHUNKS = 64 * 1024  # bytes, we send the wav file in 64KB chunks
    with open(input_filepath, "rb") as fd:
        while True:
            buffer = fd.read(DATA_CHUNKS)
            if buffer == b"":
                break
            yield studiovoice_pb2.EnhanceAudioRequest(audio_stream_data=buffer)
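The generator above streams the file in fixed-size 64 KB buffers; only the final chunk may be shorter. The chunking pattern itself can be sketched with the standard library alone (plain byte buffers stand in for the `EnhanceAudioRequest` messages, so this runs without the generated protos):

```python
import os
import tempfile

DATA_CHUNKS = 64 * 1024  # same 64 KB chunk size as the generator above

def iter_chunks(path):
    """Yield the file's contents as raw 64 KB buffers (the last one may be shorter)."""
    with open(path, "rb") as fd:
        while True:
            buffer = fd.read(DATA_CHUNKS)
            if buffer == b"":
                break
            yield buffer

# Write 150 KB of data: expect two full 64 KB chunks plus a 22 KB remainder.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(150 * 1024))
    path = tmp.name

chunks = list(iter_chunks(path))
print([len(c) for c in chunks])  # [65536, 65536, 22528]
os.remove(path)
```

The same loop drives the real generator; the only difference is that each buffer is wrapped in an `EnhanceAudioRequest` before being yielded to the gRPC stub.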
Before invoking the NIM, define a function that consumes the incoming stream and writes it to the output file.
from typing import Iterator

def write_output_file_from_response(
    response_iter: Iterator[studiovoice_pb2.EnhanceAudioResponse],
    output_filepath: os.PathLike,
) -> None:
    """Function to write the output file from the incoming gRPC data stream.

    Args:
      response_iter: Responses from the server to write into output file
      output_filepath: Path to output file
    """
    with open(output_filepath, "wb") as fd:
        for response in response_iter:
            if response.HasField("audio_stream_data"):
                fd.write(response.audio_stream_data)
With the request generator and response writer in place, connect to the NIM and invoke it. The input file path is stored in the variable input_filepath, and the output file is written to the location specified by output_filepath. Wait for the message confirming that the invocation has completed before checking the output file. Fill in the correct host and port for your target in the snippet below.
import time

input_filepath = "../assets/studio_voice_48k_input.wav"
output_filepath = "studio_voice_48k_output.wav"

with grpc.insecure_channel(target="localhost:8001") as channel:
    try:
        stub = studiovoice_pb2_grpc.MaxineStudioVoiceStub(channel)
        start_time = time.time()
        responses = stub.EnhanceAudio(
            generate_request_for_inference(input_filepath=input_filepath),
            metadata=None,
        )
        write_output_file_from_response(
            response_iter=responses, output_filepath=output_filepath
        )
        end_time = time.time()
        print(
            f"Function invocation completed in {end_time - start_time:.2f}s, "
            "the output file is generated."
        )
    except grpc.RpcError as e:
        # Report gRPC failures (for example, an unreachable server) instead of crashing.
        print(e)
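Both the input and the enhanced output are standard WAV files, so you can sanity-check an output with Python's stdlib wave module. The snippet below runs on a synthetic file rather than a real NIM output; the 48 kHz mono 16-bit format matches the sample assets, but your files may differ:

```python
import math
import struct
import wave

# Generate one second of a 440 Hz sine tone at 48 kHz mono, 16-bit, as a
# stand-in for a Studio Voice output file.
sample_rate = 48000
with wave.open("sanity_check.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 16-bit samples
    wf.setframerate(sample_rate)
    frames = b"".join(
        struct.pack("<h", int(20000 * math.sin(2 * math.pi * 440 * n / sample_rate)))
        for n in range(sample_rate)
    )
    wf.writeframes(frames)

# Inspect the header the same way you would for studio_voice_48k_output.wav.
with wave.open("sanity_check.wav", "rb") as wf:
    duration = wf.getnframes() / wf.getframerate()
    print(wf.getframerate(), wf.getnchannels(), round(duration, 2))  # 48000 1 1.0
```

Checking the sample rate and duration this way is a quick confirmation that the stream was written completely before listening to the result.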
Compiling the Protos

The NVIDIA Maxine NIM Clients package ships with pre-compiled protos. However, to compile the protos locally, install the required dependencies.
Linux

To compile the protos on Linux, run:
# Go to studio-voice/protos folder
cd studio-voice/protos
chmod +x compile_protos.sh
./compile_protos.sh
Windows

To compile the protos on Windows, run:
# Go to studio-voice/protos folder
cd studio-voice/protos
compile_protos.bat
Model Caching

When the container starts for the first time, it downloads the required models from NGC. To avoid downloading them again on subsequent runs, you can cache them locally by using a cache directory:
# Create the cache directory on the host machine
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
chmod 777 "$LOCAL_NIM_CACHE"
# Run the container with the cache directory mounted in the appropriate location
docker run -it --rm --name=maxine-studio-voice \
--net host \
--runtime=nvidia \
--gpus all \
--shm-size=8GB \
-e NGC_API_KEY=$NGC_API_KEY \
-e NIM_MODEL_PROFILE=<nim_model_profile> \
-v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
nvcr.io/nim/nvidia/maxine-studio-voice:latest
Make sure that nim_model_profile is compatible with your GPU. For more information about nim_model_profile, refer to the NIM Model Profiles table.