TTS Deployment
This tutorial explains the process of generating a TTS RMIR (Riva Model Intermediate Representation). An RMIR is an intermediate file that contains all the necessary artifacts (models, files, configurations, and user settings) required to deploy a Riva service.
Learning objectives

In this tutorial, you will learn how to:
- Use Riva ServiceMaker to take two .riva files and convert them into a .rmir file for an AMD64 (data center, x86_64) or ARM64 (embedded, AArch64) machine. If you have .nemo files, you can use nemo2riva to generate .riva files from the .nemo checkpoints.
- Launch and deploy the .rmir on a local Riva server.
- Send inference requests from a demo client using the Riva API bindings.
Riva ServiceMaker

ServiceMaker is a set of tools that aggregates all the necessary artifacts (models, files, configurations, and user settings) needed to deploy Riva to a target environment. It has two main components:
- riva-build
- riva-deploy

The first step, riva-build, can be run on either a data center or an embedded machine to build the .rmir file. The second step, riva-deploy, should be run on the machine that will serve the Riva server.
If you are building the .rmir file on a data center machine for an embedded deployment, follow this tutorial up to and including the Riva-build section. Copy the built .rmir to the target embedded machine, run the Setting the configuration and parameters section, and then continue with the Riva-deploy section.
Riva-build

This step helps build a Riva-ready version of the model. Its only output is an intermediate format (called the Riva Model Intermediate Representation, or .rmir) of an end-to-end pipeline for the services supported within Riva. Consider a TTS pipeline built from two models: an acoustic model and a vocoder.
riva-build is responsible for combining one or more exported models (.riva files) into a single file containing an intermediate format called .rmir. This file contains a deployment-agnostic specification of the whole end-to-end pipeline, along with all the assets required for the final deployment and inference. For more information, refer to the Riva documentation.
Riva-deploy

The deployment tool takes as input one or more .rmir files and a target model repository directory. It creates an ensemble configuration specifying the pipeline for execution, and finally writes all those assets to the output model repository directory.
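As a sketch, a riva-deploy invocation can be assembled the same way this tutorial later assembles the riva-build command. The paths and key below are placeholders for illustration, not values produced by this tutorial's later steps:

```python
# Hypothetical sketch of a riva-deploy command string. The RMIR path, key,
# and model repository directory are placeholders.
key = "tlt_encode"
rmir_path = "/data/FastPitch_HifiGan.rmir"  # RMIR produced by riva-build
model_repo = "/data/models"                 # target model repository directory

riva_deploy = f"riva-deploy {rmir_path}:{key} {model_repo}"
print(riva_deploy)
```

In this tutorial, you will not run riva-deploy by hand; the Quick Start's riva_init.sh runs it for you in a later step.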
Setting the configuration and parameters

Update the parameters in the following code block:
- machine_type: Type of machine on which this tutorial is being run. Acceptable values are AMD64, ARM64_linux, and ARM64_l4t. Defaults to AMD64.
- target_machine: Type of machine on which the RMIR will be deployed. Acceptable values are AMD64, ARM64_linux, and ARM64_l4t. Defaults to AMD64.
- acoustic_model: Full path to the acoustic model .riva file. Defaults to None. Can be replaced with a custom acoustic model .riva checkpoint.
- vocoder: Full path to the vocoder .riva file. Defaults to None. Can be replaced with a custom vocoder .riva checkpoint.
- out_dir: Directory in which to put the TTS .rmir file. The RMIR will be placed at ${out_dir}/RMIR/RMIR_NAME.rmir. Defaults to $pwd/out.
- voice: Voice name to set for the model. Defaults to "test".
- key: Encryption key used with nemo2riva. The same key will be used to deploy the RMIR generated in this tutorial. Defaults to tlt_encode.
- use_ipa: Set to "y" or "Y" if the model uses IPA phones, or "no" if it uses ARPAbet. Defaults to "yes".
- lang: Language of the model. This is only used by the client and has no effect on the generated speech. Defaults to "en-US".
- sample_rate: Sample rate, in Hz, of the generated audio. Defaults to 44100.
- num_speakers: Number of speakers in the model. Defaults to 2, the number of speakers in the NGC example model.
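Before running the deployment cells, a small sanity check on these parameters can catch typos early. The helper below is a sketch added for this purpose (it is not part of the original notebook) and only enforces the value constraints described above:

```python
# Sketch: validate the tutorial parameters against the constraints listed above.
VALID_MACHINES = {"AMD64", "ARM64_linux", "ARM64_l4t"}

def validate_params(machine_type, target_machine, use_ipa, sample_rate, num_speakers):
    """Raise ValueError if any parameter falls outside the documented values."""
    if machine_type not in VALID_MACHINES:
        raise ValueError(f"machine_type must be one of {sorted(VALID_MACHINES)}")
    if target_machine not in VALID_MACHINES:
        raise ValueError(f"target_machine must be one of {sorted(VALID_MACHINES)}")
    if use_ipa.lower() not in {"y", "yes", "no"}:
        raise ValueError('use_ipa must be "y", "Y", "yes", or "no"')
    if sample_rate <= 0:
        raise ValueError("sample_rate must be a positive number of Hz")
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")

validate_params("AMD64", "AMD64", "yes", 44100, 2)  # the tutorial defaults pass
```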
import pathlib
import logging
import warnings

from version import __riva_version__

machine_type = "AMD64"  # Change this to `ARM64_linux` or `ARM64_l4t` in case of an ARM64 machine.
target_machine = "AMD64"  # Change this to `ARM64_linux` or `ARM64_l4t` in case of an ARM64 machine.
acoustic_model = None  ## acoustic_model .riva location
vocoder = None  ## vocoder .riva location
out_dir = pathlib.Path.cwd() / "out"  ## Output directory to store the generated RMIR. The RMIR will be placed in `${out_dir}/RMIR/RMIR_NAME.rmir`.
voice = "test"  ## Voice name
key = "tlt_encode"  ## Encryption key used during nemo2riva
use_ipa = "yes"  ## `"y"` or `"Y"` if the model uses `ipa`, `"no"` otherwise.
lang = "en-US"  ## Language
sample_rate = 44100  ## Sample rate of the audios
num_speakers = 2  ## Number of speakers
riva_aux_files = None  ## Riva model repo path. In the case of a custom model repo, change this to the full path of the custom Riva model repo.
riva_tn_files = None  ## Riva model repo path. In the case of a custom model repo, change this to the full path of the custom Riva model repo.

## Riva NGC, docker image config.
if machine_type.lower() in ["amd64", "arm64_linux"]:
    riva_init_image = f"nvcr.io/nvidia/riva/riva-speech:{__riva_version__}"
elif machine_type.lower() == "arm64_l4t":
    riva_init_image = f"nvcr.io/nvidia/riva/riva-speech:{__riva_version__}-l4t-aarch64"

rmir_dir = out_dir / "rmir"

if not out_dir.exists():
    out_dir.mkdir()
if not rmir_dir.exists():
    rmir_dir.mkdir()

def ngc_download_and_get_dir(ngc_resource_name, var, var_name, resource_type="model"):
    default_download_folder = "_v".join(ngc_resource_name.split("/")[-1].split(":"))
    !rm -rf ./riva_artifacts/{default_download_folder}
    ngc_output = !ngc registry {resource_type} download-version {ngc_resource_name} --dest riva_artifacts
    output = pathlib.Path(f"./riva_artifacts/{default_download_folder}")
    if not output.exists():
        ngc_output_formatted = '\n'.join(ngc_output)
        logging.error(
            f"NGC was not able to download the requested model {ngc_resource_name}. "
            "Please check the NGC error message, remove all directories, and re-start the "
            f"notebook. NGC message: {ngc_output_formatted}"
        )
        return None
    if "model" in resource_type:
        riva_files_in_dir = list(output.glob("*.riva"))
        if len(riva_files_in_dir) > 0:
            output = riva_files_in_dir[0]
    if output is not None and var is not None:
        warnings.warn(
            f"`{var_name}` had a non-default value of `{var}`. `{var_name}` will be updated to `{output}`"
        )
    return output
Downloading the models

The following code block downloads the default NGC models: FastPitch and HiFi-GAN. They are downloaded into a folder named riva_artifacts; if that folder already exists, it is removed first.

For custom models, this code block can be skipped.
riva_ngc_artifacts = pathlib.Path.cwd() / "riva_artifacts"
if not riva_ngc_artifacts.exists():
    riva_ngc_artifacts.mkdir()

acoustic_model = ngc_download_and_get_dir("nvidia/riva/speechsynthesis_en_us_fastpitch_ipa:deployable_v1.0", acoustic_model, "acoustic_model")
vocoder = ngc_download_and_get_dir("nvidia/riva/speechsynthesis_en_us_hifigan_ipa:deployable_v1.0", vocoder, "vocoder")
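For reference, the ngc_download_and_get_dir helper derives the local download folder from the NGC resource name by joining the name and version with "_v". A minimal standalone reproduction of that naming rule:

```python
def ngc_download_folder(ngc_resource_name):
    """Mirror of the folder-naming rule used in ngc_download_and_get_dir:
    take the last path segment and join name and version with `_v`."""
    return "_v".join(ngc_resource_name.split("/")[-1].split(":"))

folder = ngc_download_folder("nvidia/riva/speechsynthesis_en_us_fastpitch_ipa:deployable_v1.0")
print(folder)  # speechsynthesis_en_us_fastpitch_ipa_vdeployable_v1.0
```

This is why the FastPitch model above lands in riva_artifacts/speechsynthesis_en_us_fastpitch_ipa_vdeployable_v1.0.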
The following code block downloads some additional TTS files for deployment. These include the following:

- ARPAbet dictionary file
- IPA dictionary file
- Abbreviations mapping file
- Two text normalization (TN) files:
  - tokenize_and_classify.far
  - verbalize.far
riva_aux_files = ngc_download_and_get_dir("nvidia/riva/speechsynthesis_en_us_auxiliary_files:deployable_v1.3", riva_aux_files, "riva_aux_files")
riva_tn_files = ngc_download_and_get_dir("nvidia/riva/normalization_en_us:deployable_v1.1", riva_tn_files, "riva_tn_files")
Running riva-build

Stop the Docker container riva_rmir_gen if it is already running, then run it again with the necessary paths mounted.
## Run the Riva ServiceMaker container.
!docker stop riva_rmir_gen &> /dev/null
!set -x && docker run -td --gpus all --rm -v {str(riva_aux_files.resolve())}:/riva_aux \
    -v {str(acoustic_model.parent.resolve())}/:/synt \
    -v {str(vocoder.parent.resolve())}:/voc -v {str(riva_tn_files.resolve())}:/riva_tn \
    -v {str(rmir_dir.resolve())}:/data --name riva_rmir_gen --entrypoint="/bin/bash" {riva_init_image}

warnings.warn("Using --force in riva-build will replace any existing RMIR.")

riva_build = (
    f"riva-build speech_synthesis --force --voice_name={voice} --language_code={lang} "
    f"--sample_rate={sample_rate} /data/FastPitch_HifiGan.rmir:{key} /synt/{str(acoustic_model.name)}:{key} "
    f"/voc/{str(vocoder.name)}:{key} --abbreviations_file=/riva_aux/abbr.txt "
    f"--wfst_tokenizer_model=/riva_tn/tokenize_and_classify.far --wfst_verbalizer_model=/riva_tn/verbalize.far"
)
if target_machine.lower() in ["arm64_linux", "arm64_l4t"]:
    riva_build += (
        " --max_batch_size 1 --postprocessor.max_batch_size 1 --preprocessor.max_batch_size 1"
        " --encoderFastPitch.max_batch_size 1 --chunkerFastPitch.max_batch_size 1 --hifigan.max_batch_size 1"
    )
if use_ipa.lower() in ["y", "yes"]:
    riva_build += " --phone_set=ipa --phone_dictionary_file=/riva_aux/ipa_cmudict-0.7b_nv22.08.txt --upper_case_chars=True"
else:
    riva_build += " --phone_set=arpabet --phone_dictionary_file=/riva_aux/cmudict-0.7b_nv22.08"
if num_speakers > 1:
    riva_build += f" --num_speakers={num_speakers}"
    riva_build += " --subvoices " + ",".join([f"{i}:{i}" for i in range(num_speakers)])
print(riva_build)
Execute the riva-build command, then stop the riva_rmir_gen container.
!docker exec riva_rmir_gen {riva_build}
!docker stop riva_rmir_gen
Running riva-deploy

So far in this tutorial, we have learned how to generate an RMIR file from .riva files. You will see that FastPitch_HifiGan.rmir has been generated in the ${out_dir}/rmir location we defined earlier.

The RMIR file generated in this tutorial can be deployed using riva_quickstart.
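A quick way to confirm the build succeeded is to list the .rmir files in the output directory. The helper below is a sketch added for this check, not part of the original notebook:

```python
import pathlib

def find_rmir_files(rmir_dir):
    """Return the .rmir files found in the given directory, sorted by name."""
    return sorted(pathlib.Path(rmir_dir).glob("*.rmir"))

# In the notebook, you would call: find_rmir_files(rmir_dir)
# and expect to see FastPitch_HifiGan.rmir in the result.
```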
Steps to deploy the RMIR

1. Download the Riva Quick Start resource.
2. Open config.sh and update the following parameters:
   - Set service_enabled_asr to false.
   - Set service_enabled_nlp to false.
   - Set service_enabled_tts to true.
   - Set riva_model_loc to your out_dir location.
   - Set use_existing_rmirs to true.
3. Run riva_init.sh.
4. Run riva_start.sh.
Let's download the Riva Quick Start resource from NGC.
if target_machine.lower() in ["amd64", "arm64_linux"]:
    quickstart_link = f"nvidia/riva/riva_quickstart:{__riva_version__}"
else:
    quickstart_link = f"nvidia/riva/riva_quickstart_arm64:{__riva_version__}"
quickstart_dir = ngc_download_and_get_dir(quickstart_link, None, None, resource_type="resource")
Next, we modify the config.sh file to enable the relevant Riva services (in this case, TTS for FastPitch and HiFi-GAN) and to provide the encryption key and the path to the model repository (riva_model_loc) generated in the previous step.

For example, if the model repository above was generated at ${out_dir}/rmir, then you can specify riva_model_loc as that same ${out_dir}/rmir directory.

Here is how config.sh should look:
config.sh snippet
# Enable or Disable Riva Services
service_enabled_asr=false ## MAKE CHANGES HERE
service_enabled_nlp=false ## MAKE CHANGES HERE
service_enabled_tts=true ## MAKE CHANGES HERE
# Specify one or more GPUs to use
# specifying more than one GPU is currently an experimental feature, and may result in undefined behaviours.
gpus_to_use="device=0"
# Specify the encryption key to use to deploy models
MODEL_DEPLOY_KEY="tlt_encode" ## MAKE CHANGES HERE
# Locations to use for storing models artifacts
#
# If an absolute path is specified, the data will be written to that location
# Otherwise, a docker volume will be used (default).
#
# riva_init.sh will create a `rmir` and `models` directory in the volume or
# path specified.
#
# RMIR ($riva_model_loc/rmir)
# Riva uses an intermediate representation (RMIR) for models
# that are ready to deploy but not yet fully optimized for deployment. Pretrained
# versions can be obtained from NGC (by specifying NGC models below) and will be
# downloaded to $riva_model_loc/rmir by `riva_init.sh`
#
# Custom models produced by NeMo and prepared using riva-build
# may also be copied manually to this location $(riva_model_loc/rmir).
#
# Models ($riva_model_loc/models)
# During the riva_init process, the RMIR files in $riva_model_loc/rmir
# are inspected and optimized for deployment. The optimized versions are
# stored in $riva_model_loc/models. The riva server exclusively uses these
# optimized versions.
riva_model_loc="<add path>" ## MAKE CHANGES HERE (Replace with MODEL_LOC)
# The default RMIRs are downloaded from NGC by default in the above $riva_rmir_loc directory
# If you'd like to skip the download from NGC and use the existing RMIRs in the $riva_rmir_loc
# then set the below $use_existing_rmirs flag to true. You can also deploy your set of custom
# RMIRs by keeping them in the riva_rmir_loc dir and use this quickstart script with the
# below flag to deploy them all together.
use_existing_rmirs=false ## MAKE CHANGES HERE (Set to true)
Let's make the necessary changes to config.sh.
with open(f"{quickstart_dir}/config.sh", "r") as config_in:
    config_file = config_in.readlines()

for i, line in enumerate(config_file):
    # Disable services
    if "service_enabled_asr" in line:
        config_file[i] = "service_enabled_asr=false\n"
    elif "service_enabled_nlp" in line:
        config_file[i] = "service_enabled_nlp=false\n"
    elif "service_enabled_nmt" in line:
        config_file[i] = "service_enabled_nmt=false\n"
    elif "service_enabled_tts" in line:
        config_file[i] = "service_enabled_tts=true\n"
    # Update riva_model_loc to our rmir folder
    elif "riva_model_loc" in line:
        config_file[i] = config_file[i].split("riva_model_loc")[0] + f"riva_model_loc={out_dir}\n"
    elif "use_existing_rmirs" in line:
        config_file[i] = "use_existing_rmirs=true\n"
    elif "MODEL_DEPLOY_KEY" in line:
        config_file[i] = f"MODEL_DEPLOY_KEY=\"{key}\"\n"
    elif "fastpitch" in line:
        config_file[i] = f"#{line}"

with open(f"{quickstart_dir}/config.sh", "w") as config_in:
    config_in.writelines(config_file)

print("".join(config_file))
# Ensure you have permission to execute these scripts
! cd {quickstart_dir} && chmod +x ./riva_init.sh && chmod +x ./riva_start.sh && chmod +x ./riva_stop.sh
! cd {quickstart_dir} && ./riva_stop.sh config.sh
# Run `riva_init.sh`. This will fetch the containers/models and run `riva-deploy`.
# YOU CAN SKIP THIS STEP IF YOU DID RIVA DEPLOY
! cd {quickstart_dir} && ./riva_init.sh config.sh
# Run `riva_start.sh`. This will start the Riva server and serve your model.
! cd {quickstart_dir} && ./riva_start.sh config.sh
Running inference

Once the Riva server is up and running with your model, you can send inference requests to query the server.

To send gRPC requests, install the Riva Python API bindings for the client.
# Install client API bindings
! pip install nvidia-riva-client
Connecting to the Riva server and running inference

Now we can query the Riva server; let's get started. The following cell queries the Riva server (using gRPC) to yield a result.
import os
import riva.client
import IPython.display as ipd
import numpy as np
server = "localhost:50051" # location of riva server
auth = riva.client.Auth(uri=server)
tts_service = riva.client.SpeechSynthesisService(auth)
text = "Is it recognize speech or wreck a nice beach?"
language_code = lang # currently required to be "en-US"
sample_rate_hz = sample_rate # the desired sample rate
voice_name = voice # subvoice to generate the audio output.
data_type = np.int16 # For RIVA version < 1.10.0 please set this to np.float32
resp = tts_service.synthesize(text, voice_name=voice_name, language_code=language_code, sample_rate_hz=sample_rate_hz)
audio = resp.audio
meta = resp.meta
processed_text = meta.processed_text
predicted_durations = meta.predicted_durations
audio_samples = np.frombuffer(resp.audio, dtype=data_type)
print(processed_text)
ipd.Audio(audio_samples, rate=sample_rate_hz)
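The synthesized samples can also be written to a WAV file for playback outside the notebook. The helper below is a sketch using only the standard library; it assumes 16-bit mono PCM samples, which matches data_type = np.int16 above:

```python
import wave

import numpy as np

def save_wav(path, samples, sample_rate_hz):
    """Write mono int16 PCM samples to a WAV file."""
    samples = np.asarray(samples, dtype=np.int16)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)             # mono
        wav_file.setsampwidth(2)             # 16-bit samples
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(samples.tobytes())

# In the notebook: save_wav("tts_output.wav", audio_samples, sample_rate_hz)
```

Note that for Riva versions earlier than 1.10.0, the audio buffer is float32 (as mentioned in the cell above) and would need converting to int16 before writing.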