gRPC with Audio2Face-3D
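Audio2Face-3D NIM exposes the following gRPC endpoints:
- A bidirectional streaming gRPC, or 2 unidirectional streaming endpoints, for processing audio data and getting animation data.
- A unary gRPC for getting the current configuration of the microservice.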
If you have used Audio2Face-3D before, see the Migrating from 1.0 to 1.2 page.
Bidirectional Streaming gRPC
In previous Audio2Face versions, this bidirectional endpoint required a separate service called Audio2Face Controller. We have kept the service.a2f_controller and nvidia_ace.controller names.
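This section describes how to interact with the Audio2Face-3D bidirectional endpoint.
The Audio2Face-3D bidirectional streaming service is described in this proto. ProcessAudioStream is the only rpc you need to call to generate animation data from audio input.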
syntax = "proto3";
import "nvidia_ace.controller.v1.proto";
import "nvidia_ace.animation_id.v1.proto";
import "google/protobuf/empty.proto";
service A2FControllerService {
  // Will process a single audio clip and answer animation data
  // in a burst.
  rpc ProcessAudioStream(stream nvidia_ace.controller.v1.AudioStream)
      returns (stream nvidia_ace.controller.v1.AnimationDataStream) {}
}
Service protobuf objects
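After streaming all the audio data, the client must send an empty AudioStream.EndOfAudio message to signal to the service that all audio data has been sent. Only then does the service return a gRPC status. If any error occurs while the input is being streamed, the service returns a gRPC error status immediately.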
syntax = "proto3";
package nvidia_ace.controller.v1;
import "nvidia_ace.a2f.v1.proto";
import "nvidia_ace.animation_data.v1.proto";
import "";
import "nvidia_ace.status.v1.proto";
import "google/protobuf/any.proto";
message AudioStream {
  // This is a marker for the end of an audio clip.
  message EndOfAudio {}
  oneof stream_part {
    // The header must be sent as the first message.
    AudioStreamHeader audio_stream_header = 1;
    // At least one AudioWithEmotion message must be sent thereafter.
    nvidia_ace.a2f.v1.AudioWithEmotion audio_with_emotion = 2;
    // The EndOfAudio must be sent last.
    EndOfAudio end_of_audio = 3;
  }
}
// IMPORTANT NOTE: this is an AudioStreamHeader WITHOUT ID
// A similar AudioStreamHeader exists in nvidia_ace.a2f.v1.proto
// but that one does contain IDs.
message AudioStreamHeader {
  // Metadata about the audio being sent to the service.
  audio_header = 1;
  // Parameters for updating the facial characteristics of an avatar.
  // See the documentation for more information.
  nvidia_ace.a2f.v1.FaceParameters face_params = 2;
  // Parameters relative to the emotion blending and processing
  // before using it to generate blendshapes.
  // See the documentation for more information.
  nvidia_ace.a2f.v1.EmotionPostProcessingParameters emotion_post_processing_params = 3;
  // Multipliers and offsets to apply to the generated blendshape values.
  nvidia_ace.a2f.v1.BlendShapeParameters blendshape_params = 4;
  // Emotion parameters (live transition time, beginning emotion)
  nvidia_ace.a2f.v1.EmotionParameters emotion_params = 5;
}
enum EventType {
  // This event type means that the A2F Microservice is done processing audio,
  // However it doesn't mean that you finished receiving all the audio data,
  // You will receive a Status message once you are done receiving all the audio
  // data. Events are independent of that.
}
message Event {
  // Type of the event.
  EventType event_type = 1;
  // Data attached to the event if any.
  optional google.protobuf.Any metadata = 2;
}
// IMPORTANT NOTE: this is an AnimationDataStreamHeader WITHOUT ID
// A similar AnimationDataStreamHeader exists in nvidia_ace.animation_data.v1.proto
// but that one does contain IDs.
message AnimationDataStreamHeader {
  // Metadata of the audio buffers. This defines the audio clip properties
  // at the beginning of the streaming process.
  optional audio_header = 1;
  // Metadata containing the blendshape and joints names.
  // This defines the names of the blendshapes and joints flowing through a stream.
  optional nvidia_ace.animation_data.v1.SkelAnimationHeader
      skel_animation_header = 2;
  // Time codes indicate the relative progression of an animation data, audio
  // clip, etc. The unit is seconds. In addition, we also need an absolute time
  // reference shared across services. The start time is stored in time codes
  // elapsed since the Unix time epoch. start_time_code_since_epoch = `Unix
  // timestamp in seconds`. NTP should be good enough to synchronize clocks
  // across nodes. From Wikipedia: NTP can usually maintain time to within tens
  // of milliseconds over the public Internet, and can achieve better than one
  // millisecond accuracy in local area networks under ideal conditions.
  // Alternatively, there is PTP.
  double start_time_code_since_epoch = 3;
  // A generic metadata field to attach use case specific data (e.g. session id,
  // or user id?)
  // map<string, string> metadata = 4;
  // map<string, google.protobuf.Any> metadata = 4;
}
message AnimationDataStream {
  // The header must be sent as the first message.
  // One or more animation data message must be sent.
  // The status must be sent last and may be sent in between.
  oneof stream_part {
    // The header must be sent as the first message.
    AnimationDataStreamHeader animation_data_stream_header = 1;
    // Then one or more animation data message must be sent.
    nvidia_ace.animation_data.v1.AnimationData animation_data = 2;
    // The event may be sent in between.
    Event event = 3;
    // The status must be sent last and may be sent in between.
    nvidia_ace.status.v1.Status status = 4;
  }
}
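The ordering rules stated in the proto comments can be sketched with plain Python stand-ins for the generated protobuf classes. This is an illustration only: StreamPart, build_audio_stream and split_animation_stream are hypothetical names, and the audio payload is modelled as raw bytes. Only the oneof field names come from the proto. A real client would use the classes generated by protoc together with a gRPC stub.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List, Tuple


@dataclass
class StreamPart:
    """Hypothetical stand-in for one stream message (not a generated class)."""
    kind: str            # name of the oneof field that is set
    payload: Any = None  # content of that field


def build_audio_stream(header: Any, audio: bytes, chunk_size: int) -> Iterator[StreamPart]:
    """Yield requests in the required order: header, audio chunks, EndOfAudio."""
    if not audio:
        raise ValueError("at least one audio chunk must be sent")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    yield StreamPart("audio_stream_header", header)
    for start in range(0, len(audio), chunk_size):
        yield StreamPart("audio_with_emotion", audio[start:start + chunk_size])
    # The empty EndOfAudio marker tells the service that no more audio follows.
    yield StreamPart("end_of_audio")


def split_animation_stream(parts: Iterable[StreamPart]) -> Tuple[Any, List[Any], List[Any], List[Any]]:
    """Sort response messages into header, animation frames, events and statuses."""
    header = None
    frames: List[Any] = []
    events: List[Any] = []
    statuses: List[Any] = []
    for part in parts:
        if part.kind == "animation_data_stream_header":
            header = part.payload
        elif part.kind == "animation_data":
            if header is None:
                raise ValueError("animation data received before the header")
            frames.append(part.payload)
        elif part.kind == "event":
            events.append(part.payload)      # events may arrive in between
        elif part.kind == "status":
            statuses.append(part.payload)    # the last message is a status
        else:
            raise ValueError(f"unexpected stream part: {part.kind}")
    return header, frames, events, statuses
```

Writing the request side as a generator mirrors how streaming gRPC clients in Python usually pass an iterator of requests, so the same ordering logic could be reused once the stand-ins are replaced by generated messages.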
Configuration Fetching gRPC
This service is in alpha version.
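The GetConfigs rpc in the proto below is the only call you need to make to get the current configuration of the Audio2Face-3D microservice. For more information, see the A2F-3D NIM Manual Container Deployment and Configuration page.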
syntax = "proto3";
// This proto is in alpha version and might be subject to future changes
service A2XExportConfigService {
  rpc GetConfigs(ConfigsTypeRequest) returns (stream A2XConfig) {}
}
message ConfigsTypeRequest {
  enum ConfigType {
    YAML = 0; // YAML should be chosen for updating the A2F MS
    JSON = 1;
  }
  ConfigType config_type = 1;
}
message A2XConfig {
  // E.g.:
  // contains claire_inference.yaml
  string name = 1;
  // File Content
  string content = 2;
}
// v0.1.0
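Because GetConfigs returns a stream of A2XConfig messages, a client will usually gather them into a mapping from file name to file content. A minimal sketch, assuming a hypothetical FakeA2XConfig dataclass in place of the generated message and a hypothetical collect_configs helper. Only the name and content fields come from the proto.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class FakeA2XConfig:
    """Hypothetical stand-in for the generated A2XConfig message."""
    name: str     # configuration file name
    content: str  # configuration file content


def collect_configs(stream: Iterable[FakeA2XConfig]) -> Dict[str, str]:
    """Drain a response stream into a {name: content} dictionary.

    If the same name is received twice, the last content wins.
    """
    configs: Dict[str, str] = {}
    for config in stream:
        configs[config.name] = config.content
    return configs
```

With a real stub, the iterable passed to collect_configs would be the response iterator returned by the GetConfigs call.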
Unidirectional Streaming gRPC
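This section describes how to communicate with the 2 unidirectional streaming endpoints when the Audio2Face-3D microservice runs in legacy mode. To interact with Audio2Face-3D, you need to create a client that sends data and implement a server that receives data.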
Client - Service
This is the prototype of the gRPC server that you need to send data to.
syntax = "proto3";
import "nvidia_ace.a2f.v1.proto";
import "nvidia_ace.status.v1.proto";
service A2FService {
  // RPC to implement to send audio data to Audio2Face Microservice
  // An example use for this RPC is a client pushing audio buffers to
  // Audio2Face Microservice (server)
  rpc PushAudioStream(stream nvidia_ace.a2f.v1.AudioStream)
      returns (nvidia_ace.status.v1.Status) {}
}
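PushAudioStream is a client-streaming rpc: the client sends many messages and receives a single Status once the request stream ends. The shape of that exchange can be sketched in plain Python. Everything here is hypothetical: fake_push_audio_stream plays the role of the server, audio buffers are modelled as raw bytes, and the returned byte count merely stands in for the Status reply.

```python
from typing import Iterator


def audio_buffers(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Client side: lazily produce the audio buffers to push."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def fake_push_audio_stream(requests: Iterator[bytes]) -> int:
    """Server side stand-in: consume the whole stream, then answer once."""
    received = 0
    for chunk in requests:
        received += len(chunk)
    # A single reply is produced only after the request stream is exhausted.
    return received
```

The point of the sketch is the asymmetry: unlike the bidirectional endpoint, nothing comes back on this call until the client has finished sending.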
Client - Protobuf data
