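A2F-3D NIM Manual Container Deployment and Configuration

We provide Docker containers for deployment through the NGC registry. This guide shows how to deploy, configure, and run the Docker image of the Audio2Face-3D NIM.

Before continuing, be sure to read the Architecture Overview page to become familiar with the concepts, services, and requirements for running Audio2Face-3D.

Audio2Face-3D is highly configurable through configuration files and environment variables. To configure Audio2Face-3D, you need to use a custom entrypoint.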

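Prerequisites

To run the microservice, you need access to the NGC Docker registry.

Make sure that you have NVAIE access, that you have your personal key, and that you are logged in to the nvcr.io registry.

You also need the NVIDIA Container Toolkit configured for Docker.

For more information on hardware and software requirements, see the Support Matrix page.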

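Configuration Files

There are 3 A2F-3D configuration files. Each one is aimed at a specific type of user.

Artists: the stylization config contains the parameters that only artists would tune.
DevOps: the deployment config contains the parameters that only DevOps needs to consider.
Advanced users: the remaining parameters are updated less often, but they are necessary for specific scenarios.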

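Warning

These are deployment-time configuration files. They look similar to the runtime configuration files, but the two kinds must not be confused: they differ in key casing (snake_case vs. camelCase) and in structure. For an example of a runtime configuration, see config_james.yml.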
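As a rough aid for spotting a key that was copied from the wrong kind of file, the sketch below classifies a key name by its casing. It is only an illustration: the regular expressions are an assumption about what counts as snake_case and camelCase, single-word keys such as a2f cannot be told apart, and liveTransitionTime is a hypothetical key name used for the example.

```python
import re

# Assumed definitions of the two naming styles mentioned in the warning.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
CAMEL_CASE = re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")


def key_style(key: str) -> str:
    """Classify a configuration key by its naming convention."""
    if CAMEL_CASE.match(key):
        return "camelCase"
    if SNAKE_CASE.match(key):
        # A single lowercase word is valid in both styles.
        return "snake_case" if "_" in key else "ambiguous"
    return "other"


if __name__ == "__main__":
    # "liveTransitionTime" is a hypothetical camelCase spelling.
    for name in ("live_transition_time", "liveTransitionTime", "EyeBlinkLeft", "a2f"):
        print(name, "->", key_style(name))
```

Blendshape names such as EyeBlinkLeft fall into the "other" bucket, so a check built on this helper should skip the blendshape lists.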

1. Stylization Configuration File

This configuration file has 3 variants:

Claire
James
Mark

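Each variant corresponds to a specific AI model and contains the default values that will be used. By default, the microservice uses the James configuration.

Claire Configuration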

claire_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip, and it also defines the default preferred emotion.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true # Enable audio2emotion, ai-generated audio-driven emotion
  live_transition_time: 0.5 # Controls the smoothness of the output transition toward the target value across frames; higher values result in smoother transitions. Each frame updates at a rate of <frame time length> / <live transition time> (capped at 1.0) toward the raw result.
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if it is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: claire_v2.3
  blendshape_id: claire_topo1_v2.1

  face_params:
    eyelid_offset: 0.0 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: 0.0 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.25 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: gain * w + offset
    enable_clamping_bs_weight: false

    # Multiplier for each blendshape output. This list depends on the blendshape model.
    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    # Constant offset for each blendshape output. This list depends on the blendshape model.
    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0
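The "Gain * w + offset" comment on blendshape_params can be read as a per-blendshape linear adjustment. The sketch below applies weight_multipliers and weight_offsets to a dictionary of raw weights. It illustrates the formula and is not the service's code: the function name is hypothetical, and clamping to the range [0.0, 1.0] when enable_clamping_bs_weight is true is an assumption.

```python
from typing import Dict


def apply_blendshape_params(
    weights: Dict[str, float],
    multipliers: Dict[str, float],
    offsets: Dict[str, float],
    clamp: bool = False,
) -> Dict[str, float]:
    """Return gain * w + offset for every blendshape weight."""
    adjusted = {}
    for name, w in weights.items():
        # Missing entries fall back to the neutral values 1.0 and 0.0.
        value = multipliers.get(name, 1.0) * w + offsets.get(name, 0.0)
        if clamp:
            # Assumed clamping range; the source does not state it.
            value = min(1.0, max(0.0, value))
        adjusted[name] = value
    return adjusted


if __name__ == "__main__":
    raw = {"JawOpen": 0.25, "MouthClose": 0.75}
    print(apply_blendshape_params(raw, {"JawOpen": 2.0}, {"JawOpen": 0.25}))
```

With all multipliers at 1.0 and all offsets at 0.0, as in the default files, the weights pass through unchanged.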

James Configuration

james_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if it is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: james_v2.3
  blendshape_id: james_topo2_v2.2

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: -0.02 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.2 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: gain * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0
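The comment on live_transition_time in the Claire configuration says that each frame moves toward the raw result at a rate of frame time divided by transition time, capped at 1.0. A minimal sketch of that update rule follows. The function name and the frame_time parameter are hypothetical, and treating a non-positive transition time as "jump straight to the raw value" is an assumption.

```python
def smooth_step(current: float, raw: float, frame_time: float,
                live_transition_time: float) -> float:
    """Move `current` toward `raw` by one frame of transition smoothing."""
    if live_transition_time <= 0.0:
        rate = 1.0  # assumed behavior: no smoothing at all
    else:
        rate = min(1.0, frame_time / live_transition_time)
    return current + rate * (raw - current)


if __name__ == "__main__":
    value = 0.0
    # Four frames at a hypothetical 0.1 s per frame and the default 0.5 s transition.
    for _ in range(4):
        value = smooth_step(value, 1.0, frame_time=0.1, live_transition_time=0.5)
        print(round(value, 4))
```

A larger live_transition_time gives a smaller per-frame rate, which matches the statement that higher values result in smoother transitions.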

Mark Configuration

mark_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if it is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: mark_v2.3
  blendshape_id: mark_topo1_v2.1

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.3 # Controls the magnitude of the input audio
    lip_close_offset: -0.03 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.0023 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.4 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.1 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: gain * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0
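The max_emotions comment describes a hard limit on how many emotions are engaged, with the highest weights taking priority. One way to picture this is a top-N filter, sketched below. The function name is hypothetical, and setting the dropped emotions to 0.0 is an assumption; the source does not say what happens to them.

```python
from typing import Dict


def limit_emotions(emotions: Dict[str, float], max_emotions: int) -> Dict[str, float]:
    """Keep the `max_emotions` strongest emotions and zero out the rest."""
    # sorted() is stable, so ties keep their original order.
    strongest = sorted(emotions, key=emotions.get, reverse=True)[:max_emotions]
    return {name: (value if name in strongest else 0.0)
            for name, value in emotions.items()}


if __name__ == "__main__":
    generated = {"joy": 0.8, "anger": 0.1, "fear": 0.3, "grief": 0.2}
    print(limit_emotions(generated, max_emotions=2))
```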

2. Deployment Configuration File

3. Advanced Configuration File

advanced_config.yaml

input_sanitization:
  # max size of UUID
  max_len_uuid: 50
  # Maximum samplerate
  max_sample_rate: 144000
  # Minimum samplerate
  min_sample_rate: 16000
  # Maximum amount in second for the processing time
  # After this timeout the connection to A2F will be cut
  max_processing_duration_second: 300
  # Maximum size of 1 audio buffer sent over the grpc stream
  max_audio_buffer_size_second: 10
  # Maximum size of the audio clip to process
  max_audio_clip_size_second: 300
  # Maximum amount of time that A2F Controller will wait when not
  # receiving data from A2F, before cutting the connection
  max_wait_time_idle_ms: 30000
  # Will stop serving a user if their FPS is lower than low_fps
  # for more than low_fps_max_duration_second seconds
  # For real time application less than 30 FPS means slower than realtime
  # So if users provide audio to the service at less than 30 FPS then
  # the interactive experience will stutter.
  low_fps: 29
  low_fps_max_duration_second: 7

garbage_collector:
  # enable or disable the garbage collector
  # This is only used with bidirectional connection where the service is holding data
  # waiting for the client to pick them up.
  enabled: true
  # how often the garbage collector should run
  interval_run_second: 10
  # If the garbage collector finds streams holding
  # more than N seconds of data, it will delete data
  # until the amount falls below this threshold.
  # Clients are expected to retrieve data promptly so that
  # the service doesn't retain the data excessively.
  max_size_stored_data_second: 60


pipeline_parameters:
  # Queues between pipeline components
  # Can be tweaked:
  # Higher values can lead to higher throughput but also to higher latencies
  # Lower values lead to lower latencies and potentially lower overall throughput
  # Leave these values to default in case of doubt
  queue_size_after_a2e: 1
  queue_size_after_a2f: 300
  queue_size_after_streammux: 1

  streammux:
    # Do not change this config; this is internal
    adaptive_batching: 0
    # Minimum FPS for all streams
    # Pipeline will not slow down under this value if:
    # * compute allows it
    # * upload speed of audio allows it
    # Here 40 FPS
    # Numerator for that config:
    overall_min_fps_n: 40
    # Denominator for that config:
    overall_min_fps_d: 1

a2f:
  # Remove temporal smoothing
  # used for debugging individual frames generated
  temporal_smoothing: true
  device_id: 0 # Which gpu id to use

a2e:
  inference_interval: 10
  device_id: 0 # Which gpu id to use


trt_model_generation:
  a2e:
    precision: "fp16"
    min_shape: 1
    optimal_shape: 10
    maximum_shape: 128
  a2f:
    precision: "fp16"
    min_shape: 1
    optimal_shape: 10
    maximum_shape: 128

The configuration files above show the default values used by the microservice.
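The low_fps and low_fps_max_duration_second comments in advanced_config.yaml describe a rule: stop serving a client whose FPS stays below the threshold for longer than the allowed duration. The sketch below models that rule as a small watchdog. It is a reading of the comments, not the service's implementation, and the class name, method name, and timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LowFpsWatchdog:
    low_fps: float = 29.0
    low_fps_max_duration_second: float = 7.0
    _low_since: Optional[float] = None  # time at which the FPS first dropped

    def should_disconnect(self, fps: float, now: float) -> bool:
        """Return True once the FPS has been too low for too long."""
        if fps >= self.low_fps:
            self._low_since = None  # client recovered, reset the timer
            return False
        if self._low_since is None:
            self._low_since = now
        return (now - self._low_since) > self.low_fps_max_duration_second


if __name__ == "__main__":
    watchdog = LowFpsWatchdog()
    for second, fps in [(0.0, 25), (5.0, 25), (7.5, 25)]:
        print(second, watchdog.should_disconnect(fps, now=second))
```

Under this reading, a single recovery to at least low_fps resets the timer, so only a sustained slowdown ends the session.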

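To apply your own configuration, start the A2F-3D NIM with a custom entrypoint and mount your configuration files inside the container. See the following sections of this guide for detailed instructions.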

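How to Use the Configuration Files

To override any configuration, mount your files into the container as a Docker volume at the /mnt/configs path. For convenience, export an environment variable named LOCAL_CONFIGS that holds the host path of the overriding configuration files, for example: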

$ mkdir -p ~/.cache/audio2face-3d-configs
$ export LOCAL_CONFIGS=~/.cache/audio2face-3d-configs

Copy the configuration files shown above and place them in the LOCAL_CONFIGS directory.

You will then have:

$ ls $LOCAL_CONFIGS
advanced_config.yaml
claire_stylization_config.yaml
deployment_config.yaml
james_stylization_config.yaml
mark_stylization_config.yaml
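To confirm that the directory matches the listing above before mounting it, a short check such as the following may help. This is a convenience sketch and not part of the NIM tooling; the function name is hypothetical, and the file list is taken from the ls output.

```python
import os
from pathlib import Path
from typing import List

# File names from the directory listing in this guide.
EXPECTED_CONFIGS = [
    "advanced_config.yaml",
    "claire_stylization_config.yaml",
    "deployment_config.yaml",
    "james_stylization_config.yaml",
    "mark_stylization_config.yaml",
]


def missing_configs(config_dir: str) -> List[str]:
    """Return the expected configuration files that are not in `config_dir`."""
    directory = Path(config_dir).expanduser()
    return [name for name in EXPECTED_CONFIGS if not (directory / name).is_file()]


if __name__ == "__main__":
    local_configs = os.environ.get("LOCAL_CONFIGS", "~/.cache/audio2face-3d-configs")
    print("Missing files:", missing_configs(local_configs))
```

Because configuration files work by overriding defaults, a missing file is not necessarily an error; the check only reports what is absent.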

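Model Caching

You can cache the models locally so that they do not have to be generated or downloaded the next time you run the service. To cache the models, use a Docker volume mount. Make sure the local path has read, write, and execute permissions (mode 777). The following commands set up a local cache path with the correct permissions: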

$ mkdir -p ~/.cache/audio2face-3d
$ chmod 777 ~/.cache/audio2face-3d
$ export LOCAL_NIM_CACHE=~/.cache/audio2face-3d
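The same setup can be scripted, for example in a provisioning step. The sketch below creates the directory, applies mode 777 explicitly, and verifies the result. The function names are hypothetical. The explicit chmod matters because the mode passed to mkdir is filtered by the process umask.

```python
import os
import stat
from pathlib import Path


def prepare_cache_dir(path: str) -> Path:
    """Create the cache directory and give it read/write/execute for everyone."""
    cache = Path(path).expanduser()
    cache.mkdir(parents=True, exist_ok=True)
    os.chmod(cache, 0o777)  # explicit chmod is not affected by the umask
    return cache


def is_world_accessible(path) -> bool:
    """Check that all nine rwx permission bits are set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o777 == 0o777


if __name__ == "__main__":
    cache_dir = prepare_cache_dir("~/.cache/audio2face-3d")
    print(cache_dir, is_world_accessible(cache_dir))
```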

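On subsequent runs, if you want to use the NIM entrypoint instead of the custom entrypoint, make sure to disable the model download by setting the environment variable NIM_DISABLE_MODEL_DOWNLOAD=true.

Launching the A2F-3D NIM with a Custom Entrypoint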

$ docker run -it --rm --name audio2face-3d \
 --gpus all \
 --network=host \
 --entrypoint /bin/bash -w /opt/nvidia/a2f_pipeline \
 -e NIM_DISABLE_MODEL_DOWNLOAD=true \
 -e NIM_SKIP_A2F_START=true \
 -v "$LOCAL_NIM_CACHE:/tmp/a2x" \
 -v "$LOCAL_CONFIGS:/mnt/configs/" \
 nvcr.io/nim/nvidia/audio2face-3d:1.2

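The command above creates a Docker container from the Audio2Face-3D NIM image. Note that --gpus all is specified to expose the GPUs to the Docker container. You can adjust this option to your needs.

We also use --network=host to bind all ports on the local network. For fine-grained control over port binding, use the -p option with the appropriate ports instead.

Note the two volume mounts: -v "$LOCAL_NIM_CACHE:/tmp/a2x" caches the models, and -v "$LOCAL_CONFIGS:/mnt/configs/" overrides the configuration. Skip the first mount if you do not want to cache the models, and skip the second if you do not want to change the default configuration.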
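The optional parts of this command (the two volume mounts, and --network=host versus -p) can be made explicit by assembling the argument list programmatically. The sketch below only builds the list and does not run Docker. The function name is hypothetical, and the ports argument is a placeholder: this guide does not list the service ports, so the caller must supply them.

```python
import shlex
from typing import List, Optional, Sequence


def build_docker_run(
    image: str = "nvcr.io/nim/nvidia/audio2face-3d:1.2",
    cache_dir: Optional[str] = None,
    configs_dir: Optional[str] = None,
    ports: Optional[Sequence[int]] = None,
    gpus: str = "all",
) -> List[str]:
    """Assemble the `docker run` arguments used in this guide."""
    args = ["docker", "run", "-it", "--rm", "--name", "audio2face-3d", "--gpus", gpus]
    if ports:
        # Fine-grained port binding instead of host networking.
        for port in ports:
            args += ["-p", f"{port}:{port}"]
    else:
        args.append("--network=host")
    args += [
        "--entrypoint", "/bin/bash", "-w", "/opt/nvidia/a2f_pipeline",
        "-e", "NIM_DISABLE_MODEL_DOWNLOAD=true",
        "-e", "NIM_SKIP_A2F_START=true",
    ]
    if cache_dir:  # skip when the models should not be cached
        args += ["-v", f"{cache_dir}:/tmp/a2x"]
    if configs_dir:  # skip when the default configuration is kept
        args += ["-v", f"{configs_dir}:/mnt/configs/"]
    args.append(image)
    return args


if __name__ == "__main__":
    print(shlex.join(build_docker_run(cache_dir="/path/to/cache")))
```

Printing the command with shlex.join makes it easy to review before passing the list to subprocess.run.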

You should then land in a shell:

triton-server@host-name:/opt/nvidia/a2f_pipeline$

Inside the container, start the NIM server by running:

$ /opt/nim/start_server.sh &

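The & operator starts the server as a background process, so you can run other commands inside the container. If you are not returned to the shell prompt immediately, press Enter to get the prompt back and continue.

The commands below are run inside the container unless stated otherwise.

Generating the TRT Engines

The first step is to generate the TRT engines for the AI models, which are specific to the GPU of your machine, using the provided Python application: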

usage: generate_trt_models.py [-h] [--stylization-config STYLIZATION_CONFIG] [--advanced-config ADVANCED_CONFIG]

Generates TRT models for A2F Service.

options:
  -h, --help            show this help message and exit
  --stylization-config  STYLIZATION_CONFIG
                        file path to the stylization config
  --advanced-config     ADVANCED_CONFIG
                        file path to the advanced config

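If you want to keep the default configuration values, you do not need to specify any option.

Note

You can back up the generated TRT engines to skip model generation at NIM startup, but keep in mind that each engine is specific to a GPU. The generated models are located in the /tmp/a2x directory inside the Docker container.

To generate the Audio2Emotion and Audio2Face TRT engines with the default configuration: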

$ ./service/generate_trt_models.py

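The TRT engines must be regenerated whenever the deployment environment changes. This is especially true when the GPU changes, including a different architecture or compute capability. Generated TRT engines may be reusable on machines with exactly the same controlled configuration (same hardware and Docker image). We recommend regenerating the TRT engines after every hardware change.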
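One way to apply this recommendation is to store a short description of the environment next to the cached engines and compare it on each start. The sketch below shows that bookkeeping only. The function names and the marker file are hypothetical, and what goes into the fingerprint string is left to the caller (for instance the output of nvidia-smi -L together with the image tag, which is an assumption and not something the NIM records itself).

```python
from pathlib import Path


def needs_regeneration(fingerprint: str, marker_file: str) -> bool:
    """Return True when no fingerprint is stored or it differs from the current one."""
    marker = Path(marker_file)
    if not marker.is_file():
        return True
    return marker.read_text().strip() != fingerprint.strip()


def record_fingerprint(fingerprint: str, marker_file: str) -> None:
    """Store the fingerprint after the engines have been generated."""
    Path(marker_file).write_text(fingerprint.strip() + "\n")


if __name__ == "__main__":
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "environment_fingerprint.txt")  # hypothetical file
        print(needs_regeneration("example-gpu", marker))  # nothing stored yet
        record_fingerprint("example-gpu", marker)
        print(needs_regeneration("example-gpu", marker))
```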

Launching the Service

The second step is to launch the service. The help menu of the Audio2Face-3D service looks like this:

$ a2f_pipeline.run -h
Usage: a2f_pipeline.run [--help] [--version] [--stylization-config] [--deployment-config] [--advanced-config]

Optional arguments:
  -h, --help                     shows help message and exits
  -v, --version                  prints version information and exits
  --stylization-config           file path to the stylization config
  --deployment-config            file path to the deployment config
  --advanced-config              file path to the advanced config

To use the default configuration, simply run the following inside the container:

$ /usr/local/bin/a2f_pipeline.run

You should see a log line like the following, indicating that the A2F-3D service has started correctly.

[2024-04-23 12:44:33.066] [  global  ] [info] Running...

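Note

When launching the service, you may see warnings labeled GStreamer-WARNING. They appear because some libraries are missing from the container. These warnings can be safely ignored, because Audio2Face-3D does not use those libraries.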

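Changing the Configuration - Shortest Path

The commands below are run inside the container.

Suppose you decide to use the Claire model. You can run the following commands: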

$ ./service/generate_trt_models.py --stylization-config /mnt/configs/claire_stylization_config.yaml \
   --advanced-config /mnt/configs/advanced_config.yaml
$ a2f_pipeline.run --stylization-config /mnt/configs/claire_stylization_config.yaml \
                   --deployment-config /mnt/configs/deployment_config.yaml \
                   --advanced-config /mnt/configs/advanced_config.yaml

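Warning

The current ./service/generate_trt_models.py does not support cache invalidation. If you update a configuration file and want to regenerate the models, you need to delete the corresponding TRT models from the /tmp/a2x/ cache folder.

You will then have a running container that uses the custom parameters you provided.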
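Because the script does not invalidate its cache, stale engines have to be removed by hand before regenerating. The sketch below deletes the files in the cache folder that match a pattern supplied by the caller. The function name is hypothetical, and this guide does not document how the cached engine files are named, so list the contents of /tmp/a2x first and choose the pattern from what you see.

```python
from pathlib import Path
from typing import List


def clear_cached_models(cache_dir: str, pattern: str) -> List[str]:
    """Delete the files in `cache_dir` matching `pattern`; return their names."""
    removed = []
    for path in sorted(Path(cache_dir).glob(pattern)):
        if path.is_file():  # leave sub-directories untouched
            path.unlink()
            removed.append(path.name)
    return removed


if __name__ == "__main__":
    cache = Path("/tmp/a2x")
    if cache.is_dir():
        # Inspect first; call clear_cached_models() once the pattern is known.
        print(sorted(p.name for p in cache.iterdir()))
```

The pattern is a required argument on purpose, so that nothing is deleted without a deliberate choice.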

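Changing the Configuration - Flexible Way

Configuration files work by overriding values. This means you do not have to specify default values in your configuration file, so your file only needs to contain a subset of the default configuration file.

In addition, for the a2f section of the stylization config: specifying inference_model_id automatically loads the default face parameters matching that ID, and specifying blendshape_id automatically loads the default blendshape parameters.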
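The override behavior can be pictured as a recursive merge of your partial file onto the defaults. The sketch below illustrates those semantics with plain dictionaries. It is not the service's actual merge code, the function name is hypothetical, and it does not model the automatic loading of defaults triggered by inference_model_id or blendshape_id.

```python
import copy
from typing import Any, Dict


def merge_overrides(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `defaults` with the values from `overrides` applied."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section: merge key by key instead of replacing it.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    defaults = {"a2e": {"enabled": True, "live_transition_time": 0.5}}
    overrides = {"a2e": {"enabled": False}}
    print(merge_overrides(defaults, overrides))
```

Only a2e.enabled changes in this example; live_transition_time keeps its default because the override file does not mention it.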

The following examples illustrate this.

Example 1: Set the Stylization Config to Use Mark

On the host, create a file named short_mark_stylization_config.yaml in the $LOCAL_CONFIGS directory and add the following lines:

a2f:
  inference_model_id: mark_v2.3
  blendshape_id: mark_topo1_v2.1

Then, inside the container, run:

$ ./service/generate_trt_models.py --stylization-config /mnt/configs/short_mark_stylization_config.yaml
$ a2f_pipeline.run --stylization-config /mnt/configs/short_mark_stylization_config.yaml

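Warning

The current ./service/generate_trt_models.py does not support cache invalidation. If you update a configuration file and want to regenerate the models, you need to delete the corresponding TRT models from the /tmp/a2x/ cache folder.

These commands have exactly the same effect as providing the full default Mark configuration file. The reason is that, under the hood, inference_model_id and blendshape_id are used to load those default values.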

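Example 2: Change the Endpoint Type to Unidirectional

Here we override a setting from deployment_config.yaml.

On the host, create a file named unidirectional_deployment_config.yaml in the $LOCAL_CONFIGS directory and add the following lines: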

endpoints:
  use_bidirectional: false

Then, inside the container, run:

$ ./service/generate_trt_models.py
$ a2f_pipeline.run --deployment-config /mnt/configs/unidirectional_deployment_config.yaml

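This overrides the endpoint type from bidirectional to unidirectional.

This approach works for any key of the provided YAML files.

Warning

Make sure to use the option that matches your configuration file.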

--stylization-config # for the <any>_stylization_config.yaml
--deployment-config # for the deployment_config.yaml
--advanced-config # for the advanced_config.yaml
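If your files keep the naming used in this guide, the matching option can be derived from the file name, which helps avoid passing a file to the wrong flag. The sketch below relies on that naming convention, which is an assumption; the function name is hypothetical, and files named differently are rejected and not guessed.

```python
from pathlib import Path

# Suffixes follow the file names used in this guide.
SUFFIX_TO_OPTION = {
    "stylization_config.yaml": "--stylization-config",
    "deployment_config.yaml": "--deployment-config",
    "advanced_config.yaml": "--advanced-config",
}


def option_for(config_path: str) -> str:
    """Return the command-line option matching a configuration file name."""
    name = Path(config_path).name
    for suffix, option in SUFFIX_TO_OPTION.items():
        if name.endswith(suffix):
            return option
    raise ValueError(f"Cannot tell which option matches {name!r}")


if __name__ == "__main__":
    for path in ("/mnt/configs/short_mark_stylization_config.yaml",
                 "/mnt/configs/unidirectional_deployment_config.yaml"):
        print(option_for(path), path)
```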

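Advanced Stylization

The blendshape tuning shown in the stylization configs above has been simplified for new users.

For advanced users, the following section provides more information.

Advanced Blendshape Tuning

Three more parameters can be set for blendshape tuning:

active_poses: which blendshapes should be active. 1 means active; 0 means inactive.
cancel_poses: which blendshapes cancel each other out. Matching numbers identify the blendshapes that pair up; -1 means no-op.
symmetry_poses: which blendshapes are symmetric with one another. Matching numbers identify the blendshapes that pair up; -1 means no-op.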
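The "matching numbers" convention of cancel_poses and symmetry_poses can be checked by grouping blendshape names by their number and ignoring -1. The sketch below does only that grouping; the function name is hypothetical, and how the service uses the resulting pairs is not described here.

```python
from collections import defaultdict
from typing import Dict, List


def group_poses(pose_ids: Dict[str, int]) -> Dict[int, List[str]]:
    """Group blendshape names that share the same non-negative number."""
    groups = defaultdict(list)
    for name, group_id in pose_ids.items():
        if group_id != -1:  # -1 means the blendshape takes no part
            groups[group_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    symmetry_poses = {
        "EyeBlinkLeft": 0,
        "EyeBlinkRight": 0,
        "JawOpen": -1,
        "MouthSmileLeft": 2,
        "MouthSmileRight": 2,
    }
    print(group_poses(symmetry_poses))
```

A group with a single member usually points to a typo in one of the numbers.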

claire_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if it is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: claire_v2.3
  blendshape_id: claire_topo1_v2.1

  face_params:
    eyelid_offset: 0.0 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: 0.0 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.25 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: gain * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses:
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses:
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses:
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1

james_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if one is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: james_v2.3
  blendshape_id: james_topo2_v2.2

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: -0.02 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.2 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: multiplier * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses:
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses:
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses:
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1

mark_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if one is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: mark_v2.3
  blendshape_id: mark_topo1_v2.1

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.3 # Controls the magnitude of the input audio
    lip_close_offset: -0.03 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.0023 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.4 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.1 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: multiplier * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 1.0
      EyeLookInLeft: 1.0
      EyeLookOutLeft: 1.0
      EyeLookUpLeft: 1.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 1.0
      EyeLookInRight: 1.0
      EyeLookOutRight: 1.0
      EyeLookUpRight: 1.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 1.0
      JawLeft: 1.0
      JawRight: 1.0
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 1.0
      MouthRight: 1.0
      MouthSmileLeft: 1.0
      MouthSmileRight: 1.0
      MouthFrownLeft: 1.0
      MouthFrownRight: 1.0
      MouthDimpleLeft: 1.0
      MouthDimpleRight: 1.0
      MouthStretchLeft: 1.0
      MouthStretchRight: 1.0
      MouthRollLower: 1.0
      MouthRollUpper: 1.0
      MouthShrugLower: 1.0
      MouthShrugUpper: 1.0
      MouthPressLeft: 1.0
      MouthPressRight: 1.0
      MouthLowerDownLeft: 1.0
      MouthLowerDownRight: 1.0
      MouthUpperUpLeft: 1.0
      MouthUpperUpRight: 1.0
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 1.0
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 1.0
      NoseSneerRight: 1.0
      TongueOut: 1.0

    weight_offsets:
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses:
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses:
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses:
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1

Configuration Files for Unreal Engine MetaHuman#

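If you plan to connect A2F-3D to a MetaHuman character, use the configuration files adapted for MetaHuman. Compared with the default configuration files, the only changes are the blendshape multipliers and offsets.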
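The comments in these files describe each output weight as `blendshape_values * weight_multipliers + weight_offsets`. The following is a minimal Python sketch of that formula, not the service's actual implementation. The function name `apply_blendshape_params` is hypothetical, and the `[0.0, 1.0]` clamping range is an assumption about what `enable_clamping_bs_weight` does. The multiplier values come from the MetaHuman James configuration.

```python
from typing import Dict


def apply_blendshape_params(
    weights: Dict[str, float],
    multipliers: Dict[str, float],
    offsets: Dict[str, float],
    clamp: bool = False,
) -> Dict[str, float]:
    """Return weights adjusted as w * multiplier + offset (hypothetical helper)."""
    adjusted = {}
    for name, w in weights.items():
        # A missing multiplier or offset falls back to the identity transform.
        value = w * multipliers.get(name, 1.0) + offsets.get(name, 0.0)
        if clamp:
            # Assumption: clamping restricts the weight to [0.0, 1.0].
            value = min(1.0, max(0.0, value))
        adjusted[name] = value
    return adjusted


# Raw weights are made-up sample values for illustration only.
raw = {"JawOpen": 0.5, "MouthSmileLeft": 0.5, "EyeLookDownLeft": 0.3}
multipliers = {"JawOpen": 0.8, "MouthSmileLeft": 1.2, "EyeLookDownLeft": 0.0}
offsets = {"JawOpen": 0.0, "MouthSmileLeft": 0.0, "EyeLookDownLeft": 0.0}

result = apply_blendshape_params(raw, multipliers, offsets)
for pose, value in result.items():
    print(f"{pose}: {value:.2f}")
```

In this reading, a multiplier of `0.0` (as used for the `EyeLook*` poses) mutes a pose, while values above `1.0` exaggerate it.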

MetaHuman stylization configuration files

claire_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if one is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: claire_v2.3
  blendshape_id: claire_topo1_v2.1

  face_params:
    eyelid_offset: 0.0 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: 0.0 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.25 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: multiplier * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 0.7
      JawLeft: 0.2
      JawRight: 0.2
      JawOpen: 1.0
      MouthClose: 1.0
      MouthFunnel: 1.2
      MouthPucker: 1.2
      MouthLeft: 0.2
      MouthRight: 0.2
      MouthSmileLeft: 0.8
      MouthSmileRight: 0.8
      MouthFrownLeft: 0.4
      MouthFrownRight: 0.4
      MouthDimpleLeft: 0.7
      MouthDimpleRight: 0.7
      MouthStretchLeft: 0.1
      MouthStretchRight: 0.1
      MouthRollLower: 0.9
      MouthRollUpper: 0.5
      MouthShrugLower: 0.9
      MouthShrugUpper: 0.4
      MouthPressLeft: 0.8
      MouthPressRight: 0.8
      MouthLowerDownLeft: 0.8
      MouthLowerDownRight: 0.8
      MouthUpperUpLeft: 0.8
      MouthUpperUpRight: 0.8
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 0.2
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 0.8
      NoseSneerRight: 0.8
      TongueOut: 0.0

    weight_offsets: # Modulates the effect of each blendshape: blendshape_values * weight_multipliers + weight_offsets
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses: # Defines which poses are active and which are not
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses: # Define which poses cancel each other
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses: # Define which poses are symmetric to each other
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1

james_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if one is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: james_v2.3
  blendshape_id: james_topo2_v2.2

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.0 # Controls the magnitude of the input audio
    lip_close_offset: -0.02 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.006 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.2 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.0 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: multiplier * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 0.7
      JawLeft: 0.2
      JawRight: 0.2
      JawOpen: 0.8
      MouthClose: 0.3
      MouthFunnel: 1.0
      MouthPucker: 1.0
      MouthLeft: 0.2
      MouthRight: 0.2
      MouthSmileLeft: 1.2
      MouthSmileRight: 1.2
      MouthFrownLeft: 0.5
      MouthFrownRight: 0.5
      MouthDimpleLeft: 0.8
      MouthDimpleRight: 0.8
      MouthStretchLeft: 0.05
      MouthStretchRight: 0.05
      MouthRollLower: 0.8
      MouthRollUpper: 0.5
      MouthShrugLower: 1.0
      MouthShrugUpper: 0.4
      MouthPressLeft: 0.8
      MouthPressRight: 0.8
      MouthLowerDownLeft: 0.8
      MouthLowerDownRight: 0.8
      MouthUpperUpLeft: 0.8
      MouthUpperUpRight: 0.8
      BrowDownLeft: 1.2
      BrowDownRight: 1.2
      BrowInnerUp: 1.3
      BrowOuterUpLeft: 0.8
      BrowOuterUpRight: 0.8
      CheekPuff: 0.2
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 0.8
      NoseSneerRight: 0.8
      TongueOut: 0.0

    weight_offsets: # Modulates the effect of each blendshape: blendshape_values * weight_multipliers + weight_offsets
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses: # Defines which poses are active and which are not
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses: # Define which poses cancel each other
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses: # Define which poses are symmetric to each other
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1

mark_stylization_config.yaml

# These are the default emotions applied at the beginning of any audio clip.
# Their values range from 0.0 to 1.0
default_beginning_emotions:
  amazement: 0.0
  anger: 0.0
  cheekiness: 0.0
  disgust: 0.0
  fear: 0.0
  grief: 0.0
  joy: 0.0
  outofbreath: 0.0
  pain: 0.0
  sadness: 0.0

a2e:
  enabled: true
  live_transition_time: 0.5
  post_processing_params:
    emotion_contrast: 1.0 # Increases the spread between emotion values by pushing them higher or lower
    emotion_strength: 0.6 # Sets the strength of generated emotions relative to neutral emotion
    enable_preferred_emotion: true # Activate blending preferred emotion with auto-emotion
    live_blend_coef: 0.7 # Coefficient for exponential smoothing of emotion
    max_emotions: 3 # Sets a firm limit on the quantity of emotion sliders engaged by A2E - emotions with the highest weight will be prioritized
    preferred_emotion_strength: 0.5 # Sets the strength of the preferred emotion (if one is loaded) relative to generated emotions

a2f:
  # A2F model, can be one of james_v2.3, claire_v2.3 or mark_v2.3
  inference_model_id: mark_v2.3
  blendshape_id: mark_topo1_v2.1

  face_params:
    eyelid_offset: 0.06 # Adjusts the default pose of eyelid open-close
    face_mask_level: 0.6 # Determines the boundary between the upper and lower regions of the face
    face_mask_softness: 0.0085 # Determines how smoothly the upper and lower face regions blend on the boundary
    input_strength: 1.3 # Controls the magnitude of the input audio
    lip_close_offset: -0.03 # Adjusts the default pose of lip close-open
    lower_face_smoothing: 0.0023 # Applies temporal smoothing to the lower face motion
    lower_face_strength: 1.4 # Controls the range of motion on the lower regions of the face
    skin_strength: 1.1 # Controls the range of motion of the skin
    upper_face_smoothing: 0.001 # Applies temporal smoothing to the upper face motion
    upper_face_strength: 1.0 # Controls the range of motion on the upper regions of the face

  blendshape_params: # Modulates the effect of each blendshape: gain * w + offset
    enable_clamping_bs_weight: false

    weight_multipliers:
      EyeBlinkLeft: 1.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 1.0
      EyeWideLeft: 1.0
      EyeBlinkRight: 1.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 1.0
      EyeWideRight: 1.0
      JawForward: 0.7
      JawLeft: 0.2
      JawRight: 0.2
      JawOpen: 1.0
      MouthClose: 0.2
      MouthFunnel: 1.2
      MouthPucker: 1.2
      MouthLeft: 0.2
      MouthRight: 0.2
      MouthSmileLeft: 0.8
      MouthSmileRight: 0.8
      MouthFrownLeft: 0.5
      MouthFrownRight: 0.5
      MouthDimpleLeft: 0.8
      MouthDimpleRight: 0.8
      MouthStretchLeft: 0.05
      MouthStretchRight: 0.05
      MouthRollLower: 0.8
      MouthRollUpper: 0.5
      MouthShrugLower: 0.9
      MouthShrugUpper: 0.4
      MouthPressLeft: 0.8
      MouthPressRight: 0.8
      MouthLowerDownLeft: 0.8
      MouthLowerDownRight: 0.8
      MouthUpperUpLeft: 0.8
      MouthUpperUpRight: 0.8
      BrowDownLeft: 1.0
      BrowDownRight: 1.0
      BrowInnerUp: 1.0
      BrowOuterUpLeft: 1.0
      BrowOuterUpRight: 1.0
      CheekPuff: 0.2
      CheekSquintLeft: 1.0
      CheekSquintRight: 1.0
      NoseSneerLeft: 0.8
      NoseSneerRight: 0.8
      TongueOut: 0.0

    weight_offsets:  # Modulates the effect of each blendshape: blendshape_values * weight_multipliers + weight_offsets
      EyeBlinkLeft: 0.0
      EyeLookDownLeft: 0.0
      EyeLookInLeft: 0.0
      EyeLookOutLeft: 0.0
      EyeLookUpLeft: 0.0
      EyeSquintLeft: 0.0
      EyeWideLeft: 0.0
      EyeBlinkRight: 0.0
      EyeLookDownRight: 0.0
      EyeLookInRight: 0.0
      EyeLookOutRight: 0.0
      EyeLookUpRight: 0.0
      EyeSquintRight: 0.0
      EyeWideRight: 0.0
      JawForward: 0.0
      JawLeft: 0.0
      JawRight: 0.0
      JawOpen: 0.0
      MouthClose: 0.0
      MouthFunnel: 0.0
      MouthPucker: 0.0
      MouthLeft: 0.0
      MouthRight: 0.0
      MouthSmileLeft: 0.0
      MouthSmileRight: 0.0
      MouthFrownLeft: 0.0
      MouthFrownRight: 0.0
      MouthDimpleLeft: 0.0
      MouthDimpleRight: 0.0
      MouthStretchLeft: 0.0
      MouthStretchRight: 0.0
      MouthRollLower: 0.0
      MouthRollUpper: 0.0
      MouthShrugLower: 0.0
      MouthShrugUpper: 0.0
      MouthPressLeft: 0.0
      MouthPressRight: 0.0
      MouthLowerDownLeft: 0.0
      MouthLowerDownRight: 0.0
      MouthUpperUpLeft: 0.0
      MouthUpperUpRight: 0.0
      BrowDownLeft: 0.0
      BrowDownRight: 0.0
      BrowInnerUp: 0.0
      BrowOuterUpLeft: 0.0
      BrowOuterUpRight: 0.0
      CheekPuff: 0.0
      CheekSquintLeft: 0.0
      CheekSquintRight: 0.0
      NoseSneerLeft: 0.0
      NoseSneerRight: 0.0
      TongueOut: 0.0

    active_poses: # Define which poses are active and which ones are not
      EyeBlinkLeft: 1
      EyeLookDownLeft: 0
      EyeLookInLeft: 0
      EyeLookOutLeft: 0
      EyeLookUpLeft: 0
      EyeSquintLeft: 1
      EyeWideLeft: 1
      EyeBlinkRight: 1
      EyeLookDownRight: 0
      EyeLookInRight: 0
      EyeLookOutRight: 0
      EyeLookUpRight: 0
      EyeSquintRight: 1
      EyeWideRight: 1
      JawForward: 1
      JawLeft: 1
      JawRight: 1
      JawOpen: 1
      MouthClose: 1
      MouthFunnel: 1
      MouthPucker: 1
      MouthLeft: 1
      MouthRight: 1
      MouthSmileLeft: 1
      MouthSmileRight: 1
      MouthFrownLeft: 1
      MouthFrownRight: 1
      MouthDimpleLeft: 1
      MouthDimpleRight: 1
      MouthStretchLeft: 1
      MouthStretchRight: 1
      MouthRollLower: 1
      MouthRollUpper: 1
      MouthShrugLower: 1
      MouthShrugUpper: 1
      MouthPressLeft: 1
      MouthPressRight: 1
      MouthLowerDownLeft: 1
      MouthLowerDownRight: 1
      MouthUpperUpLeft: 1
      MouthUpperUpRight: 1
      BrowDownLeft: 1
      BrowDownRight: 1
      BrowInnerUp: 1
      BrowOuterUpLeft: 1
      BrowOuterUpRight: 1
      CheekPuff: 1
      CheekSquintLeft: 1
      CheekSquintRight: 1
      NoseSneerLeft: 1
      NoseSneerRight: 1
      TongueOut: 0

    cancel_poses: # Define which poses cancel each other
      EyeBlinkLeft: -1
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: -1
      EyeBlinkRight: -1
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: -1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: -1
      MouthSmileRight: -1
      MouthFrownLeft: -1
      MouthFrownRight: -1
      MouthDimpleLeft: -1
      MouthDimpleRight: -1
      MouthStretchLeft: -1
      MouthStretchRight: -1
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: -1
      MouthPressRight: -1
      MouthLowerDownLeft: -1
      MouthLowerDownRight: -1
      MouthUpperUpLeft: -1
      MouthUpperUpRight: -1
      BrowDownLeft: -1
      BrowDownRight: -1
      BrowInnerUp: -1
      BrowOuterUpLeft: -1
      BrowOuterUpRight: -1
      CheekPuff: -1
      CheekSquintLeft: -1
      CheekSquintRight: -1
      NoseSneerLeft: -1
      NoseSneerRight: -1
      TongueOut: -1

    symmetry_poses: # Define which poses are symmetric to each other
      EyeBlinkLeft: 0
      EyeLookDownLeft: -1
      EyeLookInLeft: -1
      EyeLookOutLeft: -1
      EyeLookUpLeft: -1
      EyeSquintLeft: -1
      EyeWideLeft: 1
      EyeBlinkRight: 0
      EyeLookDownRight: -1
      EyeLookInRight: -1
      EyeLookOutRight: -1
      EyeLookUpRight: -1
      EyeSquintRight: -1
      EyeWideRight: 1
      JawForward: -1
      JawLeft: -1
      JawRight: -1
      JawOpen: -1
      MouthClose: -1
      MouthFunnel: -1
      MouthPucker: -1
      MouthLeft: -1
      MouthRight: -1
      MouthSmileLeft: 2
      MouthSmileRight: 2
      MouthFrownLeft: 3
      MouthFrownRight: 3
      MouthDimpleLeft: 4
      MouthDimpleRight: 4
      MouthStretchLeft: 5
      MouthStretchRight: 5
      MouthRollLower: -1
      MouthRollUpper: -1
      MouthShrugLower: -1
      MouthShrugUpper: -1
      MouthPressLeft: 6
      MouthPressRight: 6
      MouthLowerDownLeft: 7
      MouthLowerDownRight: 7
      MouthUpperUpLeft: 8
      MouthUpperUpRight: 8
      BrowDownLeft: 10
      BrowDownRight: 10
      BrowInnerUp: -1
      BrowOuterUpLeft: 9
      BrowOuterUpRight: 9
      CheekPuff: -1
      CheekSquintLeft: 11
      CheekSquintRight: 11
      NoseSneerLeft: 12
      NoseSneerRight: 12
      TongueOut: -1
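
The comment on blendshape_params describes a per-pose linear adjustment: blendshape_values * weight_multipliers + weight_offsets. The following Python sketch illustrates that arithmetic only; it is not the service's implementation. The function name apply_blendshape_params, the fallback values for missing poses, and the [0.0, 1.0] clamping range are assumptions made for illustration, since the configuration only exposes enable_clamping_bs_weight as an on/off flag.

```python
from typing import Dict


def apply_blendshape_params(
    raw_weights: Dict[str, float],
    multipliers: Dict[str, float],
    offsets: Dict[str, float],
    clamp: bool = False,
) -> Dict[str, float]:
    """Return blendshape_values * weight_multipliers + weight_offsets per pose.

    Poses missing from `multipliers` or `offsets` fall back to 1.0 and 0.0,
    which is an assumption made for this sketch.
    """
    result: Dict[str, float] = {}
    for pose, value in raw_weights.items():
        weight = value * multipliers.get(pose, 1.0) + offsets.get(pose, 0.0)
        if clamp:
            # Assumed clamping range; the config only shows the boolean flag.
            weight = min(1.0, max(0.0, weight))
        result[pose] = weight
    return result


# Multipliers and offsets copied from mark_stylization_config.yaml.
multipliers = {"JawOpen": 1.0, "MouthFunnel": 1.2, "MouthStretchLeft": 0.05}
offsets = {"JawOpen": 0.0, "MouthFunnel": 0.0, "MouthStretchLeft": 0.0}

# Hypothetical raw model outputs for three poses.
raw = {"JawOpen": 0.5, "MouthFunnel": 0.9, "MouthStretchLeft": 0.4}

adjusted = apply_blendshape_params(raw, multipliers, offsets)
clamped = apply_blendshape_params(raw, multipliers, offsets, clamp=True)
```

With these inputs, the MouthFunnel multiplier of 1.2 pushes the weight above 1.0 unless clamping is enabled, while the MouthStretchLeft multiplier of 0.05 nearly suppresses that pose.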

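Parameter Tuning Guide#

Audio2Face-3D imports inference parameters from several sources: the inference model SDK, the deployment-time configuration files, and runtime inputs. In general, deployment-time parameters override matching parameters from the model files, and runtime parameters override both deployment-time and model default parameters.

For runtime parameters, see AudioStreamHeader as well as FaceParameters, BlendShapeParameters, EmotionParameters, and EmotionPostProcessingParameters for the proto definitions.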
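
The general precedence rule (runtime over deployment-time over model defaults) can be pictured as a layered lookup. The sketch below is illustrative only and does not reflect how the service resolves parameters internally. The model default values and the runtime value are hypothetical, and the sketch assumes all three sources use the same snake_case names, whereas the real runtime messages use a different naming convention.

```python
from collections import ChainMap

# Hypothetical model defaults, for illustration only.
model_defaults = {
    "lower_face_strength": 1.0,
    "skin_strength": 1.0,
    "input_strength": 1.0,
}

# Two values taken from mark_stylization_config.yaml (deployment time).
deployment_config = {"lower_face_strength": 1.4, "skin_strength": 1.1}

# Hypothetical value sent by a client at runtime.
runtime_input = {"lower_face_strength": 1.25}

# ChainMap searches its mappings from left to right, so the leftmost
# mapping that defines a key wins.
effective = ChainMap(runtime_input, deployment_config, model_defaults)
resolved = dict(effective)

for name in sorted(resolved):
    print(f"{name} = {resolved[name]}")
```

Here lower_face_strength comes from the runtime input, skin_strength from the deployment-time file, and input_strength from the model defaults.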

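Environment Variables#

The following table describes the environment variables that can be passed to the Audio2Face-3D NIM as -e arguments added to the docker run command.

Variable	Required	Values	Notes
NGC_API_KEY	No	Any string representing a valid NGC API key	Only required if you want to download TRT engines from NGC. You must set this variable to the value of your personal NGC API key.
NIM_LOGGING_JSONL	No	true / false	Enables (true) or disables (false) JSON Lines formatted logging to stdout.
NIM_MANIFEST_PROFILE	No	Any valid manifest profile string	Selects the manifest profile ID for your GPU from the Supported Models page.
NIM_DISABLE_MODEL_DOWNLOAD	No	true / false	Disables (true) or enables (false) automatic download of TRT engines from NGC.
NIM_SKIP_A2F_START	No	true / false	If set to true, the A2F-3D service is not started when the container starts.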
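
As a rough illustration of how the table maps to command-line arguments, the Python sketch below builds the -e portion of a docker run command and rejects values the table does not allow. The helper name env_flags is hypothetical. Passing -e NAME without a value relies on Docker's standard behavior of copying that variable from the calling environment, which keeps the API key out of the argument list.

```python
from typing import Dict, List, Optional

# Variable names and allowed values are taken from the table above.
BOOLEAN_VARIABLES = {
    "NIM_LOGGING_JSONL",
    "NIM_DISABLE_MODEL_DOWNLOAD",
    "NIM_SKIP_A2F_START",
}
STRING_VARIABLES = {"NGC_API_KEY", "NIM_MANIFEST_PROFILE"}


def env_flags(settings: Dict[str, Optional[str]]) -> List[str]:
    """Turn a mapping of variables into `-e` arguments for `docker run`.

    A value of None emits `-e NAME` without a value, which tells Docker to
    copy the variable from the calling environment.
    """
    flags: List[str] = []
    for name, value in settings.items():
        if name not in BOOLEAN_VARIABLES | STRING_VARIABLES:
            raise ValueError(f"{name} is not listed in the table")
        if value is None:
            flags += ["-e", name]
            continue
        if name in BOOLEAN_VARIABLES and value not in ("true", "false"):
            raise ValueError(f"{name} must be 'true' or 'false', got {value!r}")
        flags += ["-e", f"{name}={value}"]
    return flags


flags = env_flags(
    {
        "NGC_API_KEY": None,  # forwarded from the host environment
        "NIM_LOGGING_JSONL": "true",
        "NIM_SKIP_A2F_START": "false",
    }
)
print(" ".join(flags))
# prints: -e NGC_API_KEY -e NIM_LOGGING_JSONL=true -e NIM_SKIP_A2F_START=false
```

The resulting list can be spliced into a docker run argument list before the image name.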

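Volumes#

The following table describes the paths inside the container onto which local paths can be mounted. For example, you can mount a volume with the following docker flag: -v {LOCAL_PATH}:{PATH_IN_CONTAINER}.

Container path	Required	Notes
/tmp/a2x/	Not required, but if this volume is not mounted, the container must re-download or regenerate the models on every start	Path for the AI models. Must have execute, read, and write permissions (or mode 777).
/mnt/configs/	Only required if you want to override some configuration parameters	Path for the files used to override the configuration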
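
The following Python sketch shows one way to prepare a local model cache directory with the permissions the table asks for and to build the matching -v flags. The helper names volume_flag and prepare_model_cache and the local directory names are hypothetical. The example works in a temporary directory so it can be run safely; a real deployment would use a persistent location.

```python
import os
import stat
import tempfile
from typing import List


def volume_flag(local_path: str, container_path: str) -> List[str]:
    """Build a `-v {LOCAL_PATH}:{PATH_IN_CONTAINER}` argument pair."""
    return ["-v", f"{os.path.abspath(local_path)}:{container_path}"]


def prepare_model_cache(local_path: str) -> List[str]:
    """Create the local model cache directory and mount it at /tmp/a2x/.

    The directory is set to mode 777, as allowed by the table above.
    Note that this makes it writable by every user on the host.
    """
    os.makedirs(local_path, exist_ok=True)
    os.chmod(local_path, 0o777)
    return volume_flag(local_path, "/tmp/a2x/")


with tempfile.TemporaryDirectory() as scratch:
    cache_dir = os.path.join(scratch, "a2x_cache")
    model_flags = prepare_model_cache(cache_dir)
    cache_mode = stat.S_IMODE(os.stat(cache_dir).st_mode)

    # Optional mount, only needed when overriding configuration files.
    config_flags = volume_flag(os.path.join(scratch, "configs"), "/mnt/configs/")

    print(" ".join(model_flags + config_flags))
```

Keeping the cache directory across container restarts avoids re-downloading or regenerating the models each time.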

Quick Deployment of Audio2Face-3D Microservices#

Instead of deploying Audio2Face-3D and launching the models manually, you can deploy them quickly using the docker-compose file, following the quick-start instructions provided in the NVIDIA Audio2Face-3D Samples repository.