Managing NIM Services as a NIM Pipeline

About NIM Pipelines

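As an alternative to managing NIM services individually with multiple NIMService custom resources, you can manage multiple NIM services with a single NIMPipeline custom resource.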

The following sample manifest deploys only the NIM for LLMs microservice.

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/meta/llama-3.1-8b-instruct
          tag: 1.3.3
          pullPolicy: IfNotPresent
          pullSecrets:
          - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: meta-llama3-8b-instruct
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000

Refer to the following table for information about the commonly modified fields.

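Field	Description	Default Value
`spec.services.enabled`	When set to `true`, the Operator deploys the NIM service.	`false`
`spec.services.name`	Specifies the name of the NIM service.	None
`spec.services.spec`	Specifies the `NIMService` custom resource specification that represents the NIM microservice.	None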
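
As a minimal sketch of how the `enabled` field described in the table can be used, the following fragment keeps two service entries in one pipeline but asks the Operator to deploy only the first. The service names are taken from the manifests in this document. The nested `spec` bodies are left out here and are assumed to be complete NIMService specifications.

```yaml
# Illustrative fragment of a NIMPipeline manifest, not a complete manifest.
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true     # the Operator deploys this NIM service
      spec:
        # ... NIMService specification, as in the manifests in this document ...
    - name: nv-embedqa-e5-v5
      enabled: false    # the entry stays in the pipeline but is not deployed
      spec:
        # ... NIMService specification, as in the manifests in this document ...
```

Because `enabled` defaults to `false`, omitting the field should have the same effect as the second entry.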

Prerequisites

A NIM cache for each NIM microservice. Alternatively, you can specify a PVC in the spec.storage.pvc field of each NIM service specification.
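
A minimal sketch of the PVC alternative, assuming a PVC already exists in the target namespace. The PVC name `my-model-store` is hypothetical, and the `name` key under `pvc` is an assumption. Check the NIMService reference for the exact fields that `spec.storage.pvc` accepts.

```yaml
# Illustrative fragment of one entry under spec.services, not a complete manifest.
- name: meta-llama3-8b-instruct
  enabled: true
  spec:
    storage:
      pvc:
        name: my-model-store   # hypothetical pre-existing PVC; field name is assumed
```

In this sketch, the `pvc` block takes the place of the `nimCache` block used in the manifests in this document.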

Procedure

Create a file, such as pipeline-all.yaml, with contents like the following example:

apiVersion: apps.nvidia.com/v1alpha1
kind: NIMPipeline
metadata:
  name: pipeline-all
spec:
  services:
    - name: meta-llama3-8b-instruct
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/meta/llama-3.1-8b-instruct
          tag: 1.3.3
          pullPolicy: IfNotPresent
          pullSecrets:
          - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: meta-llama3-8b-instruct
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
    - name: nv-embedqa-e5-v5
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2
          tag: 1.3.1
          pullPolicy: IfNotPresent
          pullSecrets:
          - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: nv-embedqa-e5-v5
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000
    - name: nv-rerank-mistral-4b-v3
      enabled: true
      spec:
        image:
          repository: nvcr.io/nim/nvidia/llama-3.2-nv-rerankqa-1b-v2
          tag: 1.3.1
          pullPolicy: IfNotPresent
          pullSecrets:
          - ngc-secret
        authSecret: ngc-api-secret
        storage:
          nimCache:
            name: nv-rerankqa-mistral-4b-v3
            profile: ''
        replicas: 1
        resources:
          limits:
            nvidia.com/gpu: 1
        expose:
          service:
            type: ClusterIP
            port: 8000

Apply the manifest:

$ kubectl apply -n nim-service -f pipeline-all.yaml

Optional: View information about the pipeline:

$ kubectl describe nimpipelines.apps.nvidia.com -n nim-service

Refer to Verification to confirm that the NIM for LLMs microservice is available.

Deleting a NIM Pipeline

To delete a pipeline and remove the resources and objects associated with its services, perform the following steps:

View the pipeline custom resources:

$ kubectl get nimpipelines.apps.nvidia.com -A

Example Output

NAMESPACE    NAME          STATUS
nim-service  pipeline-all  deployed

Delete the custom resource:

$ kubectl delete nimpipelines.apps.nvidia.com -n nim-service pipeline-all

Next Steps

Deploy an application that uses the NIM services, such as the sample RAG application.