Advanced Usage

Installer Flags and Commands

  • Available flags

    -h, --help

    -v, --version

  • Available commands

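    • Completion - Generates the autocompletion script for ./cnpctl_Linux_x86_64 for the specified shell. See each sub-command's help for details on how to use the generated script.

      Usage

      ./cnpctl_Linux_x86_64 completion [command]
      

      Available commands

      bash - Generates the autocompletion script for bash

      fish - Generates the autocompletion script for fish

      powershell - Generates the autocompletion script for powershell

      zsh - Generates the autocompletion script for zsh

      Flags

      -h, --help - Help for completion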

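    • Create/Install - Creates the NVIDIA Cloud Native Platform.

      -d, --directory - String. If non-empty, working files are written to this directory. (Default: ".")

      -f, --filename - String. Path to the file that contains the configuration to apply.

      -h, --help - Help for create

      --kubeconfig - String. Path to the kubeconfig file to use for CLI requests. Unless a location is given with this flag, the installer first checks the KUBECONFIG environment variable and then falls back to the default location, $HOME/.kube/config.

      -v, --verbose - Enables more detailed logging for debugging.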

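    • Delete - Deletes the NVIDIA Cloud Native Platform.

      Usage

      ./cnpctl_Linux_x86_64 delete [flags]
      

      Aliases

      delete, destroy

      Flags

      -d, --directory - String. If non-empty, working files are written to this directory. (Default: ".")

      -h, --help - Help for delete

      --kubeconfig - String. Path to the kubeconfig file to use for CLI requests. Unless a location is given with this flag, the installer first checks the KUBECONFIG environment variable and then falls back to the default location, $HOME/.kube/config.

      -v, --verbose - Increases verbosity.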
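The flags above can be combined in a small wrapper script. The following is a minimal sketch, not part of the installer: the function names (resolve_kubeconfig, cnp_completion, cnp_create, cnp_delete) and the CNPCTL variable are hypothetical, and the subcommand spelling "create" is an assumption made by analogy with the documented "delete" usage. resolve_kubeconfig only mirrors the documented lookup order (flag, then KUBECONFIG, then $HOME/.kube/config) for illustration.

```shell
# Path to the installer binary; override CNPCTL if it lives elsewhere.
CNPCTL="${CNPCTL:-./cnpctl_Linux_x86_64}"

# Hypothetical helper: returns the kubeconfig path in the documented order.
# $1 is an explicit path (may be empty), then KUBECONFIG, then the default.
resolve_kubeconfig() {
    explicit_path="${1:-}"
    if [ -n "$explicit_path" ]; then
        printf '%s\n' "$explicit_path"
    elif [ -n "${KUBECONFIG:-}" ]; then
        printf '%s\n' "$KUBECONFIG"
    else
        printf '%s\n' "$HOME/.kube/config"
    fi
}

# Usage: cnp_completion <bash|fish|powershell|zsh> <output file>
cnp_completion() {
    case "$1" in
        bash|fish|powershell|zsh) "$CNPCTL" completion "$1" > "$2" ;;
        *) echo "unsupported shell: $1" >&2; return 1 ;;
    esac
}

# Usage: cnp_create <config file> [working directory] [kubeconfig path]
# Assumption: the subcommand is spelled "create".
cnp_create() {
    config_file="$1"
    work_dir="${2:-.}"
    "$CNPCTL" create -f "$config_file" -d "$work_dir" \
        --kubeconfig "$(resolve_kubeconfig "${3:-}")" -v
}

# Usage: cnp_delete [working directory] [kubeconfig path]
cnp_delete() {
    work_dir="${1:-.}"
    "$CNPCTL" delete -d "$work_dir" \
        --kubeconfig "$(resolve_kubeconfig "${2:-}")" -v
}
```

For example, cnp_create config.yaml ./work would run the installer with the given configuration file, write working files to ./work, and pass whichever kubeconfig path the lookup order selects.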

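Configuration YAML Customization

The CNPack installer can be configured at install time with a configuration file. This file lets you enable or disable every component of the platform and configure each one to suit different use cases.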

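Note

The configuration file is currently not checked for dependencies. If a component that another component requires is disabled, the installation will fail.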
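As an illustration of disabling a single component, the sketch below writes a small configuration file from a shell function. The function name write_config is hypothetical. The file relies on two assumptions: that omitted component blocks fall back to their documented default (enabled), and that no other component requires Grafana, which matches the dependency comments in the full configuration reference.

```shell
# Usage: write_config <output file>
# Writes a configuration that sets the required platform values and
# disables Grafana. Assumption: omitted blocks keep their defaults.
write_config() {
    cat > "$1" <<'EOF'
apiVersion: v1alpha1
kind: nvidiaplatform
spec:
    platform:
        wildcardDomain: "*.my-cluster.my-domain.com"
        externalPort: 443
    grafana:
        enabled: False
EOF
}
```

Calling write_config config.yaml produces a file that can then be passed to the installer with the -f flag.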

The configuration file below is a YAML file structured like a Kubernetes resource. It lists every configuration option along with documentation on how to use it.

apiVersion: v1alpha1
kind: nvidiaplatform
spec:
    # The platform block contains general configuration that is important to all components
    platform:
        # Required value specifying the wildcard domain to configure for ingress.
        # Quoted because an unquoted leading "*" is parsed as a YAML alias.
        wildcardDomain: "*.my-cluster.my-domain.com"
        # Required value to specify the port to configure for ingress.
        externalPort: 443
        # Optional infrastructure provider configuration for AWS EKS
        eks:
            # The region in which the cluster is installed.
            region: us-west-1

    # The ingress block configures the ingress controller
    ingress:
        # Whether this component should be enabled. Default is True.
        enabled: True

    # The postgres block configures the postgres operator
    postgres:
        # Whether this component should be enabled. Default is True.
        enabled: True

    # The certManager block configures the certificate management system
    certManager:
        # Whether this component should be enabled. Default is True.
        enabled: True
        # Optional configuration for the AWS Private CA service integration.
        #
        # Dependencies:
        #   - EKS Infrastructure provider configuration (spec.platform.eks)
        awsPCA:
            # Whether this component should be enabled. Default is True.
            enabled: True
            # The ARN required to communicate with the AWS Private CA service.
            arn: ...
            # The common name of the configured Private CA.
            commonName: my-cert.my-domain.com
            # The domain name of the configured Private CA.
            domainName: my-domain.com

    # The trustManager block configures the trust bundle management system
    #
    # Dependencies:
    #   - cert-manager
    trustManager:
        # Whether this component should be enabled. Default is True.
        enabled: True

    # The keycloak block configures Keycloak as an OIDC provider
    #
    # Dependencies:
    #   - cert-manager
    #   - postgres
    #   - ingress
    keycloak:
        # Whether this component should be enabled. Default is True.
        enabled: True
        # The persistent volume claim spec options used to request database storage. All Kubernetes PVC spec values are supported, but only the most typical are shown here.
        databaseStorage:
            # The access modes supported by your storage provider.
            accessModes:
                - ReadWriteOnce
            # The volume mode supported by your storage provider.
            volumeMode: Filesystem
            # The amount of storage requested.
            resources:
                requests:
                    storage: 10G
            # The name of your storage class.
            storageClassName: local-path
        # Optional value to override the hostname used to expose keycloak.
        customHostname: my-host.my-cluster.my-domain.com
        # Optional value to set the initial admin password to a specified value. By default, a random password will be generated.
        initialAdminPassword: My-Secret-Password-1

    # The prometheus block configures the Prometheus metrics service
    #
    # Dependencies:
    #   - cert-manager
    prometheus:
        # Whether this component should be enabled. Default is True.
        enabled: True
        # The persistent volume claim spec options used to request Prometheus storage. All Kubernetes PVC spec values are supported, but only the most typical are shown here.
        databaseStorage:
            # The access modes supported by your storage provider.
            accessModes:
                - ReadWriteOnce
            # The volume mode supported by your storage provider.
            volumeMode: Filesystem
            # The amount of storage requested.
            resources:
                requests:
                    storage: 10G
            # The name of your storage class.
            storageClassName: local-path
        # Optional configuration for connecting Prometheus to an AWS Managed Prometheus instance.
        awsRemoteWrite:
            # The URL of the AWS Managed Prometheus service.
            url: https://...
            # The ARN required to communicate with the AWS Managed Prometheus service.
            arn: ...

    # The grafana block configures the Grafana dashboard service
    #
    # Dependencies:
    #   - prometheus
    #   - cert-manager
    #   - ingress
    grafana:
        # Whether this component should be enabled. Default is True.
        enabled: True
        # Optional value to override the hostname used to expose grafana.
        customHostname: my-host.my-cluster.my-domain.com

    # The elastic block configures the Elastic Cloud on Kubernetes operator
    elastic:
        # Whether this component should be enabled. Default is True.
        enabled: True

    # The fluentbit block configures the fluentbit log aggregation service
    #
    # Dependencies:
    #   - Infrastructure provider configuration (spec.platform.eks)
    fluentbit:
        # Whether this component should be enabled. Default is True.
        enabled: True
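Because the installer does not check dependencies, it can help to check them by hand before installing. The sketch below encodes the component dependencies listed in the comments of the configuration above and reports any that are missing from a list of enabled components. The function names deps_of and check_deps are hypothetical, the list of enabled components is passed in manually rather than parsed from the YAML file, and "cert-manager" is read here as the certManager block. The dependencies of awsPCA and fluentbit on spec.platform.eks are infrastructure settings rather than components and are not covered.

```shell
# Dependencies as listed in the comments of the sample configuration.
deps_of() {
    case "$1" in
        trustManager) echo "certManager" ;;
        keycloak)     echo "certManager postgres ingress" ;;
        prometheus)   echo "certManager" ;;
        grafana)      echo "prometheus certManager ingress" ;;
        *)            echo "" ;;
    esac
}

# Usage: check_deps <enabled component>...
# Prints one line per missing dependency; returns non-zero if any is missing.
check_deps() {
    enabled=" $* "
    status=0
    for component in "$@"; do
        for dep in $(deps_of "$component"); do
            case "$enabled" in
                *" $dep "*) ;;  # dependency is enabled
                *) echo "$component requires $dep"; status=1 ;;
            esac
        done
    done
    return "$status"
}
```

For example, check_deps ingress postgres certManager keycloak succeeds silently, while leaving out postgres would report that keycloak requires it.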

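Ingress Controller Default Certificate Configuration

As part of the HAProxy Ingress Controller deployment, a secret named nvidia-ingress-kubernetes-ingress-default-cert is created in the nvidia-platform namespace. It contains the TLS certificate and TLS key for the wildcard domain. This certificate can be replaced with a signed certificate of your choice, issued for the wildcard domain *.my-cluster.my-domain.com.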
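One possible way to replace the certificate is with kubectl, sketched below. This is an assumption about the workflow and not a procedure stated by the installer documentation: the function name replace_default_cert and the KUBECTL variable are hypothetical, and the certificate and key file paths are supplied by the caller. The secret name and namespace are the ones given above.

```shell
# Command used to talk to the cluster; override KUBECTL if needed.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: replace_default_cert <certificate file> <private key file>
replace_default_cert() {
    cert_file="$1"
    key_file="$2"
    # Render the TLS secret locally, then apply it so that the existing
    # secret of the same name is updated in place.
    "$KUBECTL" create secret tls nvidia-ingress-kubernetes-ingress-default-cert \
        --namespace nvidia-platform \
        --cert="$cert_file" --key="$key_file" \
        --dry-run=client -o yaml |
        "$KUBECTL" apply -f -
}
```

For example, replace_default_cert tls.crt tls.key would update the secret from the two files in the current directory. Whether the ingress controller picks up the new certificate without a restart is not covered here.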