Running on Kubernetes

Install Docker

First, set up the repository.

Update the apt package index:

sudo apt-get update

Install the packages that allow apt to use a repository over HTTPS:

sudo apt-get install -y \
  apt-transport-https \
  ca-certificates \
  curl \
  gnupg-agent \
  software-properties-common

Next, add Docker's official GPG key:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Verify that you now have the key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88 by searching for the last 8 characters of the fingerprint:

sudo apt-key fingerprint 0EBFCD88

pub   rsa4096 2017-02-22 [SCEA]
    9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]
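
The fingerprint comparison can also be scripted instead of being checked by eye. The following is a minimal sketch and not part of the original procedure: the helper names `normalize_fpr` and `fingerprint_matches` are hypothetical, and the sketch only compares two strings after stripping whitespace.

```shell
# Expected fingerprint, copied from the verification step above.
EXPECTED_FPR="9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88"

# Hypothetical helper: strip all whitespace and upper-case the hex digits.
normalize_fpr() {
  printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Hypothetical helper: succeed only if both fingerprints are identical
# once normalized.
fingerprint_matches() {
  [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

# Example: the fingerprint line as printed by apt-key (note the double space).
if fingerprint_matches "9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88" "$EXPECTED_FPR"; then
  echo "fingerprint OK"
else
  echo "fingerprint MISMATCH"
fi
```

In practice the first argument would be the fingerprint line taken from the `apt-key fingerprint` output.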

Set up the stable repository:

sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"

To install Docker Engine - Community, first update the apt package index:

sudo apt-get update

Install Docker Engine:

sudo apt-get install -y docker-ce=5:19.03.12~3-0~ubuntu-bionic docker-ce-cli=5:19.03.12~3-0~ubuntu-bionic containerd.io

Verify that Docker Engine - Community is installed correctly by running the hello-world image:

sudo docker run hello-world

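For more information on installing Docker, refer to the official Docker installation documentation.

Install Kubernetes

Before starting the installation, make sure Docker is started and enabled: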

sudo systemctl start docker && sudo systemctl enable docker

Run the following commands to add the apt key:

sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo mkdir -p /etc/apt/sources.list.d/

Create kubernetes.list:

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

Now run the following commands to install kubelet, kubeadm, and kubectl:

sudo apt-get update
sudo apt-get install -y -q kubelet=1.21.1-00 kubectl=1.21.1-00 kubeadm=1.21.1-00
sudo apt-mark hold kubelet kubeadm kubectl

Reload the systemd manager configuration:

sudo systemctl daemon-reload

Disable swap

sudo swapoff -a
sudo nano /etc/fstab

Note

Add a # before every line that starts with /swap. The # marks the line as a comment. The result should look like this:

UUID=e879fda9-4306-4b5b-8512-bba726093f1d / ext4 defaults 0 0
UUID=DCD4-535C /boot/efi vfat defaults 0 0
#/swap.img       none    swap    sw      0       0
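
The manual edit above can also be done non-interactively. The following is a minimal sketch under stated assumptions: the helper name `comment_out_swap` is hypothetical, and it treats any line that is not already commented and has `swap` as a whitespace-delimited field as a swap entry, which is slightly broader than "lines that start with /swap". It prints the result instead of modifying the file.

```shell
# Hypothetical helper: print the fstab file given as $1 with every active
# swap entry commented out. Lines that already start with '#' are left
# unchanged, as are entries without a "swap" field.
comment_out_swap() {
  sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Demonstration on a temporary copy that mimics the example above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
UUID=e879fda9-4306-4b5b-8512-bba726093f1d / ext4 defaults 0 0
UUID=DCD4-535C /boot/efi vfat defaults 0 0
/swap.img       none    swap    sw      0       0
EOF
comment_out_swap "$sample"
rm -f "$sample"
```

To apply this to the real file, one option is to back it up first with `sudo cp /etc/fstab /etc/fstab.bak` and then run `comment_out_swap /etc/fstab.bak | sudo tee /etc/fstab`.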

Initialize the Kubernetes cluster to run as a control-plane node

Run the following command:

sudo kubeadm init --pod-network-cidr=192.168.0.0/16

Output:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

    export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join <your-host-IP>:6443 --token 489oi5.sm34l9uh7dk4z6cm \
        --discovery-token-ca-cert-hash sha256:17165b6c4a4b95d73a3a2a83749a957a10161ae34d2dfd02cd730597579b4b34

Following the instructions in the output, run the commands shown below:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Use the following command to install a pod network add-on on the control-plane node. Calico is used as the pod network add-on here:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Run the following command to confirm that all pods are up and running:

kubectl get pods --all-namespaces

Output:

NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-65b8787765-bjc8h   1/1     Running   0          2m8s
kube-system   calico-node-c2tmk                          1/1     Running   0          2m8s
kube-system   coredns-5c98db65d4-d4kgh                   1/1     Running   0          9m8s
kube-system   coredns-5c98db65d4-h6x8m                   1/1     Running   0          9m8s
kube-system   etcd-#yourhost                             1/1     Running   0          8m25s
kube-system   kube-apiserver-#yourhost                   1/1     Running   0          8m7s
kube-system   kube-controller-manager-#yourhost          1/1     Running   0          8m3s
kube-system   kube-proxy-6sh42                           1/1     Running   0          9m7s
kube-system   kube-scheduler-#yourhost                   1/1     Running   0          8m26s
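
Scanning the STATUS column by eye becomes tedious as the pod list grows. The following is a minimal sketch of a filter for the listing above, not part of the original procedure: the helper name `unhealthy_pods` is hypothetical, the pod `example-pending-pod` is made up for illustration, and the sketch assumes the column layout of `kubectl get pods --all-namespaces` shown above (NAMESPACE, NAME, READY, STATUS).

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces` output on
# stdin and print NAMESPACE/NAME for every pod that is not healthy.
# Healthy here means STATUS is Completed, or STATUS is Running with all
# containers ready (READY column of the form n/n).
unhealthy_pods() {
  awk 'NR > 1 {
    split($3, ready, "/")
    if ($4 == "Completed") next
    if ($4 == "Running" && ready[1] == ready[2]) next
    print $1 "/" $2
  }'
}

# Demonstration on sample text; the second pod is a made-up example.
unhealthy_pods <<'EOF'
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-node-c2tmk                          1/1     Running   0          2m8s
kube-system   example-pending-pod                        0/1     Pending   0          5s
EOF
# prints: kube-system/example-pending-pod
```

Against a live cluster the same filter would be used as `kubectl get pods --all-namespaces | unhealthy_pods`; empty output means every pod is Running or Completed.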

The get nodes command shows that the control-plane node is up and ready:

kubectl get nodes

Output:

NAME             STATUS   ROLES                  AGE   VERSION
#yourhost        Ready    control-plane,master   10m   v1.21.1

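By default, the cluster does not schedule pods on the control-plane node. Because this is a single-node Kubernetes cluster, pods must run on that node, so remove the taint by running the following command: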

kubectl taint nodes --all node-role.kubernetes.io/master-

Install Helm

Run the following commands to download and install Helm 3.5.4:

wget https://get.helm.sh/helm-v3.5.4-linux-amd64.tar.gz
tar -zxvf helm-v3.5.4-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm
rm -rf helm-v3.5.4-linux-amd64.tar.gz linux-amd64/
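
The commands above hard-code the Helm version and the amd64 architecture. The following is a minimal sketch of how the download URL could be derived instead. The helper names `helm_arch` and `helm_url` are hypothetical, and the sketch assumes the release file naming used in the `wget` command above.

```shell
# Hypothetical helper: map `uname -m` output to the architecture suffix
# used in Helm release file names.
helm_arch() {
  case "$1" in
    x86_64) echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the download URL for a Helm version ($1)
# and architecture suffix ($2).
helm_url() {
  printf 'https://get.helm.sh/helm-v%s-linux-%s.tar.gz\n' "$1" "$2"
}

HELM_VERSION="3.5.4"
helm_url "$HELM_VERSION" "$(helm_arch "$(uname -m)")"
```

The printed URL can then be passed to `wget` in place of the literal one.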

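For more information, refer to helm/helm and https://helm.kubernetes.ac.cn/docs/using_helm/#installing-helm.

NVIDIA Network Operator

Prerequisites

Note

If no Mellanox NIC is connected to your node, skip this section and continue with the GPU Operator installation.

The following instructions assume that a Mellanox NIC is connected to your machine.

Run the following command to verify that the Mellanox NICs are enabled on your machine: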

lspci | grep -i "Mellanox"

Output:

0c:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0c:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

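Run the following command to find out which Mellanox device is active:

Note

Use the device that shows Link detected: yes in the subsequent steps. The following command works only if the NIC was added before the operating system was installed.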

for device in `sudo lshw -class network -short | grep -i ConnectX | awk '{print $2}' | egrep -v 'Device|path' | sed '/^$/d'`;do echo -n $device; sudo ethtool $device | grep -i "Link detected"; done

Output:

ens160f0        Link detected: yes
ens160f1        Link detected: no
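
The active device name can be extracted from this output rather than copied by hand. The following is a minimal sketch, assuming the two-column format shown above. The helper name `active_devices` is hypothetical.

```shell
# Hypothetical helper: read the "device  Link detected: yes|no" listing on
# stdin and print only the names of devices whose link is detected.
active_devices() {
  awk '/Link detected: yes/ { print $1 }'
}

# Demonstration on the sample output shown above.
active_devices <<'EOF'
ens160f0        Link detected: yes
ens160f1        Link detected: no
EOF
# prints: ens160f0
```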

Create a custom Network Operator values.yaml:

nano network-operator-values.yaml

Add the following content, setting devices to the active Mellanox device found by the command above:

deployCR: true
ofedDriver:
  deploy: true
nvPeerDriver:
  deploy: true
rdmaSharedDevicePlugin:
  deploy: true
  resources:
    - name: rdma_shared_device_a
      vendors: [15b3]
      devices: [ens160f0]
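
The values file can also be generated from the device name instead of being edited in nano. The following is a minimal sketch: the helper name `write_network_operator_values` is hypothetical, and the nesting of the keys is an assumption reconstructed from the values shown above.

```shell
# Hypothetical helper: print a Network Operator values file on stdout,
# using $1 as the active Mellanox device name.
write_network_operator_values() {
  cat <<EOF
deployCR: true
ofedDriver:
  deploy: true
nvPeerDriver:
  deploy: true
rdmaSharedDevicePlugin:
  deploy: true
  resources:
    - name: rdma_shared_device_a
      vendors: [15b3]
      devices: [$1]
EOF
}

# Demonstration: render the file for the device used in this guide.
write_network_operator_values ens160f0
```

Redirecting the output, for example `write_network_operator_values ens160f0 > network-operator-values.yaml`, produces the file that the Helm install step expects.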

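For more information about the custom Network Operator values.yaml, refer to the Network Operator documentation.

Add the NVIDIA Helm repository (registered here under the name mellanox):

Note

Installing the Network Operator requires Helm.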

helm repo add mellanox https://mellanox.github.io/network-operator

Update the Helm repositories:

helm repo update

Install the NVIDIA Network Operator

Run the following commands:

kubectl label nodes --all node-role.kubernetes.io/master- --overwrite
helm install -f ./network-operator-values.yaml -n network-operator --create-namespace --wait network-operator mellanox/network-operator

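Verify the status of the Network Operator

Note that installing the Network Operator can take a few minutes. How long it takes depends on your internet speed.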

kubectl get pods --all-namespaces | egrep 'network-operator|nvidia-network-operator-resources'

NAMESPACE                           NAME                                                              READY   STATUS      RESTARTS   AGE
network-operator                    network-operator-547cb8d999-mn2h9                                 1/1     Running     0          17m
network-operator                    network-operator-node-feature-discovery-master-596fb8b7cb-qrmvv   1/1     Running     0          17m
network-operator                    network-operator-node-feature-discovery-worker-qt5xt              1/1     Running     0          17m
nvidia-network-operator-resources   cni-plugins-ds-dl5vl                                              1/1     Running     0          17m
nvidia-network-operator-resources   kube-multus-ds-w82rv                                              1/1     Running     0          17m
nvidia-network-operator-resources   mofed-ubuntu20.04-ds-xfpzl                                        1/1     Running     0          17m
nvidia-network-operator-resources   rdma-shared-dp-ds-2hgb6                                           1/1     Running     0          17m
nvidia-network-operator-resources   sriov-device-plugin-ch7bz                                         1/1     Running     0          10m
nvidia-network-operator-resources   whereabouts-56ngr                                                 1/1     Running     0          10m

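For more information, refer to the Network Operator page.

NVIDIA GPU Operator

NVIDIA AI Enterprise customers have access to a pre-configured GPU Operator in the NVIDIA NGC catalog. The GPU Operator is pre-configured to simplify provisioning for NVIDIA AI Enterprise deployments.

The pre-configured GPU Operator differs from the GPU Operator in the public NGC catalog in the following ways:

It is configured to use a pre-built vGPU driver image (available only to NVIDIA AI Enterprise customers).
It is configured to use the NVIDIA License System (NLS).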

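Install the GPU Operator

Note

The GPU Operator with NVIDIA AI Enterprise requires some tasks to be completed before installation. Refer to the NVIDIA AI Enterprise documentation for instructions before running the commands below.

Tip

An installation script for the NVIDIA GPU Operator is also available.

Add the NVIDIA AI Enterprise Helm repository, where api-key is the NGC API key that you generated for accessing the NVIDIA Enterprise Collection: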

helm repo add nvaie https://helm.ngc.nvidia.com/nvaie --username='$oauthtoken' --password=api-key && helm repo update

helm install --wait --generate-name nvaie/gpu-operator -n gpu-operator

License the GPU Operator

Copy the NLS license token into a file named client_configuration_token.tok.
Create an empty gridd.conf file:

touch gridd.conf

Create a ConfigMap for NLS licensing:

kubectl create configmap licensing-config -n gpu-operator --from-file=./gridd.conf --from-file=./client_configuration_token.tok

Create a Kubernetes Secret to access the NGC registry:

kubectl create secret docker-registry ngc-secret --docker-server="nvcr.io/nvaie" --docker-username='$oauthtoken' --docker-password='<YOUR API KEY>' --docker-email='<YOUR EMAIL>' -n gpu-operator

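GPU Operator with RDMA (optional)

Prerequisites

Install the Network Operator first to ensure that the MOFED drivers are installed.

After the NVIDIA Network Operator installation is complete, run the following command to install the GPU Operator so that it loads the nv_peer_mem module: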

helm install --wait gpu-operator nvaie/gpu-operator -n gpu-operator --set driver.rdma.enabled=true

Verify the Network Operator with GPUDirect RDMA

Run the following command to list the Mellanox NICs and their status:

kubectl exec -it $(kubectl get pods -n nvidia-network-operator-resources | grep mofed | awk '{print $1}') -n nvidia-network-operator-resources -- ibdev2netdev

Output:

mlx5_0 port 1 ==> ens192f0 (Up)
mlx5_1 port 1 ==> ens192f1 (Down)

Edit networkdefinition.yaml:

nano networkdefinition.yaml

Create the network definition for IPAM, replacing ens192f0 in the master field with the active Mellanox device:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: rdma/rdma_shared_device_a
  name: rdma-net-ipam
  namespace: default
spec:
  config: |-
    {
        "cniVersion": "0.3.1",
        "name": "rdma-net-ipam",
        "plugins": [
            {
                "ipam": {
                    "datastore": "kubernetes",
                    "kubernetes": {
                        "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
                    },
                    "log_file": "/tmp/whereabouts.log",
                    "log_level": "debug",
                    "range": "192.168.111.0/24",
                    "type": "whereabouts"
                },
                "type": "macvlan",
                "master": "ens192f0",
                "vlan": 111
            },
            {
                "mtu": 1500,
                "type": "tuning"
            }
        ]
    }

Note

If you do not have a VLAN-based network on your high-performance side, set "vlan" to 0.
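
The two site-specific values in the network definition, "master" and "vlan", can be filled in from the ibdev2netdev output instead of being edited by hand. The following is a minimal sketch, not part of the original procedure: the helper names `first_up_interface` and `set_master_and_vlan` are hypothetical, and the sketch assumes the ibdev2netdev output format and the JSON key spelling shown above.

```shell
# Hypothetical helper: print the first interface reported as "(Up)" in
# ibdev2netdev output read from stdin.
first_up_interface() {
  awk '$NF == "(Up)" { print $(NF - 1); exit }'
}

# Hypothetical helper: rewrite the "master" and "vlan" values of the
# network definition read from stdin. $1 = interface name, $2 = VLAN ID.
set_master_and_vlan() {
  sed -E \
    -e "s/\"master\": *\"[^\"]*\"/\"master\": \"$1\"/" \
    -e "s/\"vlan\": *[0-9]+/\"vlan\": $2/"
}

# Demonstration on the sample ibdev2netdev output and two definition lines.
iface="$(printf '%s\n' \
  'mlx5_0 port 1 ==> ens192f0 (Up)' \
  'mlx5_1 port 1 ==> ens192f1 (Down)' | first_up_interface)"
printf '%s\n' '"master": "ens192f0",' '"vlan": 111' | set_master_and_vlan "$iface" 0
```

Applied to the whole file, this would look like `set_master_and_vlan "$iface" 0 < networkdefinition.yaml > networkdefinition.new.yaml`, writing to a new file so the original stays available for comparison.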

Verify the status of the GPU Operator

Note that installing the GPU Operator can take a few minutes. How long it takes depends on your internet speed.

kubectl get pods --all-namespaces | grep -v kube-system

Output:

NAMESPACE                NAME                                                              READY   STATUS      RESTARTS   AGE
default                  gpu-operator-1622656274-node-feature-discovery-master-5cddq96gq   1/1     Running     0          2m39s
default                  gpu-operator-1622656274-node-feature-discovery-worker-wr88v       1/1     Running     0          2m39s
default                  gpu-operator-7db468cfdf-mdrdp                                     1/1     Running     0          2m39s
gpu-operator-resources   gpu-feature-discovery-g425f                                       1/1     Running     0          2m20s
gpu-operator-resources   nvidia-container-toolkit-daemonset-mcmxj                          1/1     Running     0          2m20s
gpu-operator-resources   nvidia-cuda-validator-s6x2p                                       0/1     Completed   0          48s
gpu-operator-resources   nvidia-dcgm-exporter-wtxnx                                        1/1     Running     0          2m20s
gpu-operator-resources   nvidia-dcgm-jbz94                                                 1/1     Running     0          2m20s
gpu-operator-resources   nvidia-device-plugin-daemonset-hzzdt                              1/1     Running     0          2m20s
gpu-operator-resources   nvidia-device-plugin-validator-9nkxq                              0/1     Completed   0          17s
gpu-operator-resources   nvidia-driver-daemonset-kt8g5                                     1/1     Running     0          2m20s
gpu-operator-resources   nvidia-operator-validator-cw4j5                                   1/1     Running     0          2m20s

For more information, refer to the GPU Operator page on NGC.