Installing the Container Runtime

Added in version 2.0.

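NVIDIA AI Enterprise provides a set of containers for running AI/ML and data science workloads; the software is packaged and delivered as containers. The container runtime used on Ubuntu is Docker, and the container runtime used on RHEL is Podman.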

Installing Docker for Ubuntu

Refer to "Install Docker Engine on Ubuntu" in the Docker Documentation for the latest installation steps for Ubuntu.

Installing Podman for Red Hat Enterprise Linux

To install Podman, follow the official instructions for your supported Linux distribution. For convenience, the steps below cover installing Podman on RHEL 8.

  1. On RHEL 8, check whether the container-tools module is available with the following command.

    sudo dnf module list | grep container-tools
    
  2. The command should return output similar to the following.

    container-tools      rhel8 [d]          common [d]                               Most recent (rolling) versions of podman, buildah, skopeo, runc, conmon, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and updated as frequently as every 12 weeks.
    container-tools      1.0                common [d]                               Stable versions of podman 1.0, buildah 1.5, skopeo 0.1, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and supported for 24 months.
    container-tools      2.0                common [d]                               Stable versions of podman 1.6, buildah 1.11, skopeo 0.1, runc, conmon, etc as well as dependencies such as container-selinux built and tested together, and supported as documented on the Application Stream lifecycle page.
    container-tools      rhel8 [d]          common [d]                               Most recent (rolling) versions of podman, buildah, skopeo, runc, conmon, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and updated as frequently as every 12 weeks.
    container-tools      1.0                common [d]                               Stable versions of podman 1.0, buildah 1.5, skopeo 0.1, runc, conmon, CRIU, Udica, etc as well as dependencies such as container-selinux built and tested together, and supported for 24 months.
    container-tools      2.0                common [d]                               Stable versions of podman 1.6, buildah 1.11, skopeo 0.1, runc, conmon, etc as well as dependencies such as container-selinux built and tested together, and supported as documented on the Application Stream lifecycle page.
    
  3. Install the container-tools module, which installs Podman, with the following command.

    sudo dnf module install -y container-tools
    
  4. Once Podman is installed, check the version with the following command.

    podman version
    Version:      2.2.1
    API Version:  2
    Go Version:   go1.14.7
    Built:        Mon Feb  8 21:19:06 2021
    OS/Arch:      linux/amd64
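
The version check in the last step can be folded into a small pre-flight script. The following is a minimal sketch, not part of the official procedure: `check_runtime` is a hypothetical helper name, and it only reports whether a runtime CLI is on the `PATH`, without validating its version.

```shell
#!/bin/sh
# Hypothetical helper (not part of Podman or Docker): report whether a
# container runtime CLI is available on PATH.
check_runtime() {
    runtime="$1"
    if command -v "$runtime" >/dev/null 2>&1; then
        echo "$runtime: found at $(command -v "$runtime")"
        return 0
    fi
    echo "$runtime: not found" >&2
    return 1
}

# Podman is expected on RHEL and Docker on Ubuntu, so a missing
# runtime is reported but not treated as a fatal error here.
for rt in podman docker; do
    check_runtime "$rt" || true
done
```

On a RHEL 8 host where the steps above succeeded, the `podman` line should report a path; running `podman version` afterwards then confirms the installed release.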
    

Rootless Container Setup (Optional)

Note

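Do not make this change if the user running the containers is a privileged user (for example, root); doing so causes containers that use the NVIDIA Container Toolkit to fail.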

To run rootless containers with Podman, make the following configuration change to the NVIDIA runtime.

sudo sed -i 's/^#no-cgroups = false/no-cgroups = true/;' /etc/nvidia-container-runtime/config.toml
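
To see what this substitution does before touching the real file, it can be tried on a throwaway copy. The sketch below assumes GNU `sed` (as shipped on RHEL and Ubuntu) and uses a two-line stand-in for the config file; the sample contents are an assumption for illustration, not a copy of the actual `config.toml`.

```shell
#!/bin/sh
# Work on a temporary file so /etc/nvidia-container-runtime/config.toml
# is left untouched. The contents below are an assumed stand-in.
tmp_config="$(mktemp)"
printf '%s\n' '[nvidia-container-cli]' '#no-cgroups = false' > "$tmp_config"

# Same substitution as the command above; sudo is not needed for a temp file.
sed -i 's/^#no-cgroups = false/no-cgroups = true/;' "$tmp_config"

# Show the resulting setting, then clean up.
result="$(grep '^no-cgroups' "$tmp_config")"
echo "$result"
rm -f "$tmp_config"
```

The pattern is anchored to the start of the line and matches only the commented-out default, so if the real file has that line indented or already edited, `sed` makes no change and reports no error. Running `grep no-cgroups /etc/nvidia-container-runtime/config.toml` after the real command is a simple way to confirm the setting took effect.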