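Installing Tools on Dell R750#

This chapter describes how to install the required kernel, drivers, and tools on the host. This is a one-time installation and can be skipped if the system has already been configured.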

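  • In the following sequence of steps, the target host is a Dell PowerEdge R750.

  • Depending on the release, tools installed in this section may need to be upgraded in the Installing and Upgrading Aerial cuBB section.

  • Once everything is installed and updated, refer to the cuBB Quick Start Guide to learn how to use Aerial cuBB.

Dell PowerEdge R750 Server Configuration#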

  1. Dual Intel Xeon Gold 6336Y CPU @ 2.4 GHz, 24 cores/48 threads (185W)

  2. 512GB RDIMM, 3200MT/s

  3. 1.92TB Enterprise NVMe

  4. Riser Config 2, Full Length, 4x16, 2x8 slots (PCIe Gen 4)

  5. Dual, Hot-Plug, Redundant Power Supply (1+1), 1400W or 2400W

  6. GPU Enablement

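BF3 NIC Installation#

The R750 supports PCIe 4.0 x16 in slots 2, 3, 6, and 7, and x8 in slots 4 and 5. Install the BF3 NIC as shown in the table below, and make sure the PCIe/GPU power cable is connected correctly. These are the GPU installation instructions from the Dell R750 Installation Manual.

Note: Use only the SIG_PWR_3 connector on the motherboard for PCIe/GPU power.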

NIC    Slot           PCIe/GPU Power    NUMA
BF3    7 (Riser 4)    SIG_PWR_3         1

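Configure BIOS Settings#

On first boot, change the BIOS settings in the following order. The same settings can also be changed through the BMC: Configuration → BIOS Settings

Integrated Devices: Enable Memory Mapped I/O above 4GB and change Memory Mapped I/O Base to 12TB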

../../_images/R750_BIOS_Integrated.png

System Profile Settings: Change System Profile to Performance and Workload Profile to Low Latency Optimized Profile

../../_images/R750_BIOS_System.png

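Processor Settings: Aerial CUDA-Accelerated RAN supports both HyperThreaded mode (experimental) and non-HyperThreaded mode (default). Make sure that the kernel command line and the CPU core affinity in the cuPHYController YAML match the BIOS setting.

To enable HyperThreading, enable Logical Processor. To disable HyperThreading, disable Logical Processor.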

../../_images/R750_BIOS_Processor.png

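Save the BIOS settings, then reboot the system.

Install Ubuntu 22.04 Server#

After installing Ubuntu 22.04 Server, verify the following.

Use the following command to determine whether the NIC is detected by the OS: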

$ lspci | grep -i mellanox
ca:00.0 Ethernet controller: Mellanox Technologies MT43244 BlueField-3 integrated ConnectX-7 network controller (rev 01)
ca:00.1 Ethernet controller: Mellanox Technologies MT43244 BlueField-3 integrated ConnectX-7 network controller (rev 01)
ca:00.2 DMA controller: Mellanox Technologies MT43244 BlueField-3 SoC Management Interface (rev 01)
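The detection check above can be scripted. The following is a minimal sketch, assuming the device description strings match the sample output; `count_bf3_ports` is a hypothetical helper name, not part of any NVIDIA tool. It counts the BlueField-3 Ethernet functions in `lspci` output read from standard input.

```shell
# Count BlueField-3 Ethernet functions in lspci output read from stdin.
# count_bf3_ports is a hypothetical helper name used for illustration only.
count_bf3_ports() {
    grep -i 'mellanox' | grep -c 'Ethernet controller.*BlueField-3'
}

# Usage on a live system (one dual-port BF3 NIC should yield 2):
#   lspci | count_bf3_ports
```

The DMA controller function (ca:00.2 in the sample) is deliberately not counted, because only the Ethernet functions correspond to network ports.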

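Disable Auto Upgrade#

Edit the /etc/apt/apt.conf.d/20auto-upgrades system file and change the "1" to "0" on both lines. This prevents the installed low-latency kernel version from being unintentionally changed by a subsequent software upgrade.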

$ sudo nano /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "0";
APT::Periodic::Unattended-Upgrade "0";

Disable the fwupd-refresh timer to prevent fwupdmgr from automatically checking for updates.

$ sudo systemctl mask fwupd-refresh.timer
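For unattended provisioning, the manual edit of 20auto-upgrades can be replaced by a `sed` call. This is a sketch only: `disable_auto_upgrades` is a hypothetical helper name, and it assumes GNU `sed` (as shipped with Ubuntu) and the two-line file layout shown above.

```shell
# Change every APT::Periodic value in the given file from "1" to "0".
# disable_auto_upgrades is a hypothetical helper name; requires GNU sed (-i).
disable_auto_upgrades() {
    conf_file="$1"
    sed -i 's/^\(APT::Periodic::[A-Za-z-]*\) "1";/\1 "0";/' "$conf_file"
}

# Usage on a live system (run as root):
#   disable_auto_upgrades /etc/apt/apt.conf.d/20auto-upgrades
```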

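Install the Low-Latency Kernel#

If the low-latency kernel is not installed, remove the old kernels and keep only the latest generic kernel. Enter the following command to list the installed kernels: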

$ dpkg --list | grep -i 'linux-image' | awk '/ii/{ print $2}'

# To remove old kernel
$ sudo apt-get purge linux-image-<old kernel version>
$ sudo apt-get autoremove

Install the low-latency kernel at the specific version listed in the release manifest.

$ sudo apt-get update
$ sudo apt-get install -y linux-image-5.15.0-1042-nvidia-lowlatency

Update GRUB to change the default boot kernel:

# Update grub to change the default boot kernel
$ sudo sed -i 's/^GRUB_DEFAULT=.*/GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.15.0-1042-nvidia-lowlatency"/' /etc/default/grub

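Configure the Linux Kernel Command Line#

To set kernel command-line parameters, edit the GRUB_CMDLINE_LINUX_DEFAULT parameter in the GRUB file /etc/default/grub and append or update the parameters described below. The following kernel parameters are optimized for the Xeon Gold 6336Y CPU and 512GB of memory.

To append these parameters to the GRUB file automatically, enter the command that matches your HyperThreading setting: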

# When HyperThread is disabled (default)
$ sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="[^"]*/& pci=realloc=off default_hugepagesz=1G hugepagesz=1G hugepages=16 tsc=reliable clocksource=tsc intel_idle.max_cstate=0 mce=ignore_ce processor.max_cstate=0 intel_pstate=disable audit=0 idle=poll rcu_nocb_poll nosoftlockup iommu=off irqaffinity=0-3 isolcpus=managed_irq,domain,4-47 nohz_full=4-47 rcu_nocbs=4-47 noht numa_balancing=disable/' /etc/default/grub

# When HyperThread is enabled (experimental)
$ sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="[^"]*/& pci=realloc=off default_hugepagesz=1G hugepagesz=1G hugepages=16 tsc=reliable clocksource=tsc intel_idle.max_cstate=0 mce=ignore_ce processor.max_cstate=0 intel_pstate=disable audit=0 idle=poll rcu_nocb_poll nosoftlockup iommu=off irqaffinity=0-3 isolcpus=managed_irq,domain,4-95 nohz_full=4-95 rcu_nocbs=4-95 numa_balancing=disable/' /etc/default/grub

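The CPU-core-related parameters must be adjusted to the number of CPU cores on the system. In the examples above, the value "4-47" denotes CPU cores 4 through 47; you may need to adjust this range for your hardware configuration. By default, only one DPDK thread is used. The isolated CPUs are used by the entire cuBB software stack. Use the nproc --all command to see how many cores are available. Do not use core numbers beyond the number of available cores.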
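The range arithmetic can be sketched as a small helper. `isolated_cpu_range` is a hypothetical name, and reserving the first four CPUs (0-3) for the OS and IRQs is an assumption taken from the `irqaffinity=0-3` setting in the commands above.

```shell
# Print the isolated CPU range "<first>-<last>" for a given logical CPU count.
# The first N CPUs (default 4, i.e. 0-3) stay with the OS and IRQ handling.
# isolated_cpu_range is a hypothetical helper name used for illustration only.
isolated_cpu_range() {
    total_cpus="$1"
    reserved="${2:-4}"
    if [ "$total_cpus" -le "$reserved" ]; then
        echo "error: need more than $reserved CPUs, got $total_cpus" >&2
        return 1
    fi
    echo "${reserved}-$((total_cpus - 1))"
}

# Usage on a live system:
#   isolated_cpu_range "$(nproc --all)"
```

With 48 logical CPUs this prints 4-47, and with 96 it prints 4-95, matching the two command variants above.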

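Warning

These instructions are specific to Ubuntu 22.04 with the 5.15 low-latency kernel provided by Canonical. Make sure the kernel commands provided here are suitable for your OS and kernel version, and revise these settings to match your system if necessary.

Apply the Changes and Reboot to Load the Kernel#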

$ sudo update-grub
$ sudo reboot

After rebooting, enter the following command to verify that the system has booted into the low-latency kernel:

$ uname -r
5.15.0-1042-nvidia-lowlatency

Enter this command to verify that the kernel command-line parameters are configured correctly:

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.15.0-1042-nvidia-lowlatency root=/dev/mapper/ubuntu--vg-ubuntu--lv ro pci=realloc=off default_hugepagesz=1G hugepagesz=1G hugepages=16 tsc=reliable clocksource=tsc intel_idle.max_cstate=0 mce=ignore_ce processor.max_cstate=0 intel_pstate=disable audit=0 idle=poll rcu_nocb_poll nosoftlockup iommu=off irqaffinity=0-3 isolcpus=managed_irq,domain,4-47 nohz_full=4-47 rcu_nocbs=4-47 noht numa_balancing=disable
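Comparing a long command line by eye is error-prone. The sketch below lists any expected parameter that is absent from a command-line string; `missing_kernel_params` is a hypothetical helper name, and the parameters passed to it are whatever subset you choose to check.

```shell
# Print each expected kernel parameter that is missing from a command line.
# $1 is the command line string; remaining arguments are expected parameters.
# missing_kernel_params is a hypothetical helper name used for illustration only.
missing_kernel_params() {
    cmdline="$1"
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;          # exact whole-word match found
            *) echo "$param" ;;       # not present: report it
        esac
    done
}

# Usage on a live system:
#   missing_kernel_params "$(cat /proc/cmdline)" hugepages=16 iommu=off nohz_full=4-47
```

Empty output means every listed parameter is present.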

Enter this command to verify that hugepages are enabled:

$ grep -i huge /proc/meminfo
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:      16
HugePages_Free:       16
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:        16777216 kB
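The totals in this output follow from the kernel parameters: 16 pages of 1048576 kB each give 16 GiB. A minimal sketch of that calculation, using a hypothetical helper name `hugepage_gib` that reads /proc/meminfo-style text from standard input:

```shell
# Print the reserved hugepage memory in GiB from /proc/meminfo-style input.
# hugepage_gib is a hypothetical helper name used for illustration only.
hugepage_gib() {
    awk '/^HugePages_Total:/ { pages = $2 }
         /^Hugepagesize:/    { size_kb = $2 }
         END { print pages * size_kb / 1048576 }'
}

# Usage on a live system:
#   hugepage_gib < /proc/meminfo
```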

Disable Nouveau#

Enter this command to disable nouveau:

$ cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF

Regenerate the kernel initramfs and reboot the system:

$ sudo update-initramfs -u
$ sudo reboot
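After the reboot, it can be useful to confirm that the nouveau module is no longer loaded. The source does not include this check; the following is a sketch, with `nouveau_loaded` as a hypothetical helper name that inspects `lsmod` output from standard input.

```shell
# Succeed (exit 0) if a module named exactly "nouveau" appears in lsmod output.
# nouveau_loaded is a hypothetical helper name used for illustration only.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on a live system:
#   if lsmod | nouveau_loaded; then echo "nouveau is still loaded"; fi
```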

Install Dependency Packages#

Enter these commands to install the prerequisite packages:

$ sudo apt-get update
$ sudo apt-get install -y build-essential linux-headers-$(uname -r) dkms unzip linuxptp pv

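Install RSHIM and Mellanox Firmware Tools on the Host#

Note

As of release 23-4, Aerial uses the Mellanox inbox driver instead of MOFED. If MOFED is installed on the system, it must be removed.

Check whether MOFED is installed on the host system.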

$ ofed_info -s
MLNX_OFED_LINUX-23.07-0.5.0.0:

If MOFED is present, uninstall it.

$ sudo /usr/sbin/ofed_uninstall.sh

Enter the following commands to install the rshim driver.

# Install rshim
$ wget https://developer.nvidia.com/downloads/networking/secure/doca-sdk/DOCA_2.7/doca-host_2.7.0-209000-24.04-ubuntu2204_amd64.deb
$ sudo dpkg -i doca-host_2.7.0-209000-24.04-ubuntu2204_amd64.deb
$ sudo apt-get update
$ sudo apt install rshim

Enter the following commands to install Mellanox Firmware Tools.

# Install Mellanox Firmware Tools
$ export MFT_VERSION=4.28.0-92
$ wget https://www.mellanox.com/downloads/MFT/mft-$MFT_VERSION-x86_64-deb.tgz
$ tar xvf mft-$MFT_VERSION-x86_64-deb.tgz
$ sudo mft-$MFT_VERSION-x86_64-deb/install.sh

# Verify the installed Mellanox Firmware Tools version
$ sudo mst version
mst, mft 4.28.0-92, built on Apr 25 2024, 15:22:58. Git SHA Hash: N/A

$ sudo mst start

# check NIC PCIe bus addresses and network interface names
$ sudo mst status -v

# Here is the result for the BF3 NIC in slot 7
MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE             MST                           PCI       RDMA            NET                                     NUMA
BlueField3(rev:1)       /dev/mst/mt41692_pciconf0.1   ca:00.1   mlx5_1          net-aerial01                            1

BlueField3(rev:1)       /dev/mst/mt41692_pciconf0     ca:00.0   mlx5_0          net-aerial00                            1
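The device table can be reduced to the three values used later in this chapter: interface name, PCI address, and MST device path. The sketch below assumes the column order shown in the sample output (DEVICE_TYPE, MST, PCI, RDMA, NET, NUMA); `bf3_interfaces` is a hypothetical helper name.

```shell
# Print "<interface> <PCI address> <MST device>" for each BlueField-3 row
# of `mst status -v` output read from stdin. The "net-" prefix is removed.
# bf3_interfaces is a hypothetical helper name used for illustration only.
bf3_interfaces() {
    awk '$1 ~ /^BlueField3/ { sub(/^net-/, "", $5); print $5, $3, $2 }'
}

# Usage on a live system:
#   sudo mst status -v | bf3_interfaces
```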

Enter these commands to check the link status of port 0:

# Here is an example if port 0 is connected to another server via a 200GbE DAC cable.

$ sudo mlxlink -d /dev/mst/mt41692_pciconf0

Operational Info
----------------
State                              : Active
Physical state                     : LinkUp
Speed                              : 200G
Width                              : 4x
FEC                                : Standard_RS-FEC - (544,514)
Loopback Mode                      : No Loopback
Auto Negotiation                   : ON

Supported Info
--------------
Enabled Link Speed (Ext.)          : 0x00003ff2 (200G_2X,200G_4X,100G_1X,100G_2X,100G_4X,50G_1X,50G_2X,40G,25G,10G,1G)
Supported Cable Speed (Ext.)       : 0x000017f2 (200G_4X,100G_2X,100G_4X,50G_1X,50G_2X,40G,25G,10G,1G)

Troubleshooting Info
--------------------
Status Opcode                      : 0
Group Opcode                       : N/A
Recommendation                     : No issue was observed

Tool Information
----------------
Firmware Version                   : 32.41.1000
amBER Version                      : 3.2
MFT Version                        : mft 4.28.0-92
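For scripted health checks, the three fields that matter most in this report (State, Physical state, Speed) can be tested directly. This is a sketch that assumes the plain "Name : Value" layout shown above; `link_is_up` is a hypothetical helper name, and real `mlxlink` output may differ between MFT versions.

```shell
# Succeed only if mlxlink output on stdin reports an active, linked-up port
# running at the expected speed (for example 200G).
# link_is_up is a hypothetical helper name used for illustration only.
link_is_up() {
    expected_speed="$1"
    awk -F' *: *' -v want="$expected_speed" '
        { sub(/[ \t\r]+$/, "") }                 # drop trailing whitespace
        $1 == "State"          { state = $2 }
        $1 == "Physical state" { phys = $2 }
        $1 == "Speed"          { speed = $2 }
        END {
            ok = (state == "Active" && phys == "LinkUp" && speed == want)
            exit (ok ? 0 : 1)
        }'
}

# Usage on a live system:
#   sudo mlxlink -d /dev/mst/mt41692_pciconf0 | link_is_up 200G && echo "link OK"
```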

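Install Docker CE#

The full official instructions for installing Docker CE can be found on the Docker website: https://docs.docker.net.cn/engine/install/ubuntu/#install-docker-engine. The following instructions are one supported way of installing Docker CE.

Warning

To work correctly, the CUDA driver must be installed before Docker CE or nvidia-container-toolkit. Install the CUDA driver first, then install Docker CE or nvidia-container-toolkit.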

$ sudo apt-get update
$ sudo apt-get install -y ca-certificates curl gnupg
$ sudo install -m 0755 -d /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ sudo chmod a+r /etc/apt/keyrings/docker.gpg
$ echo \
    "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
    "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update
$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
$ sudo docker run --rm hello-world

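Update the BF3 BFB Image and NIC Firmware#

Note

  • The following instructions are specifically for the BF3 NIC (OPN: 900-9D3B6-00CV-A; PSID: MT_0000000884).

  • There is no need to switch to DPU mode if the BFB image below is used.

  • This BFB image updates the NIC firmware automatically.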

# Enable MST
$ sudo mst start
$ sudo mst status

MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded

MST devices:
------------
/dev/mst/mt41692_pciconf0        - PCI configuration cycles access.
                                domain:bus:dev.fn=0000:ca:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
                                Chip revision is: 01


# Download the BF3 BFB image
$ wget https://content.mellanox.com/BlueField/BFBs/Ubuntu22.04/bf-bundle-2.7.0-33_24.04_ubuntu-22.04_prod.bfb
# Here is the command to flash BFB image. NOTE: If there are multiple BF3 NICs, repeat the same command with rshim<0..N-1>. N is the number of BF3 NICs.
$ sudo bfb-install -r rshim0 -b bf-bundle-2.7.0-33_24.04_ubuntu-22.04_prod.bfb

Pushing bfb
1.41GiB 0:01:24 [17.1MiB/s] [                                                                                         <=>]
Collecting BlueField booting status. Press Ctrl+C to stop…
INFO[PSC]: PSC BL1 START
INFO[BL2]: start
INFO[BL2]: boot mode (rshim)
INFO[BL2]: VDDQ adjustment complete
INFO[BL2]: VDDQ: 1120 mV
INFO[BL2]: DDR POST passed
INFO[BL2]: UEFI loaded
INFO[BL31]: start
INFO[BL31]: lifecycle GA Secured
INFO[BL31]: VDD: 851 mV
ERR[BL31]: MB timeout
INFO[BL31]: runtime
INFO[UEFI]: eMMC init
INFO[UEFI]: eMMC probed
INFO[UEFI]: UPVS valid
INFO[UEFI]: PMI: updates started
INFO[UEFI]: PMI: total updates: 1
INFO[UEFI]: PMI: updates completed, status 0
INFO[UEFI]: PCIe enum start
INFO[UEFI]: PCIe enum end
INFO[UEFI]: UEFI Secure Boot (enabled)
INFO[UEFI]: Redfish enabled
INFO[BL31]: Partial NIC
INFO[BL31]: power capping disabled
INFO[UEFI]: exit Boot Service
INFO[MISC]: Ubuntu installation started
INFO[MISC]: Installing OS image
INFO[MISC]: Ubuntu installation completed
WARN[MISC]: Skipping BMC components upgrade.
INFO[MISC]: Updating NIC firmware...
INFO[MISC]: NIC firmware update done
INFO[MISC]: Installation finished

# Wait 10 minutes to ensure the card initializes properly after the BFB installation
$ sleep 600

# NOTE: Requires a full power cycle from host with cold boot

# Verify NIC FW version after reboot
$ sudo mst start
$ sudo flint -d /dev/mst/mt41692_pciconf0 q
Image type:            FS4
FW Version:            32.41.1000
FW Release Date:       28.4.2024
Product Version:       32.41.1000
Rom Info:              type=UEFI Virtio net version=21.4.13 cpu=AMD64,AARCH64
                       type=UEFI Virtio blk version=22.4.13 cpu=AMD64,AARCH64
                       type=UEFI version=14.34.12 cpu=AMD64,AARCH64
                       type=PXE version=3.7.400 cpu=AMD64
Description:           UID                GuidsNumber
Base GUID:             946dae0300f5aa8e        38
Base MAC:              946daef5aa8e            38
Image VSD:             N/A
Device VSD:            N/A
PSID:                  MT_0000000884
Security Attributes:   secure-fw
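The firmware version can be pulled out of the query output for comparison against the expected release. A minimal sketch, with `fw_version` as a hypothetical helper name; it assumes the "FW Version:" line format shown above.

```shell
# Print the firmware version from `flint ... q` output read from stdin.
# fw_version is a hypothetical helper name used for illustration only.
fw_version() {
    awk '/^FW Version:/ { print $3 }'
}

# Usage on a live system:
#   [ "$(sudo flint -d /dev/mst/mt41692_pciconf0 q | fw_version)" = "32.41.1000" ] \
#       || echo "unexpected NIC firmware version"
```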

Run the following commands to configure the BF3 NIC:

# Setting BF3 port to Ethernet mode (not Infiniband)
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set LINK_TYPE_P1=2
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set LINK_TYPE_P2=2

$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set INTERNAL_CPU_MODEL=1
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set INTERNAL_CPU_PAGE_SUPPLIER=EXT_HOST_PF
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set INTERNAL_CPU_ESWITCH_MANAGER=EXT_HOST_PF
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set INTERNAL_CPU_IB_VPORT0=EXT_HOST_PF
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set INTERNAL_CPU_OFFLOAD_ENGINE=DISABLED

$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set CQE_COMPRESSION=1
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set PROG_PARSE_GRAPH=1
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set ACCURATE_TX_SCHEDULER=1
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set FLEX_PARSER_PROFILE_ENABLE=4
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set REAL_TIME_CLOCK_ENABLE=1

$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set EXP_ROM_VIRTIO_NET_PXE_ENABLE=0
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set EXP_ROM_VIRTIO_NET_UEFI_ARM_ENABLE=0
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set EXP_ROM_VIRTIO_NET_UEFI_x86_ENABLE=0
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set EXP_ROM_VIRTIO_BLK_UEFI_ARM_ENABLE=0
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 --yes set EXP_ROM_VIRTIO_BLK_UEFI_x86_ENABLE=0

# NOTE: Requires a full power cycle from host with cold boot

# Verify that the NIC FW changes have been applied
$ sudo mlxconfig -d /dev/mst/mt41692_pciconf0 q | grep "CQE_COMPRESSION\|PROG_PARSE_GRAPH\|ACCURATE_TX_SCHEDULER\|FLEX_PARSER_PROFILE_ENABLE\|REAL_TIME_CLOCK_ENABLE\|INTERNAL_CPU_MODEL\|LINK_TYPE_P1\|LINK_TYPE_P2\|INTERNAL_CPU_PAGE_SUPPLIER\|INTERNAL_CPU_ESWITCH_MANAGER\|INTERNAL_CPU_IB_VPORT0\|INTERNAL_CPU_OFFLOAD_ENGINE"
        INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)
        INTERNAL_CPU_PAGE_SUPPLIER          EXT_HOST_PF(1)
        INTERNAL_CPU_ESWITCH_MANAGER        EXT_HOST_PF(1)
        INTERNAL_CPU_IB_VPORT0              EXT_HOST_PF(1)
        INTERNAL_CPU_OFFLOAD_ENGINE         DISABLED(1)
        FLEX_PARSER_PROFILE_ENABLE          4
        PROG_PARSE_GRAPH                    True(1)
        ACCURATE_TX_SCHEDULER               True(1)
        CQE_COMPRESSION                     AGGRESSIVE(1)
        REAL_TIME_CLOCK_ENABLE              True(1)
        LINK_TYPE_P1                        ETH(2)
        LINK_TYPE_P2                        ETH(2)
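Instead of reading the filtered output by eye, the expected values can be checked programmatically. The sketch below assumes the two-column "NAME VALUE" layout shown in the filtered output above; `mlxconfig_mismatches` is a hypothetical helper name, and the NAME=VALUE pairs are supplied by the caller.

```shell
# Read `mlxconfig ... q` output on stdin and print every expected setting
# that is missing or differs. Arguments are NAME=VALUE pairs, where VALUE is
# the text as displayed in the query output, for example ETH(2).
# mlxconfig_mismatches is a hypothetical helper name used for illustration only.
mlxconfig_mismatches() {
    query_output=$(cat)
    for pair in "$@"; do
        name=${pair%%=*}
        want=${pair#*=}
        got=$(printf '%s\n' "$query_output" | awk -v n="$name" '$1 == n { print $2 }')
        [ "$got" = "$want" ] || echo "$name: expected $want, got ${got:-<missing>}"
    done
}

# Usage on a live system:
#   sudo mlxconfig -d /dev/mst/mt41692_pciconf0 q | mlxconfig_mismatches \
#       'LINK_TYPE_P1=ETH(2)' 'LINK_TYPE_P2=ETH(2)' 'REAL_TIME_CLOCK_ENABLE=True(1)'
```

Empty output means all listed settings match.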

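Set Persistent NIC Interface Names#

Configure the network link files so that the NIC interfaces always come up with the same names. Run lshw -c network -businfo to find the current interface name at the target bus address, then run ip link to find the corresponding MAC address by interface name. After identifying the MAC address, create a file at /etc/systemd/network/NN-persistent-net.link with the following information: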

[Match]
MACAddress={{item.mac}}

[Link]
Name={{item.name}}

The following network link files set converged accelerator port #0 to aerial00 and port #1 to aerial01:

$ sudo nano /etc/systemd/network/11-persistent-net.link

# Update the MAC address to match the converged accelerator port 0 MAC address
[Match]
MACAddress=48:b0:2d:xx:xx:xx

[Link]
Name=aerial00

$ sudo nano /etc/systemd/network/12-persistent-net.link

# Update the MAC address to match the converged accelerator port 1 MAC address
[Match]
MACAddress=48:b0:2d:yy:yy:yy

[Link]
Name=aerial01

Reboot the system after creating these files.
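The link files can also be generated rather than typed in an editor. This is a sketch of the template in this section; `render_link_file` is a hypothetical helper name, and the MAC address in the usage line is the same placeholder used in the example files.

```shell
# Print the contents of a systemd .link file that pins an interface name
# to a MAC address. $1 is the MAC address, $2 is the interface name.
# render_link_file is a hypothetical helper name used for illustration only.
render_link_file() {
    mac="$1"
    name="$2"
    printf '[Match]\nMACAddress=%s\n\n[Link]\nName=%s\n' "$mac" "$name"
}

# Usage on a live system (replace the placeholder MAC address first):
#   render_link_file 48:b0:2d:xx:xx:xx aerial00 | sudo tee /etc/systemd/network/11-persistent-net.link
```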

Install ptp4l and phc2sys#

Enter these commands to configure PTP4L, assuming the aerial00 NIC interface and CPU core 41 are used for PTP:

$ cat <<EOF | sudo tee /etc/ptp.conf
[global]
dataset_comparison              G.8275.x
G.8275.defaultDS.localPriority  128
maxStepsRemoved                 255
logAnnounceInterval             -3
logSyncInterval                 -4
logMinDelayReqInterval          -4
G.8275.portDS.localPriority     128
network_transport               L2
domainNumber                    24
tx_timestamp_timeout            30
# When used as an RU and PTP master, set slaveOnly to 0
slaveOnly 0

clock_servo pi
step_threshold 1.0
egressLatency 28
pi_proportional_const 4.65
pi_integral_const 0.1

[aerial00]
announceReceiptTimeout 3
delay_mechanism E2E
network_transport L2
EOF

$ cat <<EOF | sudo tee /lib/systemd/system/ptp4l.service
[Unit]
Description=Precision Time Protocol (PTP) service
Documentation=man:ptp4l
After=network.target

[Service]
Restart=always
RestartSec=5s
Type=simple
ExecStartPre=ifconfig aerial00 up
ExecStartPre=ethtool --set-priv-flags aerial00 tx_port_ts on
ExecStartPre=ethtool -A aerial00 rx off tx off
ExecStartPre=ifconfig aerial01 up
ExecStartPre=ethtool --set-priv-flags aerial01 tx_port_ts on
ExecStartPre=ethtool -A aerial01 rx off tx off
ExecStart=taskset -c 41 /usr/sbin/ptp4l -f /etc/ptp.conf

[Install]
WantedBy=multi-user.target
EOF

$ sudo systemctl daemon-reload
$ sudo systemctl restart ptp4l.service
$ sudo systemctl enable ptp4l.service

One server becomes the master clock, as shown below:

$ sudo systemctl status ptp4l.service

• ptp4l.service - Precision Time Protocol (PTP) service
     Loaded: loaded (/lib/systemd/system/ptp4l.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-08-08 19:37:56 UTC; 2 weeks 3 days ago
       Docs: man:ptp4l
   Main PID: 1120 (ptp4l)
      Tasks: 1 (limit: 94533)
     Memory: 460.0K
        CPU: 9min 8.089s
     CGroup: /system.slice/ptp4l.service
             └─1120 /usr/sbin/ptp4l -f /etc/ptp.conf

Aug 09 18:12:35 aerial-devkit taskset[1120]: ptp4l[81287.043]: selected local clock b8cef6.fffe.d333be as best master
Aug 09 18:12:35 aerial-devkit taskset[1120]: ptp4l[81287.043]: port 1: assuming the grand master role
Aug 11 20:44:51 aerial-devkit taskset[1120]: ptp4l[263223.379]: timed out while polling for tx timestamp
Aug 11 20:44:51 aerial-devkit taskset[1120]: ptp4l[263223.379]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
Aug 11 20:44:51 aerial-devkit taskset[1120]: ptp4l[263223.379]: port 1: send sync failed
Aug 11 20:44:51 aerial-devkit taskset[1120]: ptp4l[263223.379]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Aug 11 20:45:07 aerial-devkit taskset[1120]: ptp4l[263239.522]: port 1: FAULTY to LISTENING on INIT_COMPLETE
Aug 11 20:45:08 aerial-devkit taskset[1120]: ptp4l[263239.963]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
Aug 11 20:45:08 aerial-devkit taskset[1120]: ptp4l[263239.963]: selected local clock b8cef6.fffe.d333be as best master
Aug 11 20:45:08 aerial-devkit taskset[1120]: ptp4l[263239.963]: port 1: assuming the grand master role

The other becomes the secondary, follower clock, as shown below:

$ sudo systemctl status ptp4l.service

• ptp4l.service - Precision Time Protocol (PTP) service
     Loaded: loaded (/lib/systemd/system/ptp4l.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-08-22 16:25:41 UTC; 3 days ago
       Docs: man:ptp4l
   Main PID: 3251 (ptp4l)
      Tasks: 1 (limit: 598810)
     Memory: 472.0K
        CPU: 2min 48.984s
     CGroup: /system.slice/ptp4l.service
             └─3251 /usr/sbin/ptp4l -f /etc/ptp.conf

Aug 25 19:58:34 aerial-r750 taskset[3251]: ptp4l[272004.187]: rms    8 max   15 freq -14495 +/-   9 delay    11 +/-   0
Aug 25 19:58:35 aerial-r750 taskset[3251]: ptp4l[272005.187]: rms    6 max   12 freq -14480 +/-   7 delay    11 +/-   1
Aug 25 19:58:36 aerial-r750 taskset[3251]: ptp4l[272006.187]: rms    8 max   12 freq -14465 +/-   5 delay    10 +/-   0
Aug 25 19:58:37 aerial-r750 taskset[3251]: ptp4l[272007.187]: rms   11 max   18 freq -14495 +/-  10 delay    11 +/-   1
Aug 25 19:58:38 aerial-r750 taskset[3251]: ptp4l[272008.187]: rms   12 max   21 freq -14515 +/-   7 delay    12 +/-   1
Aug 25 19:58:39 aerial-r750 taskset[3251]: ptp4l[272009.187]: rms    7 max   12 freq -14488 +/-   7 delay    12 +/-   1
Aug 25 19:58:40 aerial-r750 taskset[3251]: ptp4l[272010.187]: rms    7 max   12 freq -14479 +/-   7 delay    11 +/-   1
Aug 25 19:58:41 aerial-r750 taskset[3251]: ptp4l[272011.187]: rms   10 max   20 freq -14503 +/-  11 delay    11 +/-   1
Aug 25 19:58:42 aerial-r750 taskset[3251]: ptp4l[272012.188]: rms   10 max   20 freq -14520 +/-   7 delay    13 +/-   1
Aug 25 19:58:43 aerial-r750 taskset[3251]: ptp4l[272013.188]: rms    2 max    7 freq -14510 +/-   4 delay    12 +/-   1

Enter these commands to turn off NTP:

$ sudo timedatectl set-ntp false
$ timedatectl
Local time: Thu 2022-02-03 22:30:58 UTC
           Universal time: Thu 2022-02-03 22:30:58 UTC
                 RTC time: Thu 2022-02-03 22:30:58
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: no
              NTP service: inactive
          RTC in local TZ: no

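Run PHC2SYS as a service:

PHC2SYS is used to synchronize the system clock to the PTP hardware clock (PHC) on the NIC.

Specify the network interface used for PTP, and specify the system clock as the slave clock.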

# If more than one instance is already running, kill the existing
# PHC2SYS sessions.

# Command used can be found in /lib/systemd/system/phc2sys.service
# Update the ExecStart line to the following
$ cat <<EOF | sudo tee /lib/systemd/system/phc2sys.service
[Unit]
Description=Synchronize system clock or PTP hardware clock (PHC)
Documentation=man:phc2sys
After=ntpdate.service
Requires=ptp4l.service
After=ptp4l.service

[Service]
Restart=always
RestartSec=5s
Type=simple
# Gives ptp4l a chance to stabilize
ExecStartPre=sleep 2
ExecStart=/bin/sh -c "taskset -c 41 /usr/sbin/phc2sys -s /dev/ptp$(ethtool -T aerial00| grep PTP | awk '{print $4}') -c CLOCK_REALTIME -n 24 -O 0 -R 256 -u 256"

[Install]
WantedBy=multi-user.target
EOF
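The ExecStart line above builds the /dev/ptpN path from the "PTP Hardware Clock" line of `ethtool -T`. The sketch below isolates that step so it can be checked on its own; `phc_index` is a hypothetical helper name wrapping the same `grep` and `awk` pipeline used in the service file.

```shell
# Print the PTP hardware clock index from `ethtool -T <iface>` output on stdin.
# The relevant line looks like "PTP Hardware Clock: 2", so the index is field 4.
# phc_index is a hypothetical helper name used for illustration only.
phc_index() {
    grep PTP | awk '{print $4}'
}

# Usage on a live system:
#   echo "/dev/ptp$(ethtool -T aerial00 | phc_index)"
```

If the interface has no hardware clock, the field is expected not to be a number, so it is worth checking the resulting path before starting the service.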

After changing the PHC2SYS configuration file, run the following commands:

$ sudo systemctl daemon-reload
$ sudo systemctl restart phc2sys.service

# Set to start automatically on reboot
$ sudo systemctl enable phc2sys.service

# check that the service is active and has converged to a low rms value (<30) and that the correct NIC has been selected (aerial00):
$ sudo systemctl status phc2sys.service
● phc2sys.service - Synchronize system clock or PTP hardware clock (PHC)
     Loaded: loaded (/lib/systemd/system/phc2sys.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2023-02-17 17:02:35 UTC; 7s ago
       Docs: man:phc2sys
   Main PID: 2225556 (phc2sys)
      Tasks: 1 (limit: 598864)
     Memory: 372.0K
     CGroup: /system.slice/phc2sys.service
             └─2225556 /usr/sbin/phc2sys -a -r -n 24 -R 256 -u 256

Feb 17 17:02:35 aerial-devkit phc2sys[2225556]: [1992363.445] reconfiguring after port state change
Feb 17 17:02:35 aerial-devkit phc2sys[2225556]: [1992363.445] selecting CLOCK_REALTIME for synchronization
Feb 17 17:02:35 aerial-devkit phc2sys[2225556]: [1992363.445] selecting aerial00 as the master clock
Feb 17 17:02:36 aerial-devkit phc2sys[2225556]: [1992364.457] CLOCK_REALTIME rms   15 max   37 freq -19885 +/- 116 delay  1944 +/-   6
Feb 17 17:02:37 aerial-devkit phc2sys[2225556]: [1992365.473] CLOCK_REALTIME rms   16 max   42 freq -19951 +/- 103 delay  1944 +/-   7
Feb 17 17:02:38 aerial-devkit phc2sys[2225556]: [1992366.490] CLOCK_REALTIME rms   13 max   31 freq -19909 +/-  81 delay  1944 +/-   6
Feb 17 17:02:39 aerial-devkit phc2sys[2225556]: [1992367.506] CLOCK_REALTIME rms    9 max   27 freq -19918 +/-  40 delay  1945 +/-   6
Feb 17 17:02:40 aerial-devkit phc2sys[2225556]: [1992368.522] CLOCK_REALTIME rms    8 max   24 freq -19925 +/-  11 delay  1945 +/-   9
Feb 17 17:02:41 aerial-devkit phc2sys[2225556]: [1992369.538] CLOCK_REALTIME rms    9 max   23 freq -19915 +/-  36 delay  1943 +/-   8
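The convergence criterion (rms below 30) can be checked from the journal rather than by eye. This is a sketch: `rms_below` is a hypothetical helper name, and it assumes each sample line contains the token "rms" followed by its value, as in the log lines above. The same parsing applies to ptp4l follower logs, although the source states no threshold for those.

```shell
# Succeed only if the log on stdin contains at least one "rms" sample and
# every sample is strictly below the given limit.
# rms_below is a hypothetical helper name used for illustration only.
rms_below() {
    limit="$1"
    awk -v limit="$limit" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "rms") {
                    seen = 1
                    if ($(i + 1) + 0 >= limit) bad = 1
                }
        }
        END { ok = (seen && !bad); exit (ok ? 0 : 1) }'
}

# Usage on a live system:
#   journalctl -u phc2sys.service -n 20 --no-pager | rms_below 30 && echo "converged"
```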

Verify that the system clock is synchronized:

$ timedatectl
Local time: Thu 2022-02-03 22:30:58 UTC
           Universal time: Thu 2022-02-03 22:30:58 UTC
                 RTC time: Thu 2022-02-03 22:30:58
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: inactive
          RTC in local TZ: no

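Set Up the Boot Configuration Service#

Create the /usr/local/bin directory and the /usr/local/bin/nvidia.sh file to run commands on every reboot. The nvidia-smi -lgc command in this script expects exactly one GPU device (-i 0); modify it if the system uses more than one GPU.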

$ cat <<"EOF" | sudo tee /usr/local/bin/nvidia.sh
#!/bin/bash

mst start

nvidia-smi -i 0 -lgc $(nvidia-smi -i 0 --query-supported-clocks=graphics --format=csv,noheader,nounits | sort -h | tail -n 1)
nvidia-smi -mig 0

echo -1 > /proc/sys/kernel/sched_rt_runtime_us
EOF
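The command substitution in nvidia.sh locks the GPU clock to the highest supported graphics clock: it lists the supported clocks, sorts them numerically, and keeps the last one. The sketch below shows only that selection step on sample values; `max_clock` is a hypothetical helper name, and the clock values in the usage comment are illustrative.

```shell
# Print the largest value from a list of clock frequencies (one per line).
# Uses the same `sort -h | tail -n 1` pipeline as nvidia.sh (GNU sort).
# max_clock is a hypothetical helper name used for illustration only.
max_clock() {
    sort -h | tail -n 1
}

# Usage with illustrative values:
#   printf '1410\n210\n1395\n' | max_clock
```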

Create a system service file so that the script is run after the network interfaces are up.

$ cat <<EOF | sudo tee /lib/systemd/system/nvidia.service
[Unit]
After=network.target

[Service]
ExecStart=/usr/local/bin/nvidia.sh

[Install]
WantedBy=default.target
EOF

Create a system service file for nvidia-persistenced so that it runs at startup.

Note

This file was created following the example in /usr/share/doc/NVIDIA_GLX-1.0/samples/nvidia-persistenced-init.tar.bz2.

$ cat <<EOF | sudo tee /lib/systemd/system/nvidia-persistenced.service
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target

[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

[Install]
WantedBy=multi-user.target
EOF

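Then set the file permissions, reload the systemd daemon, enable the services, restart the services when installing for the first time, and check their status: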

sudo chmod 744 /usr/local/bin/nvidia.sh
sudo chmod 664 /lib/systemd/system/nvidia.service
sudo chmod 664 /lib/systemd/system/nvidia-persistenced.service
sudo systemctl daemon-reload
sudo systemctl enable nvidia-persistenced.service
sudo systemctl enable nvidia.service
sudo systemctl restart nvidia.service
sudo systemctl restart nvidia-persistenced.service
sudo systemctl status nvidia.service
sudo systemctl status nvidia-persistenced.service

The output of the two status commands should look like this:

aerial@server:~$ sudo systemctl status nvidia.service
○ nvidia.service
     Loaded: loaded (/lib/systemd/system/nvidia.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Fri 2024-06-07 20:26:06 UTC; 2s ago
    Process: 251860 ExecStart=/usr/local/bin/nvidia.sh (code=exited, status=0/SUCCESS)
   Main PID: 251860 (code=exited, status=0/SUCCESS)
        CPU: 788ms

Jun 07 20:26:05 server nvidia.sh[251862]: Starting MST (Mellanox Software Tools) driver set
Jun 07 20:26:05 server nvidia.sh[251862]: Loading MST PCI module - Success
Jun 07 20:26:05 server nvidia.sh[251862]: [warn] mst_pciconf is already loaded, skipping
Jun 07 20:26:05 server nvidia.sh[251862]: Create devices
Jun 07 20:26:06 server nvidia.sh[251862]: Unloading MST PCI module (unused) - Success
Jun 07 20:26:06 server nvidia.sh[252732]: GPU clocks set to "(gpuClkMin 1410, gpuClkMax 1410)" for GPU 00000000:CF:00.0
Jun 07 20:26:06 server nvidia.sh[252732]: All done.
Jun 07 20:26:06 server nvidia.sh[252733]: Disabled MIG Mode for GPU 00000000:CF:00.0
Jun 07 20:26:06 server nvidia.sh[252733]: All done.
Jun 07 20:26:06 server systemd[1]: nvidia.service: Deactivated successfully.

aerial@server:~$ sudo systemctl status nvidia-persistenced.service
● nvidia-persistenced.service - NVIDIA Persistence Daemon
     Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2024-06-07 20:25:57 UTC; 3s ago
    Process: 251836 ExecStart=/usr/bin/nvidia-persistenced (code=exited, status=0/SUCCESS)
   Main PID: 251837 (nvidia-persiste)
      Tasks: 1 (limit: 598792)
     Memory: 672.0K
        CPU: 9ms
     CGroup: /system.slice/nvidia-persistenced.service
             └─251837 /usr/bin/nvidia-persistenced

Jun 07 20:25:57 server systemd[1]: Starting NVIDIA Persistence Daemon...
Jun 07 20:25:57 server nvidia-persistenced[251837]: Started (251837)
Jun 07 20:25:57 server systemd[1]: Started NVIDIA Persistence Daemon.