Slurm Node Setup

This section covers the configuration steps performed on the BCM head node.

  1. First, create the software image and category for the slogin nodes.

    cmsh
    
    [bcm10-headnode]% softwareimage
    [bcm10-headnode->softwareimage]% clone default-image slogin-image
    [bcm10-headnode->softwareimage*[slogin-image*]]% commit
    [bcm10-headnode->softwareimage[slogin-image]]% category
    [bcm10-headnode->category]% clone default slogin
    [bcm10-headnode->category*[slogin*]]% set softwareimage slogin-image
    [bcm10-headnode->category*[slogin*]]% commit
    
  2. Next, create the first slogin node.

    cmsh
    
    [bcm10-headnode]% device
    [bcm10-headnode->device]% clone node001 slogin-01
    [bcm10-headnode->device*[slogin-01*]]% set category slogin
    [bcm10-headnode->device*[slogin-01*]]% commit
    
  3. Add and configure the interfaces.

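    Note

    Interface names vary with the node's hardware vendor. Substitute the names reported on your own nodes for enp65s0np0 and ens1f1np1 in the commands below.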

    cmsh
    
    [bcm10-headnode]% device
    [bcm10-headnode->device]% use slogin-01
    [bcm10-headnode->device[slogin-01]]% interfaces
    [bcm10-headnode->device[slogin-01]->interfaces]% add bmc ipmi0 10.133.10.21 ipminet
    [bcm10-headnode->device*[slogin-01*]->interfaces*[ipmi0*]]% commit
    [bcm10-headnode->device[slogin-01]->interfaces]% add physical enp65s0np0; add physical ens1f1np1
    [bcm10-headnode->device*[slogin-01*]->interfaces*[ens1f1np1*]]% commit
    [bcm10-headnode->device[slogin-01]->interfaces]% add bond bond0 10.133.11.21 internalnet
    [bcm10-headnode->device*[slogin-01*]->interfaces*[bond0*]]% append interfaces enp65s0np0 ens1f1np1
    [bcm10-headnode->device*[slogin-01*]->interfaces*[bond0*]]% remove bootif
    [bcm10-headnode->device*[slogin-01*]->interfaces*[bond0*]]% ..
    [bcm10-headnode->device*[slogin-01*]->interfaces*]% ..
    [bcm10-headnode->device*[slogin-01*]]% set provisioninginterface bond0
    [bcm10-headnode->device*[slogin-01*]]% commit
    
  4. Clone the first slogin node to add the other management nodes. cpu-01 will serve as a hot spare for the other management nodes.

    cmsh
    
    [bcm10-headnode]% device
    [bcm10-headnode->device]% clone slogin-01 slogin-02 --next-ip
    [bcm10-headnode->device*[slogin-02*]]% commit
    [bcm10-headnode->device]% clone slogin-01 cpu-01 --next-ip
    [bcm10-headnode->device*[cpu-01*]]% set category default
    [bcm10-headnode->device*[cpu-01*]]% commit
    
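  5. Set the MAC addresses of each node so that the nodes can PXE boot.

    Repeat this step for slogin-02 and cpu-01, substituting each node's actual MAC addresses for the placeholder values shown.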

    cmsh
    
    [bcm10-headnode]% device
    [bcm10-headnode->device]% use slogin-01; interfaces
    [bcm10-headnode->device[slogin-01]->interfaces]% set enp65s0np0 mac 00:00:00:00:00:01
    [bcm10-headnode->device*[slogin-01*]->interfaces*]% set ens1f1np1 mac 00:00:00:00:00:02
    [bcm10-headnode->device*[slogin-01*]->interfaces*]% ..
    [bcm10-headnode->device*[slogin-01*]]% set mac 00:00:00:00:00:01
    [bcm10-headnode->device*[slogin-01*]]% commit
    
  6. Power on and provision the nodes.
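
Step 5 has to be repeated by hand for slogin-02 and cpu-01, which makes it easy to mistype a MAC address. The following is a minimal sketch, not part of BCM, that checks each address and prints the same cmsh commands used in step 5 as one semicolon-separated line per node. The helper name `mac_commands`, the `NODES` table, and every MAC address in it are hypothetical placeholders; the interface names are the ones used in this section and may differ on your hardware.

```python
import re

# Six colon-separated hex octets, e.g. 00:00:00:00:00:01
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def mac_commands(node, first_mac, second_mac,
                 first_if="enp65s0np0", second_if="ens1f1np1"):
    """Return the cmsh commands from step 5 for one node (hypothetical helper).

    The interface names default to the ones used in this section;
    replace them if your hardware names its interfaces differently.
    """
    for mac in (first_mac, second_mac):
        if not MAC_RE.match(mac):
            raise ValueError(f"not a valid MAC address: {mac!r}")
    return [
        "device",
        f"use {node}",
        "interfaces",
        f"set {first_if} mac {first_mac}",
        f"set {second_if} mac {second_mac}",
        "..",                       # back from interfaces to the device
        f"set mac {first_mac}",     # device MAC matches the first interface
        "commit",
    ]


# Placeholder values only: replace with the real MAC addresses of your nodes.
NODES = {
    "slogin-02": ("00:00:00:00:00:03", "00:00:00:00:00:04"),
    "cpu-01": ("00:00:00:00:00:05", "00:00:00:00:00:06"),
}

if __name__ == "__main__":
    for node, (mac_a, mac_b) in NODES.items():
        # One line per node, using the same ';' chaining seen in the sessions above.
        print("; ".join(mac_commands(node, mac_a, mac_b)))
```

Each printed line can be reviewed and then pasted at the cmsh prompt. The script deliberately does not invoke cmsh itself, so nothing is committed until the generated commands have been checked.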