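Troubleshooting Firmware Update Failures
Firmware Update Terminates Because a Component Was Not Found
When you use the motherboard firmware package to update the firmware on the GPU tray, the update stops and displays the following output message: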
...
{
"@odata.type": "#Message.v1_0_8.Message",
"Message": "Given PLDMBundle Status Message : Requested component was not found in the firmware bundle.",
"MessageArgs": [
"Requested component was not found in the firmware bundle."
],
"MessageId": "UpdateService.1.0.FwUpdateStatusMessage",
"Resolution": "None",
"Severity": "Warning"
},
...
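This message indicates that the firmware file specified with the -p argument of the nvfwupd command is not valid. Retry the update and specify the firmware file that matches the component. For example, for a GPU tray update, use the GPU firmware file whose name contains the string HGX. For firmware file names and components, refer to Version 25.01.1.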
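The file-name check described above can be done before the update is launched. The following is a minimal sketch, not part of nvfwupd: the function names, the "gpu_tray" target label, and the sample file name are hypothetical, and the only rule it encodes is the one stated above (a GPU tray firmware file contains the string HGX).

```python
def looks_like_gpu_tray_package(filename: str) -> bool:
    """Return True when the file name contains the string HGX (case-sensitive)."""
    return "HGX" in filename


def check_package_for_target(filename: str, target: str) -> None:
    """Raise ValueError when a GPU tray update is given a non-HGX file.

    `target` is a hypothetical label chosen for this sketch; only the
    value "gpu_tray" triggers the check.
    """
    if target == "gpu_tray" and not looks_like_gpu_tray_package(filename):
        raise ValueError(
            f"{filename!r} does not contain 'HGX'; "
            "it is probably not a GPU tray firmware file"
        )


if __name__ == "__main__":
    # Hypothetical file name, used only to show the failing case.
    try:
        check_package_for_target("nvfw_DGXB200_example.fwpkg", "gpu_tray")
    except ValueError as err:
        print(err)
```

The check is only a guard against the obvious mismatch. The firmware file names listed for the release remain the authoritative reference.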
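No Devices Detected for Handle ID 0
When you perform a firmware update using the Redfish API, the following output message indicates that the firmware file specified in the -F UpdateFile= argument is not the correct file for the component specified in the JSON file.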
...
{
"@odata.type": "#Message.v1_0_8.Message",
"Message": "Given PLDMBundle Status Message : No devices where detected for handle id 0.",
"MessageArgs": [
"No devices where detected for handle id 0"
],
"MessageId": "UpdateService.1.0.FwUpdateStatusMessage",
"Resolution": "None",
"Severity": "Warning"
},
...
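Retry the update and specify the firmware file that matches the component. For information about using the Redfish API, refer to Redfish API Support in the NVIDIA DGX B200 System User Guide.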
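When update output is collected by a script, the two status messages shown in this section can be recognized automatically. The following is a minimal sketch that assumes the message entries have already been parsed into Python dictionaries; the function name and the hint wording are illustrative, while the MessageId and message strings are copied from the output above.

```python
STATUS_MESSAGE_ID = "UpdateService.1.0.FwUpdateStatusMessage"

# Message fragments copied verbatim from the output shown in this section
# (including the firmware's own spelling), mapped to the documented action.
KNOWN_HINTS = {
    "Requested component was not found in the firmware bundle":
        "Firmware file does not match the component; retry with the matching file.",
    "No devices where detected for handle id 0":
        "UpdateFile is not the correct file for the component in the JSON file; "
        "retry with the matching file.",
}


def diagnose(messages):
    """Return one hint for each known firmware status message found."""
    hints = []
    for msg in messages:
        if msg.get("MessageId") != STATUS_MESSAGE_ID:
            continue
        text = msg.get("Message", "")
        for fragment, hint in KNOWN_HINTS.items():
            if fragment in text:
                hints.append(hint)
    return hints


if __name__ == "__main__":
    sample = [{
        "MessageId": STATUS_MESSAGE_ID,
        "Message": "Given PLDMBundle Status Message : "
                   "No devices where detected for handle id 0.",
        "Severity": "Warning",
    }]
    for hint in diagnose(sample):
        print(hint)
```

Messages with other identifiers, or with text that is not listed, are ignored rather than guessed at.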
Wait for Firmware Update Started ID
The output of an unsuccessful firmware update with the nvfwupd command might look like the following example:
FW recipe: ['nvfw_DGXB200_xxxx_xxxxxx.x.x.fwpkg']
{"@odata.type": "#UpdateService.v1_6_0.UpdateService", "Messages": [{"@odata.type": "#Message.v1_0_8.Message", "Message": "A new task /redfish/v1/TaskService/Tasks/4 was created.", "MessageArgs": ["/redfish/v1/TaskService/Tasks/4"], "MessageId": "Task.1.0.New", "Resolution": "None", "Severity": "OK"}, {"@odata.type": "#Message.v1_0_8.Message", "Message": "The action UpdateService.MultipartPush was submitted to do firmware update.", "MessageArgs": ["UpdateService.MultipartPush"], "MessageId": "UpdateService.1.0.StartFirmwareUpdate", "Resolution": "None", "Severity": "OK"}]}
FW update started, Task Id: 4
Wait for FirmwareUpdateStarted Id in Messages
Wait for FirmwareUpdateStarted Id in Messages
Task Message: Task /redfish/v1/UpdateService/upload has stopped due to an exception condition.
Firmware update failed, retry the firmware update
Retry the firmware update, as the command output indicates.
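If updates are driven from a script, the retry can be automated. The following is a minimal sketch under stated assumptions: run_update is a hypothetical callable, supplied by the caller, that starts the update and returns the captured command output as text; the attempt limit and delay are arbitrary; and output that lacks the failure line is treated as success, which is a simplification.

```python
import time
from typing import Callable

# Failure line copied from the example output above.
FAILURE_MARKER = "Firmware update failed, retry the firmware update"


def update_with_retry(
    run_update: Callable[[], str],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> int:
    """Call run_update until its output lacks the failure line.

    Returns the number of the attempt that succeeded and raises
    RuntimeError when every attempt reports the failure line.
    """
    for attempt in range(1, max_attempts + 1):
        output = run_update()
        if FAILURE_MARKER not in output:
            return attempt
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"firmware update still failing after {max_attempts} attempts")


if __name__ == "__main__":
    # Stand-in for a real update command: fails once, then succeeds.
    outputs = iter([FAILURE_MARKER, "FW update started, Task Id: 5"])
    print(update_with_retry(lambda: next(outputs)))
```

Bounding the number of attempts keeps a persistent fault, such as a mismatched firmware file, from looping indefinitely.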