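API Reference
OpenAPI Schema
The OpenAPI specification details the endpoints for NVIDIA NIM for VLMs:
/v1/models - Lists the available models
/v1/health/ready - Health check
/v1/health/live - Service liveness check
/v1/chat/completions - OpenAI-compatible chat endpoint
/inference/chat_completion - Llama Stack-compatible chat endpoint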
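For scripting against these endpoints, the list above can be captured in a small lookup table. The following is a minimal Python sketch, not part of NIM itself: the `ENDPOINTS` dict and the `endpoint_url` helper are hypothetical names, and the base URL is an assumption taken from the cURL examples in this section.

```python
from urllib.parse import urljoin

# Assumption: the server listens where the cURL examples in this section point.
BASE_URL = "http://0.0.0.0:8000"

# Paths copied from the endpoint list above; the keys are illustrative names.
ENDPOINTS = {
    "models": "/v1/models",
    "ready": "/v1/health/ready",
    "live": "/v1/health/live",
    "chat": "/v1/chat/completions",
    "llama_stack_chat": "/inference/chat_completion",
}


def endpoint_url(name: str, base_url: str = BASE_URL) -> str:
    """Return the full URL for a named endpoint (hypothetical helper)."""
    return urljoin(base_url, ENDPOINTS[name])
```

Keeping the paths in one place means a change of host or port only touches `BASE_URL`.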
API Examples
Use the examples in this section to get started with the API.
List Models
Use the following command to list the available models.
cURL Request
curl -X 'GET' 'http://0.0.0.0:8000/v1/models'
Response
{
  "object": "list",
  "data": [
    {
      "id": "meta/llama-3.2-11b-vision-instruct",
      "object": "model",
      "created": 1724796510,
      "owned_by": "system",
      "root": "meta/llama-3.2-11b-vision-instruct",
      "parent": null,
      "max_model_len": 131072,
      "permission": [
        {
          "id": "modelperm-c2e069f426cc43088eb408f388578289",
          "object": "model_permission",
          "created": 1724796510,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": false,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}
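A client usually needs only the model IDs from this response, since the `id` value is what the chat requests later in this section pass as `model`. The sketch below shows one way to pull them out in Python using only the standard library. The function names `model_ids` and `list_models` are hypothetical, and the default base URL is an assumption copied from the cURL example.

```python
import json
from urllib.request import urlopen


def model_ids(payload: dict) -> list[str]:
    """Extract the `id` of every entry in a /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str = "http://0.0.0.0:8000") -> list[str]:
    """Fetch /v1/models and return the model IDs (needs a running server)."""
    with urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return model_ids(json.load(resp))
```

Splitting the parsing from the HTTP call keeps `model_ids` usable on a saved response without a live server.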
Check Health
Use the following command to check the health of the server.
cURL Request
curl -X 'GET' 'http://0.0.0.0:8000/v1/health/ready'
Response
{
  "object": "health.response",
  "message": "Service is ready."
}
Check Service Liveness
Use the following command to check whether the service is live.
cURL Request
curl -X 'GET' 'http://0.0.0.0:8000/v1/health/live'
Response
{
  "object": "readyhealth.response",
  "message": "Service is live."
}
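The health and liveness endpoints are typically polled at startup, before any inference request is sent. The following Python sketch shows one possible polling loop. It assumes, without confirmation from this page, that a ready server answers with HTTP 200 and that an unready server either refuses the connection or returns an error status. The names `probe` and `wait_until_ready`, and the retry defaults, are hypothetical.

```python
import time
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200 (assumed success code)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses,
        # because urllib's URLError and HTTPError both derive from OSError.
        return False


def wait_until_ready(
    url: str = "http://0.0.0.0:8000/v1/health/ready",
    attempts: int = 30,
    delay_s: float = 2.0,
    check: Callable[[str], bool] = probe,
) -> bool:
    """Poll `url` until `check` succeeds or `attempts` runs out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # no sleep after the final failed attempt
    return False
```

The `check` parameter is injectable so the loop can be exercised without a server; passing the `/v1/health/live` URL instead polls liveness in the same way.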
OpenAI Chat Completion
Use the following command to query the OpenAI chat completion endpoint.
cURL Request
curl -X 'POST' \
  'http://0.0.0.0:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta/llama-3.2-11b-vision-instruct",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 256
  }'
Response
{
  "id": "chat-8c5f5115fa464ab593963d5764498350",
  "object": "chat.completion",
  "created": 1729020253,
  "model": "meta/llama-3.2-11b-vision-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This image shows a boardwalk in a field of tall grass. ..."
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 17,
    "total_tokens": 138,
    "completion_tokens": 121
  },
  "prompt_logprobs": null
}
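The same request can be issued from Python. The sketch below mirrors the cURL example using only the standard library: it builds the message with one text part and one `image_url` part, posts it, and reads the assistant text from `choices[0].message.content`. The function names `build_chat_request`, `first_reply`, and `chat` are hypothetical, and the base URL and 120-second timeout are assumptions.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, image_url: str, max_tokens: int = 256) -> dict:
    """Build a request body shaped like the cURL example above."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice in a chat.completion body."""
    return response["choices"][0]["message"]["content"]


def chat(payload: dict, base_url: str = "http://0.0.0.0:8000") -> dict:
    """POST the payload to /v1/chat/completions (needs a running server)."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

With a server running, `first_reply(chat(build_chat_request(model, prompt, url)))` would return the description text shown in the response above.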
Llama Stack Chat Completion
Use the following command to query the Llama Stack chat completion endpoint.
cURL Request
curl -X 'POST' \
  'http://0.0.0.0:8000/inference/chat_completion' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "meta/llama-3.2-11b-vision-instruct",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "image": {
              "uri": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          },
          "What is in this image?"
        ]
      }
    ]
  }'
Response
{
  "completion_message": {
    "role": "assistant",
    "content": "This image shows a boardwalk in a field of tall grass. ...",
    "stop_reason": "end_of_turn"
  },
  "logprobs": null
}
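The Llama Stack request differs from the OpenAI one in shape: the image is passed as an `{"image": {"uri": ...}}` object, the prompt is a bare string in the same `content` list, and the reply arrives under `completion_message` rather than `choices`. The Python sketch below illustrates that difference based only on the example above. The function names `build_llama_stack_request` and `reply_text` are hypothetical.

```python
def build_llama_stack_request(model: str, prompt: str, image_uri: str) -> dict:
    """Build a request body shaped like the Llama Stack cURL example above."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                # Image first, then the plain-string prompt, as in the example.
                "content": [{"image": {"uri": image_uri}}, prompt],
            }
        ],
    }


def reply_text(response: dict) -> str:
    """Return the assistant text from a Llama Stack chat completion body."""
    return response["completion_message"]["content"]
```

The resulting dict can be serialized with `json.dumps` and posted to `/inference/chat_completion` in the same way as any other JSON body.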