Multi-config API

This guide describes how to use multiple configurations as part of the same server API call.

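Motivation

When running a guardrails server, it is convenient to create *atomic configurations* that can be reused across multiple "complete" configurations. This guide uses the following sample configurations:

input_checking: uses the self-check input rail.
output_checking: uses the self-check output rail.
main: uses the gpt-3.5-turbo-instruct model with no guardrails.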

# Get rid of the TOKENIZERS_PARALLELISM warning
import warnings
warnings.filterwarnings('ignore')

Prerequisites

Install the openai package:

pip install openai

Set the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY=$OPENAI_API_KEY    # Replace with your own key
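Because the server calls the OpenAI API, a missing key usually only shows up as an error on the first request. The sketch below is one way to fail early instead. The helper name `require_env` is hypothetical and is not part of NeMo Guardrails.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; it only reads os.environ.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server.")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before starting the server would surface a missing key immediately.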

If you are running this code inside a notebook, patch the AsyncIO loop.

import nest_asyncio

nest_asyncio.apply()

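Setup

In this guide, the server is started programmatically, as shown below. This is equivalent to running the following command from the root of the project: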

nemoguardrails server --config=examples/server_configs/atomic

import os
from nemoguardrails.server.api import app
from threading import Thread
import uvicorn

def run_server():
    # os.getcwd() replaces the IPython-only %pwd magic, so this also runs outside a notebook
    current_path = os.getcwd()
    app.rails_config_path = os.path.normpath(os.path.join(current_path, "..", "..", "..", "examples", "server_configs", "atomic"))

    uvicorn.run(app, host="127.0.0.1", port=8000, log_level="info")

# Start the server in a separate thread so that you can still use the notebook
thread = Thread(target=run_server)
thread.start()
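Since the server starts in a background thread, a request sent immediately afterwards may arrive before it is listening. A minimal sketch of one way to wait for it, using only the standard library, is shown below. The helper name `wait_for_port` and its default timings are assumptions for illustration, not part of the guide's setup.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout elapses.

    Returns True once the port accepts connections, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or connection refused); retry after a short pause.
            time.sleep(interval)
    return False
```

For the server above, `wait_for_port("127.0.0.1", 8000)` could be called right after `thread.start()`. It only checks that the port accepts connections, not that the configurations have finished loading.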

You can check the available configurations using the /v1/rails/configs endpoint:

import requests

base_url = "http://127.0.0.1:8000"

response = requests.get(f"{base_url}/v1/rails/configs")
print(response.json())

[{'id': 'output_checking'}, {'id': 'main'}, {'id': 'input_checking'}]
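The listing is a plain list of objects with an `id` field, so it can be checked before sending a request that combines several configurations. The following is a small sketch under that assumption. The function name `missing_configs` is hypothetical.

```python
from typing import Dict, List


def missing_configs(available: List[Dict[str, str]], required: List[str]) -> List[str]:
    """Return the required config IDs that are absent from the server listing.

    `available` is assumed to have the shape shown above: [{'id': ...}, ...].
    """
    available_ids = {entry["id"] for entry in available}
    return [config_id for config_id in required if config_id not in available_ids]
```

For example, `missing_configs(response.json(), ["main", "input_checking"])` returns an empty list when both configurations are served.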

You can make a call using a single configuration as shown below:

response = requests.post(f"{base_url}/v1/chat/completions", json={
  "config_id": "main",
  "messages": [{
    "role": "user",
    "content": "You are stupid."
  }]
})
print(response.json())

To use multiple configurations, you must use the config_ids field instead of config_id in the request body, as shown below:

response = requests.post(f"{base_url}/v1/chat/completions", json={
  "config_ids": ["main", "input_checking"],
  "messages": [{
    "role": "user",
    "content": "You are stupid."
  }]
})
print(response.json())

{'messages': [{'role': 'assistant', 'content': "I'm sorry, I can't respond to that."}]}

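As you can see, in the first call the LLM engaged with the user's request. It did refuse to respond, but ideally the request should not reach the LLM at all. In the second call, the input rail kicked in and blocked the request.

Conclusion

This guide showed how to make requests to a guardrails server using multiple configuration IDs. This is useful in a variety of cases, and it encourages reusing atomic configurations across multiple complete configurations without duplicating code.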
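The two request bodies differ only in whether they use `config_id` or `config_ids`, so the choice can be wrapped in a small helper. This is a sketch rather than part of the server API. The name `build_chat_payload` is hypothetical, and treating a plain string as a single configuration is a design choice made here for illustration.

```python
from typing import Any, Dict, Sequence, Union


def build_chat_payload(configs: Union[str, Sequence[str]], user_message: str) -> Dict[str, Any]:
    """Build a /v1/chat/completions request body for one or several configs.

    A plain string maps to the `config_id` field; a sequence of IDs maps to
    `config_ids`, mirroring the two request bodies shown above.
    """
    messages = [{"role": "user", "content": user_message}]
    if isinstance(configs, str):
        return {"config_id": configs, "messages": messages}
    config_list = list(configs)
    if not config_list:
        raise ValueError("At least one config ID is required.")
    return {"config_ids": config_list, "messages": messages}
```

The result can be passed directly as the `json` argument, for example `requests.post(f"{base_url}/v1/chat/completions", json=build_chat_payload(["main", "input_checking"], "You are stupid."))`.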