Chat Completions - YouRouter Developer Platform

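YouRouter offers two main ways to use chat models:

OpenAI-compatible API: suited to the vast majority of use cases; a single unified interface calls different models.
Upstream provider native APIs: for advanced scenarios that need provider-specific capabilities the unified interface does not expose.

For guidance on choosing an upstream provider and model, see the routing guide.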

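OpenAI-compatible API

This is the simplest and most flexible way to use YouRouter. You can keep using the familiar OpenAI SDK and switch between models and upstream providers with minimal code changes.

Basic usage

The example below sends a basic chat completion request. Change the model parameter and the vendor request header to switch between models and upstream providers.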

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["YOUROUTER_API_KEY"],
    base_url="https://api.yourouter.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    extra_headers={"vendor": "openai"}
)

print(response.choices[0].message.content)

Advanced features

Multi-turn conversations

To keep a conversation going, include the full conversation history in the messages array.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["YOUROUTER_API_KEY"],
    base_url="https://api.yourouter.ai/v1"
)

messages = [
    {"role": "system", "content": "You are a witty assistant that tells jokes."},
    {"role": "user", "content": "Tell me a joke about computers."},
    {"role": "assistant", "content": "Why did the computer keep sneezing? It had a virus!"},
    {"role": "user", "content": "That was a good one. Tell me another."}
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)

print(response.choices[0].message.content)
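
In a live conversation the history grows with every exchange. The sketch below shows one way to manage that, assuming the history is a plain Python list. The names chat_turn and fake_send are hypothetical helpers introduced only for this illustration; fake_send stands in for a real API call so the example runs offline.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def chat_turn(history: List[Message], user_text: str,
              send: Callable[[List[Message]], str]) -> str:
    """Append the user message, call the model, and record its reply."""
    history.append({"role": "user", "content": user_text})
    reply = send(history)  # the model always sees the full history
    history.append({"role": "assistant", "content": reply})
    return reply


def fake_send(messages: List[Message]) -> str:
    """Offline stand-in for a real chat completion call (hypothetical)."""
    return f"(reply to: {messages[-1]['content']})"


history: List[Message] = [
    {"role": "system", "content": "You are a witty assistant that tells jokes."}
]
chat_turn(history, "Tell me a joke about computers.", fake_send)
chat_turn(history, "Tell me another.", fake_send)
print([m["role"] for m in history])
```

With the client from the example above, send could be a function that calls client.chat.completions.create(model="gpt-4o", messages=msgs) and returns choices[0].message.content. Because every turn resends the whole history, long conversations consume more input tokens per request.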

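Streaming responses

For real-time interactive scenarios such as chatbots, streaming returns output as it is generated. Pass stream=True in the request to enable it.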

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["YOUROUTER_API_KEY"],
    base_url="https://api.yourouter.ai/v1"
)

stream = client.chat.completions.create(
    model="claude-3-haiku-20240307",
    messages=[{"role": "user", "content": "Write a short poem about the ocean."}],
    stream=True,
    extra_headers={"vendor": "anthropic"}
)

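for chunk in stream:
    # Some providers emit chunks with an empty choices list, so guard the index.
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)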
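
Applications often need the complete text once streaming finishes, for example to store it in the conversation history. The following is a minimal sketch of collecting the deltas into one string. collect_stream and fake_chunk are hypothetical names; fake_chunk only imitates the attribute layout of an OpenAI SDK chunk (choices[0].delta.content) so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content may be None, e.g. on the final chunk
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Build an object shaped like a streaming chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk("The "),
    fake_chunk("ocean"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
]
full_text = collect_stream(fake_stream)
print(full_text)
```

Passing the stream object returned by client.chat.completions.create(..., stream=True) in place of fake_stream should work the same way, assuming the upstream provider returns OpenAI-style chunks.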

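Function calling / tool calling

You can let the model call tools or functions to interact with external systems. This is usually a multi-step process:

1. Send a request that includes the tool definitions.
2. The model responds with the tools it wants to call.
3. Your code actually executes those tools.
4. Send the tool results back to the model, which generates the final natural-language reply.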

import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["YOUROUTER_API_KEY"],
    base_url="https://api.yourouter.ai/v1"
)

def get_current_weather(location, unit="celsius"):
    """Get the current weather in a given location"""
    if "boston" in location.lower():
        return json.dumps({"location": "Boston", "temperature": "10", "unit": unit})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

messages = [{"role": "user", "content": "What's the weather like in Boston, MA?"}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)

response_message = response.choices[0].message
tool_calls = response_message.tool_calls

if tool_calls:
    available_functions = {
        "get_current_weather": get_current_weather,
    }
    messages.append(response_message)

    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)

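        # Fall back to defaults when the model omits an argument, so that
        # unit is never passed as None and location is always a string.
        function_response = function_to_call(
            location=function_args.get("location", ""),
            unit=function_args.get("unit", "celsius"),
        )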

        messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": function_response,
            }
        )

    second_response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )

    final_response = second_response.choices[0].message.content
    print(final_response)
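
The model may occasionally request a tool that does not exist or return arguments that are not valid JSON. The sketch below shows one defensive way to dispatch tool calls, returning an error payload to the model instead of raising. run_tool and registry are hypothetical names introduced for this illustration, and the error format is an assumption, not something the API prescribes.

```python
import json
from typing import Callable, Dict


def run_tool(registry: Dict[str, Callable[..., str]],
             name: str, raw_arguments: str) -> str:
    """Look up a tool by name and call it with JSON-decoded arguments."""
    func = registry.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(raw_arguments or "{}")
        return func(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # TypeError covers missing or unexpected keyword arguments; it also
        # catches a TypeError raised inside the tool itself.
        return json.dumps({"error": f"bad arguments for {name}: {exc}"})


def get_current_weather(location, unit="celsius"):
    """Toy tool mirroring the example above."""
    return json.dumps({"location": location, "temperature": "10", "unit": unit})


registry = {"get_current_weather": get_current_weather}
ok = run_tool(registry, "get_current_weather", '{"location": "Boston, MA"}')
missing = run_tool(registry, "get_stock_price", "{}")
print(ok)
print(missing)
```

In the loop above, run_tool(registry, tool_call.function.name, tool_call.function.arguments) could supply the content of each tool message, which lets the model see the error and recover.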

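Vision (multimodal completions)

Many models accept multimodal input, so you can attach images directly to a request. This is useful for image captioning, image analysis, and visual question answering. Models such as gpt-4o, claude-3-5-sonnet-20240620, and gemini-1.5-pro-latest support vision.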

import base64
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["YOUROUTER_API_KEY"],
    base_url="https://api.yourouter.ai/v1"
)

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

image_path = "image.jpg"
base64_image = encode_image(image_path)

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20240620",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}"
                    }
                }
            ]
        }
    ],
    max_tokens=300,
    extra_headers={"vendor": "anthropic"}
)

print(response.choices[0].message.content)
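
The example above hardcodes image/jpeg in the data URL. For other image formats, the MIME type can be derived from the file name. This is a minimal sketch using only the standard library; image_part is a hypothetical helper name, and whether a given model accepts a given image format is not covered here.

```python
import base64
import mimetypes


def image_part(filename: str, data: bytes) -> dict:
    """Build an image_url content part with a MIME type guessed from the name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognised image file name: {filename}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# The bytes here are a placeholder; in practice read them from the image file.
part = image_part("chart.png", b"\x89PNG")
print(part["image_url"]["url"].split(",", 1)[0])
```

The returned dictionary can be placed in the content list of a user message next to a text part, in the same position as the image_url entry in the example above.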

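Parameters

Parameter	Type	Description	Default
`model`	string	ID of the model to use.	Required
`messages`	array	The messages in the conversation so far.	Required
`max_tokens`	integer	Maximum number of tokens to generate in the reply.	`null`
`temperature`	number	Sampling temperature, typically between 0 and 2.	`1`
`top_p`	number	An alternative sampling control, known as nucleus sampling.	`1`
`n`	integer	Number of completions to generate for each input message.	`1`
`stream`	boolean	If enabled, returns partial message deltas incrementally, as ChatGPT does.	`false`
`stop`	string or array	Up to 4 stop sequences; generation stops when one is hit.	`null`
`presence_penalty`	number	Adjusts the probability of new tokens based on whether they have already appeared.	`0`
`frequency_penalty`	number	Adjusts the probability of new tokens based on how often they have appeared.	`0`
`logit_bias`	map	Adjusts the probability of specified tokens appearing.	`null`
`user`	string	Unique identifier for the end user, usable for monitoring and abuse detection.	`null`
`tool_choice`	string or object	Controls whether and how the model calls tools.	`none`
`tools`	array	List of tools the model may call.	`null`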
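
As an illustration of how several of these parameters combine, the sketch below assembles keyword arguments for a request and checks two of the constraints listed in the table (the temperature range and the limit of 4 stop sequences). build_request is a hypothetical helper, and the client-side validation is an assumption added for the example; the API performs its own checks.

```python
from typing import Any, Dict, List, Optional, Union


def build_request(model: str, messages: List[Dict[str, Any]], *,
                  temperature: Optional[float] = None,
                  max_tokens: Optional[int] = None,
                  stop: Optional[Union[str, List[str]]] = None,
                  stream: bool = False) -> Dict[str, Any]:
    """Assemble chat completion arguments, omitting unset optional values."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature should be between 0 and 2")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    request: Dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    # Leave out anything not set so that the server-side defaults apply.
    optional = (("temperature", temperature), ("max_tokens", max_tokens), ("stop", stop))
    for key, value in optional:
        if value is not None:
            request[key] = value
    return request


request = build_request(
    "gpt-4o",
    [{"role": "user", "content": "Name three primary colors."}],
    temperature=0.2,
    max_tokens=50,
    stop=["\n\n"],
)
print(sorted(request))
```

The result can be passed to the SDK as client.chat.completions.create(**request).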

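Upstream provider native APIs

For advanced scenarios that need provider-specific fields or capabilities not exposed by the unified OpenAI-compatible interface, you can call the upstream provider's native endpoint directly. In this case the vendor request header is required.

YouRouter forwards the entire request body, and every request header except Authorization, to the upstream provider unchanged. See request passthrough for details.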
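
To make the passthrough rule concrete, the sketch below models which headers would reach the upstream provider according to the description above. It is an illustration of the documented rule only, not YouRouter's actual implementation; forwarded_headers is a hypothetical function, and the case-insensitive match on the header name is an assumption.

```python
from typing import Dict


def forwarded_headers(incoming: Dict[str, str]) -> Dict[str, str]:
    """Return the headers that pass through: everything except Authorization."""
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() != "authorization"  # HTTP header names are case-insensitive
    }


incoming = {
    "Authorization": "Bearer <YOUROUTER_API_KEY>",
    "Content-Type": "application/json",
    "anthropic-version": "2023-06-01",
    "vendor": "anthropic",
}
print(forwarded_headers(incoming))
```

The practical consequence is that provider-specific headers such as anthropic-version must be set by the caller, since they are passed along as sent.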

Gemini (Google)

Generate Content

Endpoint: POST /v1/projects/cognition/locations/us/publishers/google/models/{model}:generateContent

import os
import requests
import json

url = "https://api.yourouter.ai/v1/projects/cognition/locations/us/publishers/google/models/gemini-1.5-pro-latest:generateContent"

headers = {
    "Authorization": f"Bearer {os.environ['YOUROUTER_API_KEY']}",
    "Content-Type": "application/json",
    "vendor": "google"
}

data = {
    "contents": [{
        "parts": [{"text": "Write a short story about a time-traveling historian."}]
    }]
}

response = requests.post(url, headers=headers, json=data)

print(json.dumps(response.json(), indent=2))
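
The example above prints the whole JSON response. To pull out only the generated text, the following sketch walks the candidates structure. It assumes the passed-through response keeps Google's generateContent shape (candidates, content, parts, and promptFeedback when a prompt is blocked); extract_text and the sample payload are hypothetical.

```python
def extract_text(payload: dict) -> str:
    """Concatenate the text parts of the first candidate in a response."""
    candidates = payload.get("candidates") or []
    if not candidates:
        # A blocked prompt typically comes back without candidates.
        reason = payload.get("promptFeedback", {}).get("blockReason", "unknown")
        raise ValueError(f"no candidates returned (blockReason: {reason})")
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


# Hand-written sample in the assumed response shape, not real model output.
sample = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [{"text": "Once upon "}, {"text": "a time."}],
            },
            "finishReason": "STOP",
        }
    ]
}
story = extract_text(sample)
print(story)
```

With the request above, extract_text(response.json()) would return the story text, provided the call succeeded and the response has this shape.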

Safety Settings

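You can configure content safety thresholds by adding safetySettings to the request. For the full list of categories and thresholds, see the official Google AI documentation.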

import os
import requests
import json

url = "https://api.yourouter.ai/v1/projects/cognition/locations/us/publishers/google/models/gemini-pro:generateContent"

headers = {
    "Authorization": f"Bearer {os.environ['YOUROUTER_API_KEY']}",
    "Content-Type": "application/json",
    "vendor": "google"
}

data = {
    "contents": [{"parts": [{"text": "Tell me a potentially controversial joke."}]}],
    "safetySettings": [
        {
            "category": "HARM_CATEGORY_HATE_SPEECH",
            "threshold": "BLOCK_LOW_AND_ABOVE"
        },
        {
            "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "threshold": "BLOCK_MEDIUM_AND_ABOVE"
        }
    ]
}

response = requests.post(url, headers=headers, json=data)

print(json.dumps(response.json(), indent=2))

Claude (Anthropic)

Messages API

Endpoint: POST /v1/messages

import os
import requests
import json

url = "https://api.yourouter.ai/v1/messages"

headers = {
    "Authorization": f"Bearer {os.environ['YOUROUTER_API_KEY']}",
    "Content-Type": "application/json",
    "anthropic-version": "2023-06-01",
    "vendor": "anthropic"
}

data = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Explain the concept of neural networks to a 5-year-old."}
    ]
}

response = requests.post(url, headers=headers, json=data)

print(json.dumps(response.json(), indent=2))

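Tool use with Claude

You can give Claude a set of tools, and it decides when to call them based on the user's request. The flow is still a multi-step conversation: your code executes the tool, then sends the result back to Claude. Below is a complete example of the tool-use lifecycle: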

import os
import requests
import json

def get_weather(location):
    if "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "15°C", "forecast": "Cloudy"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather in a given location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
]

messages = [{"role": "user", "content": "What is the weather like in San Francisco?"}]

initial_data = {
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "tools": tools,
    "messages": messages
}

response = requests.post(
    "https://api.yourouter.ai/v1/messages",
    headers={
        "Authorization": f"Bearer {os.environ['YOUROUTER_API_KEY']}",
        "Content-Type": "application/json",
        "anthropic-version": "2023-06-01",
        "vendor": "anthropic"
    },
    json=initial_data
)

response_data = response.json()

if response_data.get("stop_reason") == "tool_use":
    tool_use_block = next(
        (block for block in response_data["content"] if block.get("type") == "tool_use"), None
    )

    if tool_use_block:
        tool_name = tool_use_block["name"]
        tool_input = tool_use_block["input"]
        tool_use_id = tool_use_block["id"]

        if tool_name == "get_weather":
            tool_result = get_weather(tool_input.get("location", ""))

            messages.append({"role": "assistant", "content": response_data["content"]})
            messages.append({
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_use_id,
                        "content": tool_result,
                    }
                ],
            })

            final_data = {
                "model": "claude-3-opus-20240229",
                "max_tokens": 1024,
                "tools": tools,
                "messages": messages
            }

            final_response = requests.post(
                "https://api.yourouter.ai/v1/messages",
                headers={
                    "Authorization": f"Bearer {os.environ['YOUROUTER_API_KEY']}",
                    "Content-Type": "application/json",
                    "anthropic-version": "2023-06-01",
                    "vendor": "anthropic"
                },
                json=final_data
            ).json()

            final_text = next(
                (block["text"] for block in final_response["content"] if block.get("type") == "text"),
                "No final text response found."
            )
            print(final_text)

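Best practices

Routing: in production, use auto for higher availability. Pin an upstream provider only when you need a fixed model version or a provider-specific capability. See the routing guide.
Error handling: network problems and upstream provider outages can both happen. Implement robust error handling and retries with exponential backoff, especially for long-running tasks.
Streaming: enable streaming for any user-facing interactive application to improve responsiveness and the user experience.
System prompts: a well-written system prompt strongly shapes the model's behavior, tone, and style. Test and refine it continuously.
Token management: keep track of token limits for both input context and generated output, and use the usage information in the response to monitor cost and truncation risk.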
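
The retry advice above can be sketched as a small wrapper that doubles the wait after each failure and adds random jitter. This is a minimal illustration under stated assumptions, not a prescribed implementation: with_backoff, its parameters, and the default delays are all hypothetical, and flaky simulates an unreliable upstream so the example runs offline.

```python
import random
import time


def with_backoff(call, *, max_attempts=5, base_delay=0.5, max_delay=8.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Run call(), retrying failed attempts with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries


attempts = {"count": 0}


def flaky():
    """Fail twice, then succeed, to simulate a temporary upstream outage."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary upstream failure")
    return "ok"


delays = []  # record the waits instead of sleeping, to keep the demo instant
result = with_backoff(flaky, retry_on=(ConnectionError,), sleep=delays.append)
print(result, attempts["count"], len(delays))
```

With the OpenAI SDK, call could be a lambda wrapping client.chat.completions.create(...), and retry_on could be limited to transient errors such as openai.APIConnectionError and openai.RateLimitError. The SDK client also accepts a max_retries option, which may be sufficient for simple cases.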