YouRouter is designed to carry production model API traffic through a single integration. Your client should still implement standard retry and concurrency handling, because upstream model providers can enforce their own limits.

What to Expect

Area | Behavior
YouRouter gateway | Does not require a separate per-endpoint integration for each model provider.
Provider limits | The upstream provider may return rate limit or concurrency errors.
Automatic routing | Omitting vendor, or using vendor: auto, lets YouRouter route to available providers for the requested model.
Pinned provider | If you set vendor, your request depends on that provider’s availability and limits.

For model API calls, retry transient failures with exponential backoff:
  • 1st retry: wait 1 second
  • 2nd retry: wait 2 seconds
  • 3rd retry: wait 4 seconds
  • Then stop or move to a fallback path.
Use retries for:
  • 429 rate limit or concurrency responses
  • 500 gateway or provider errors
  • temporary network failures
Do not retry immediately in a tight loop. That can make provider-side limits worse.
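
The retry schedule above can be sketched in Python roughly as follows. This is a minimal illustration using only the standard library, not an official YouRouter client: the helper names (`backoff_delays`, `is_retryable`, `call_with_retries`) are hypothetical, and `send` stands for whatever function in your application performs one API call and raises on failure.

```python
import time
import urllib.error

# Status codes this section lists as retryable (429 and 500).
RETRYABLE_STATUS = {429, 500}


def backoff_delays(max_retries=3, base_delay=1.0):
    """Return the wait before each retry: 1s, 2s, 4s with the defaults."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


def is_retryable(exc):
    """Decide whether an exception looks like a transient failure."""
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code in RETRYABLE_STATUS
    # Other URL errors, timeouts, and dropped connections are treated
    # as temporary network failures.
    return isinstance(exc, (urllib.error.URLError, TimeoutError, ConnectionError))


def call_with_retries(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() once, then retry up to max_retries times with backoff."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return send()
        except Exception as exc:
            if attempt == max_retries or not is_retryable(exc):
                raise  # out of retries, or not a transient failure
            sleep(delays[attempt])
```

When the last retry fails, the exception propagates to the caller, which is where a fallback path (for example, a different model) could be chosen. The `sleep` parameter is injectable only so the wait can be replaced in tests.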

Concurrency Tips

  • Start with conservative concurrency in production and increase gradually.
  • Use vendor: auto unless your integration requires a specific provider.
  • Keep model IDs configurable so you can switch models without code changes.
  • Log the request timestamp, model, vendor mode, and request ID for troubleshooting.
  • For streaming responses, treat dropped connections as retryable only if your application can safely restart the request.
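
As one possible way to apply these tips, the sketch below caps in-flight calls with a thread pool, reads the model ID from configuration, and builds a log record with the fields listed above. The environment variable names (`YOUROUTER_MODEL`, `YOUROUTER_MAX_CONCURRENCY`), the helper names, and the source of `request_id` are assumptions made for illustration; `send_one` stands for your own function that performs a single API call.

```python
import logging
import os
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

logger = logging.getLogger("yourouter_client")

# Hypothetical environment variable names; adapt them to your deployment.
MODEL_ID = os.environ.get("YOUROUTER_MODEL", "gpt-4o")
MAX_CONCURRENCY = int(os.environ.get("YOUROUTER_MAX_CONCURRENCY", "2"))


def log_call(model, vendor_mode, request_id):
    """Log the troubleshooting fields for one call and return them."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "vendor_mode": vendor_mode,  # e.g. "auto" or a pinned vendor name
        "request_id": request_id,    # however your application identifies requests
    }
    logger.info("model API call: %s", record)
    return record


def run_bounded(send_one, prompts, max_concurrency=MAX_CONCURRENCY):
    """Send every prompt with at most max_concurrency calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(send_one, prompts))
```

Starting with a small `max_concurrency` and raising it through configuration matches the advice to increase concurrency gradually without code changes.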

Example: Automatic Routing

curl https://api.yourouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $YOUROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Reply with exactly: connected"
      }
    ]
  }'

Example: Provider Pinning

curl https://api.yourouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $YOUROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "vendor: openai" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Reply with exactly: connected"
      }
    ]
  }'
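
For readers calling the API from code rather than curl, the two requests above could be built in Python along these lines. This is a sketch using only the standard library; `build_chat_request` is a hypothetical helper, and it mirrors just the URL, headers, and body fields shown in the curl examples.

```python
import json
import os
import urllib.request

API_URL = "https://api.yourouter.ai/v1/chat/completions"


def build_chat_request(prompt, model="gpt-4o", vendor=None, api_key=None):
    """Build (but do not send) a chat completion request.

    vendor=None omits the vendor header (automatic routing);
    vendor="openai" pins the request as in the second curl example.
    """
    api_key = api_key or os.environ.get("YOUROUTER_API_KEY", "")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if vendor is not None:
        headers["vendor"] = vendor
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
```

The returned object can be passed to `urllib.request.urlopen(request, timeout=30)` to send it. Keeping `vendor` as an optional argument makes automatic routing the default and pinning an explicit choice.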

For routing behavior, see the Router guide. For request fields, see Create Chat Completion.