Rate Limits and Concurrency

YouRouter is designed to support production model API traffic through one integration. For model calls, you should still build standard retry and concurrency handling because upstream model providers can enforce their own limits.

What to Expect

Area	Behavior
YouRouter gateway	Does not require a separate per-endpoint integration for each model provider.
Provider limits	The upstream provider may return rate limit or concurrency errors.
Automatic routing	Omitting `vendor`, or using `vendor: auto`, lets YouRouter route to available providers for the requested model.
Pinned provider	If you set `vendor`, your request depends on that provider’s availability and limits.

Recommended Retry Pattern

For model API calls, retry transient failures with exponential backoff.

1st retry: wait 1 second
2nd retry: wait 2 seconds
3rd retry: wait 4 seconds
then stop or move to a fallback path

Use retries for:

429 rate limit or concurrency responses
500 gateway or provider errors
temporary network failures

Do not retry immediately in a tight loop. Immediate retries add load while the provider is already throttling, which can keep you rate limited for longer.
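The schedule above can be sketched in shell as follows. This is a minimal illustration, not an official YouRouter client: the helper names (`is_retryable`, `send_request`, `retry_with_backoff`), the `SLEEP_CMD` and `RESPONSE_FILE` variables, and the choice to treat curl's `000` status (no HTTP response received) as a temporary network failure are all assumptions made for this example. Only the endpoint, headers, and request body come from the examples on this page.

```shell
#!/usr/bin/env bash
# Sketch: retry transient failures with exponential backoff (1s, 2s, 4s).
# All helper and variable names here are hypothetical, chosen for illustration.

SLEEP_CMD="${SLEEP_CMD:-sleep}"                  # overridable for dry runs
RESPONSE_FILE="${RESPONSE_FILE:-response.json}"  # where the response body is saved

# Succeeds (exit 0) when the status is one this page lists as retryable.
# 000 is what curl's %{http_code} reports when no HTTP response arrived.
is_retryable() {
  case "$1" in
    429|500|000) return 0 ;;
    *) return 1 ;;
  esac
}

# Sends one request and prints only the HTTP status code.
send_request() {
  curl -s -o "$RESPONSE_FILE" -w '%{http_code}' \
    https://api.yourouter.ai/v1/chat/completions \
    -H "Authorization: Bearer $YOUROUTER_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Reply with exactly: connected"}]}'
}

# Runs "$@" (a command that prints a status code) once, then retries up to
# three times while the status is retryable. Prints the final status.
# Returns 1 if retries ran out, 0 otherwise (including non-retryable errors,
# so the caller should still inspect the printed status).
retry_with_backoff() {
  local max_retries=3 delay=1 attempt=0 status
  while :; do
    status="$("$@")" || status="000"
    if ! is_retryable "$status"; then
      echo "$status"
      return 0
    fi
    if [ "$attempt" -ge "$max_retries" ]; then
      echo "$status"
      return 1
    fi
    "$SLEEP_CMD" "$delay"      # wait 1, then 2, then 4 seconds
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

Calling `retry_with_backoff send_request` prints the final status code and returns non-zero once the three retries are used up, which is the point at which to stop or move to a fallback path.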

Concurrency Tips

Start with conservative concurrency in production and increase gradually.
Use `vendor: auto` unless your integration requires a specific provider.
Keep model IDs configurable so you can switch models without code changes.
Log the request timestamp, model, vendor mode, and request ID for troubleshooting.
For streaming responses, treat dropped connections as retryable only if your application can safely restart the request.
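Several of these tips can be combined in a small shell sketch. Everything named here is an assumption made for illustration: the `YOUROUTER_MODEL` and `MAX_CONCURRENCY` environment variables, the `log_request` and `run_bounded` helpers, and the log line format are not part of YouRouter. The request ID is passed in by the caller because this page does not say where it comes from.

```shell
#!/usr/bin/env bash
# Sketch: configurable model ID, a troubleshooting log line, and a
# concurrency cap. All names below are hypothetical.

# Keep the model ID and the concurrency cap in configuration, not in code.
MODEL="${YOUROUTER_MODEL:-gpt-4o}"
MAX_CONCURRENCY="${MAX_CONCURRENCY:-2}"   # start low, raise gradually

# Prints one log line: UTC timestamp, model, vendor mode, request ID.
# $1: vendor mode (for example "auto" or a pinned provider name)
# $2: request ID, supplied by the caller
log_request() {
  local vendor_mode="$1" request_id="$2"
  printf '%s model=%s vendor=%s request_id=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$MODEL" "$vendor_mode" "$request_id"
}

# Reads one job per line on stdin and runs "$1 <job>" for each, with at
# most MAX_CONCURRENCY jobs in flight at once. Each job must be a single
# token without whitespace, such as a path to a request body file.
run_bounded() {
  xargs -P "$MAX_CONCURRENCY" -n 1 "$1"
}
```

For example, `ls requests/*.json | run_bounded ./send_one.sh` would run a hypothetical `send_one.sh` script once per file, two at a time by default. Raising `MAX_CONCURRENCY` is then a configuration change rather than a code change.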

Example: Automatic Routing

curl https://api.yourouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $YOUROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Reply with exactly: connected"
      }
    ]
  }'

Example: Provider Pinning

curl https://api.yourouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $YOUROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "vendor: openai" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Reply with exactly: connected"
      }
    ]
  }'

For routing behavior, see the Router guide. For request fields, see Create Chat Completion.
