Provider Routing

Control which providers serve your requests for cost optimization, latency reduction, or compliance requirements.

What is Provider Routing?

Many models are available through multiple providers (e.g., Llama through Together, Fireworks, or Groq). Provider routing lets you specify which providers to use, their priority order, and fallback behavior.

🎯 Use Cases

  • Optimize for lowest cost
  • Minimize latency
  • Ensure data residency compliance
  • Prefer specific provider features

✅ Benefits

  • Fine-grained control over routing
  • Cost savings of 50% or more in some cases, since provider pricing for the same model varies widely
  • Improved reliability through fallbacks
  • Compliance with data policies

Basic Usage

provider-routing.ts
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://www.superagentstack.com/api/v1',
  apiKey: process.env.OPENROUTER_KEY,
  defaultHeaders: { 'superAgentKey': process.env.SUPER_AGENT_KEY },
});

const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{
    role: 'user',
    content: 'Explain machine learning'
  }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    order: ['Together', 'Fireworks', 'Groq'],  // Try providers in this order
    allow_fallbacks: true                       // Allow other providers if all fail
  }
});

console.log(response.choices[0].message.content);

Provider Configuration Options

| Option | Type | Description |
| --- | --- | --- |
| `order` | `string[]` | List of providers to try in order |
| `require_parameters` | `boolean` | Only use providers supporting all request parameters |
| `allow_fallbacks` | `boolean` | Allow providers not in the `order` list as fallbacks |
| `data_collection` | `"allow" \| "deny"` | Control whether providers can use data for training |
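These options combine into a single `provider` object. The sketch below shows one way to centralize a default routing policy and override it per request; the `buildProviderConfig` helper is hypothetical (not part of any SDK), though the field names mirror the options above.

```typescript
// Hypothetical helper to centralize a routing policy. The field names
// mirror the provider options above; everything else is illustrative.
interface ProviderConfig {
  order?: string[];
  require_parameters?: boolean;
  allow_fallbacks?: boolean;
  data_collection?: 'allow' | 'deny';
}

function buildProviderConfig(overrides: ProviderConfig = {}): ProviderConfig {
  return {
    order: ['Together', 'Fireworks'], // default priority order
    allow_fallbacks: true,            // keep availability high by default
    data_collection: 'deny',          // opt out of training by default
    ...overrides,                     // per-request tweaks win
  };
}

// Per-request: strict routing for a compliance-sensitive call
const strict = buildProviderConfig({ order: ['Together'], allow_fallbacks: false });
```

Spreading `overrides` last means a request can tighten (or relax) any single option without restating the whole policy.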

Strict Provider Selection

Use allow_fallbacks: false to ensure only your specified providers are used.

strict-provider.ts
const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    order: ['Together'],       // Only use Together
    allow_fallbacks: false     // Don't fall back to other providers
  }
});

// If Together is unavailable, the request will fail
// instead of routing to another provider
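Because a strict request fails outright when the provider is unavailable, it is worth catching that failure explicitly so callers get a clear signal. A minimal sketch, assuming an OpenAI-style client; the wrapper and its error message are illustrative, and you should inspect the actual error shape your SDK returns:

```typescript
// Wrap a strict-provider call so availability failures surface clearly
// instead of being silently retried. `client` is any object exposing an
// OpenAI-style chat.completions.create method.
async function completeStrict(
  client: { chat: { completions: { create: (req: unknown) => Promise<unknown> } } },
  request: Record<string, unknown>,
): Promise<unknown> {
  try {
    return await client.chat.completions.create({
      ...request,
      provider: { order: ['Together'], allow_fallbacks: false },
    });
  } catch (err) {
    // Alerting or manual-fallback logic would go here.
    throw new Error(`Strict provider routing failed: ${(err as Error).message}`);
  }
}
```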

Availability Risk

Disabling fallbacks means your request will fail if all specified providers are unavailable. Use this only when provider selection is critical.

Requiring Feature Support

Use require_parameters: true to only route to providers that support all the features you're using.

require-parameters.ts
const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{ role: 'user', content: 'Extract data...' }],
  // Using structured outputs
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'data',
      schema: { type: 'object', properties: { name: { type: 'string' } } }
    }
  },
  // @ts-expect-error - OpenRouter extension
  provider: {
    require_parameters: true  // Only use providers that support json_schema
  }
});

// Only providers supporting structured outputs will be used
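When the schema is enforced, the message content comes back as a JSON string matching your schema, so it can be parsed directly. A minimal sketch of that step; the `parseStructured` helper and the sample payload are illustrative:

```typescript
// Parse a structured-output message into a typed object matching the
// json_schema from the request above.
interface ExtractedData {
  name: string;
}

function parseStructured(content: string): ExtractedData {
  const data = JSON.parse(content) as ExtractedData;
  // Defensive check in case a provider returned a non-conforming payload.
  if (typeof data.name !== 'string') {
    throw new Error('Response did not match the expected schema');
  }
  return data;
}

// e.g. parseStructured(response.choices[0].message.content ?? '{}')
const sample = parseStructured('{"name": "Ada Lovelace"}');
```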

Data Collection Control

Control whether providers can use your data for model training.

data-collection.ts
// Prevent data from being used for training
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Confidential business data...' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    data_collection: 'deny'  // Don't allow data collection
  }
});

// Allow data collection (may reduce costs with some providers)
const response2 = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'General question...' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    data_collection: 'allow'
  }
});
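One way to apply this consistently is to derive the `data_collection` value from how each request is classified. A hypothetical sketch; the sensitivity labels are internal conventions for illustration, not part of any API:

```typescript
// Map an internal sensitivity label to the data_collection setting.
type Sensitivity = 'public' | 'internal' | 'confidential';

function dataCollectionFor(level: Sensitivity): 'allow' | 'deny' {
  // Only clearly public content is eligible for provider-side training.
  return level === 'public' ? 'allow' : 'deny';
}

// e.g. provider: { data_collection: dataCollectionFor('confidential') }
```

Defaulting to `'deny'` for anything not explicitly public keeps a misclassified request on the safe side.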

Common Providers

| Provider | Strengths | Models |
| --- | --- | --- |
| OpenAI | Best GPT models, reliable | GPT-4o, GPT-4o-mini |
| Anthropic | Claude models, safety focus | Claude 3.5 Sonnet |
| Together | Open models, good pricing | Llama, Mistral |
| Fireworks | Fast inference, open models | Llama, Mixtral |
| Groq | Ultra-fast inference | Llama, Mixtral |
| Google | Gemini models | Gemini 2.0 Flash |

curl Example

bash
curl -X POST https://www.superagentstack.com/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_KEY" \
  -H "superAgentKey: $SUPER_AGENT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.1-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "order": ["Together", "Fireworks"],
      "allow_fallbacks": true,
      "data_collection": "deny"
    }
  }'

Best Practices

Enable Fallbacks for Production

Keep allow_fallbacks: true in production to ensure high availability, unless you have strict compliance requirements.

Test Provider Compatibility

Different providers may have slightly different behaviors. Test your application with each provider in your order list.

Monitor Costs

Provider pricing varies significantly. Monitor which providers are being used and adjust your order list to optimize costs.
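OpenRouter-style responses typically report which provider actually served the request in a top-level `provider` field (verify this against your gateway's response shape). A sketch that tallies it so the `order` list can be tuned against real usage:

```typescript
// Tally which provider handled each completion. The `provider` field on
// the response is assumed from OpenRouter's response format; treat it as
// optional in case your gateway omits it.
const providerCounts = new Map<string, number>();

function recordProvider(response: { provider?: string }): string {
  const name = response.provider ?? 'unknown';
  providerCounts.set(name, (providerCounts.get(name) ?? 0) + 1);
  return name;
}

// e.g. recordProvider(response as any) after each request
recordProvider({ provider: 'Together' });
recordProvider({ provider: 'Together' });
recordProvider({ provider: 'Groq' });
```

Pairing these counts with each provider's per-token pricing gives a concrete picture of where spend is going.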

Next Steps