Provider Routing

Control which providers serve your requests for cost optimization, latency reduction, or compliance requirements.

What is Provider Routing?

Many models are available through multiple providers (e.g., Llama through Together, Fireworks, or Groq). Provider routing lets you specify which providers to use, their priority order, and fallback behavior.

🎯 Use Cases

  • Optimize for lowest cost
  • Minimize latency
  • Ensure data residency compliance
  • Prefer specific provider features

✅ Benefits

  • Fine-grained control over routing
  • Cost savings of 50% or more in some cases, since provider pricing for the same model varies widely
  • Improved reliability through fallbacks
  • Compliance with data policies

Basic Usage

provider-routing.ts
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://www.superagentstack.com/api/v1',
  apiKey: process.env.OPENROUTER_KEY,
  defaultHeaders: { 'superAgentKey': process.env.SUPER_AGENT_KEY },
});

const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{
    role: 'user',
    content: 'Explain machine learning'
  }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    order: ['Together', 'Fireworks', 'Groq'],  // Try providers in this order
    allow_fallbacks: true                       // Allow other providers if all fail
  }
});

console.log(response.choices[0].message.content);

Provider Configuration Options

| Option | Type | Description |
| --- | --- | --- |
| `order` | `string[]` | List of providers to try in order |
| `require_parameters` | `boolean` | Only use providers supporting all request parameters |
| `allow_fallbacks` | `boolean` | Allow providers not in the `order` list as fallbacks |
| `data_collection` | `"allow" \| "deny"` | Control whether providers can use data for training |
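These options combine into a single `provider` object. The sketch below shows one way to centralize a default routing policy and override it per request; the `buildProviderConfig` helper is hypothetical (not part of any SDK), though the field names mirror the options above.

```typescript
// Hypothetical helper to centralize a routing policy. The field names
// mirror the provider options above; everything else is illustrative.
interface ProviderConfig {
  order?: string[];
  require_parameters?: boolean;
  allow_fallbacks?: boolean;
  data_collection?: 'allow' | 'deny';
}

function buildProviderConfig(overrides: ProviderConfig = {}): ProviderConfig {
  return {
    order: ['Together', 'Fireworks'], // default priority order
    allow_fallbacks: true,            // keep availability high by default
    data_collection: 'deny',          // opt out of training by default
    ...overrides,                     // per-request tweaks win
  };
}

// Per-request: strict routing for a compliance-sensitive call
const strict = buildProviderConfig({ order: ['Together'], allow_fallbacks: false });
```

Spreading `overrides` last means a request can tighten (or relax) any single option without restating the whole policy.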

Strict Provider Selection

Use allow_fallbacks: false to ensure only your specified providers are used.

strict-provider.ts
const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    order: ['Together'],       // Only use Together
    allow_fallbacks: false     // Don't fall back to other providers
  }
});

// If Together is unavailable, the request will fail
// instead of routing to another provider
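Because a strict request fails outright when the provider is unavailable, it is worth catching that failure explicitly so callers get a clear signal. A minimal sketch, assuming an OpenAI-style client; the wrapper and its error message are illustrative, and you should inspect the actual error shape your SDK returns:

```typescript
// Wrap a strict-provider call so availability failures surface clearly
// instead of being silently retried. `client` is any object exposing an
// OpenAI-style chat.completions.create method.
async function completeStrict(
  client: { chat: { completions: { create: (req: unknown) => Promise<unknown> } } },
  request: Record<string, unknown>,
): Promise<unknown> {
  try {
    return await client.chat.completions.create({
      ...request,
      provider: { order: ['Together'], allow_fallbacks: false },
    });
  } catch (err) {
    // Alerting or manual-fallback logic would go here.
    throw new Error(`Strict provider routing failed: ${(err as Error).message}`);
  }
}
```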

Availability Risk

Disabling fallbacks means your request will fail if all specified providers are unavailable. Use this only when provider selection is critical.

Requiring Feature Support

Use require_parameters: true to only route to providers that support all the features you're using.

require-parameters.ts
const response = await client.chat.completions.create({
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{ role: 'user', content: 'Extract data...' }],
  // Using structured outputs
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'data',
      schema: { type: 'object', properties: { name: { type: 'string' } } }
    }
  },
  // @ts-expect-error - OpenRouter extension
  provider: {
    require_parameters: true  // Only use providers that support json_schema
  }
});

// Only providers supporting structured outputs will be used
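When the schema is enforced, the message content comes back as a JSON string matching your schema, so it can be parsed directly. A minimal sketch of that step; the `parseStructured` helper and the sample payload are illustrative:

```typescript
// Parse a structured-output message into a typed object matching the
// json_schema from the request above.
interface ExtractedData {
  name: string;
}

function parseStructured(content: string): ExtractedData {
  const data = JSON.parse(content) as ExtractedData;
  // Defensive check in case a provider returned a non-conforming payload.
  if (typeof data.name !== 'string') {
    throw new Error('Response did not match the expected schema');
  }
  return data;
}

// e.g. parseStructured(response.choices[0].message.content ?? '{}')
const sample = parseStructured('{"name": "Ada Lovelace"}');
```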

Data Collection Control

Control whether providers can use your data for model training.

data-collection.ts
// Prevent data from being used for training
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Confidential business data...' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    data_collection: 'deny'  // Don't allow data collection
  }
});

// Allow data collection (may reduce costs with some providers)
const response2 = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'General question...' }],
  // @ts-expect-error - OpenRouter extension
  provider: {
    data_collection: 'allow'
  }
});
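One way to apply this consistently is to derive the `data_collection` value from how each request is classified. A hypothetical sketch; the sensitivity labels are internal conventions for illustration, not part of any API:

```typescript
// Map an internal sensitivity label to the data_collection setting.
type Sensitivity = 'public' | 'internal' | 'confidential';

function dataCollectionFor(level: Sensitivity): 'allow' | 'deny' {
  // Only clearly public content is eligible for provider-side training.
  return level === 'public' ? 'allow' : 'deny';
}

// e.g. provider: { data_collection: dataCollectionFor('confidential') }
```

Defaulting to `'deny'` for anything not explicitly public keeps a misclassified request on the safe side.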

Common Providers

| Provider | Strengths | Models |
| --- | --- | --- |
| OpenAI | Best GPT models, reliable | GPT-4o, GPT-4o-mini |
| Anthropic | Claude models, safety focus | Claude 3.5 Sonnet |
| Together | Open models, good pricing | Llama, Mistral |
| Fireworks | Fast inference, open models | Llama, Mixtral |
| Groq | Ultra-fast inference | Llama, Mixtral |
| Google | Gemini models | Gemini 2.0 Flash |

curl Example

bash
curl -X POST https://www.superagentstack.com/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_KEY" \
  -H "superAgentKey: $SUPER_AGENT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.1-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "order": ["Together", "Fireworks"],
      "allow_fallbacks": true,
      "data_collection": "deny"
    }
  }'

Best Practices

Enable Fallbacks for Production

Keep allow_fallbacks: true in production to ensure high availability, unless you have strict compliance requirements.

Test Provider Compatibility

Different providers may have slightly different behaviors. Test your application with each provider in your order list.

Monitor Costs

Provider pricing varies significantly. Monitor which providers are being used and adjust your order list to optimize costs.
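OpenRouter-style responses typically report which provider actually served the request in a top-level `provider` field (verify this against your gateway's response shape). A sketch that tallies it so the `order` list can be tuned against real usage:

```typescript
// Tally which provider handled each completion. The `provider` field on
// the response is assumed from OpenRouter's response format; treat it as
// optional in case your gateway omits it.
const providerCounts = new Map<string, number>();

function recordProvider(response: { provider?: string }): string {
  const name = response.provider ?? 'unknown';
  providerCounts.set(name, (providerCounts.get(name) ?? 0) + 1);
  return name;
}

// e.g. recordProvider(response as any) after each request
recordProvider({ provider: 'Together' });
recordProvider({ provider: 'Together' });
recordProvider({ provider: 'Groq' });
```

Pairing these counts with each provider's per-token pricing gives a concrete picture of where spend is going.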

Next Steps