# Conversation Summarization
Automatic summarization of long conversations to preserve context while staying within token limits.
## Overview
Long conversations can exceed token limits, causing context to be lost. Conversation summarization automatically compresses older messages into a summary, preserving important context while keeping the conversation within limits.
## Memory Strategies

- **`truncate`**: Simply drops the oldest messages when the limit is reached. Fast, but loses context.
- **`summarize`** ⭐: Summarizes older messages into a condensed form. Preserves context intelligently.
- **`sliding_window`**: Keeps only the most recent N messages. Good for short-term context.
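As a rough sketch, the three strategies can be modeled as pure functions over a message list. The types and function names below are illustrative only, not part of the API, and the actual server-side behavior may differ:

```typescript
// Illustrative sketch of the three memory strategies applied to a message
// list. This is NOT the service's implementation; it only models the idea.
type Message = { role: string; content: string };

// truncate: drop the oldest messages once the list exceeds the limit.
function truncate(messages: Message[], limit: number): Message[] {
  return messages.slice(-limit);
}

// sliding_window: keep only the most recent N messages on every request.
function slidingWindow(messages: Message[], n: number): Message[] {
  return messages.slice(-n);
}

// summarize: condense older messages into one synthetic message and keep
// the recent ones intact. `summarizeFn` stands in for an LLM call here.
function summarize(
  messages: Message[],
  keepRecent: number,
  summarizeFn: (older: Message[]) => string
): Message[] {
  if (messages.length <= keepRecent) return messages;
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(-keepRecent);
  return [{ role: 'system', content: `Summary: ${summarizeFn(older)}` }, ...recent];
}
```

Note the difference between `truncate` and `sliding_window`: both keep the tail of the list, but truncation only kicks in when a limit is hit, while a sliding window is enforced on every request.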
## Basic Usage

```typescript
// summarization.ts
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://www.superagentstack.com/api/v1',
  apiKey: process.env.OPENROUTER_KEY,
  defaultHeaders: { 'superAgentKey': process.env.SUPER_AGENT_KEY },
});

const response = await client.chat.completions.create({
  model: 'openai/gpt-4o-mini',
  messages: [{ role: 'user', content: 'Continue our discussion' }],
  sessionId: crypto.randomUUID(),
  saveToMemory: true,
  memoryStrategy: 'summarize', // Use summarization
  summaryThreshold: 20, // Summarize after 20 messages
});
```

## Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `memoryStrategy` | string | `"truncate"` | One of `"truncate"`, `"summarize"`, or `"sliding_window"`. |
| `summaryThreshold` | number | 50 | Number of messages before summarization triggers. Minimum: 10. |
**Minimum Threshold:** The `summaryThreshold` must be at least 10. Lower values will be rejected.

## How Summarization Works
1. The conversation reaches the `summaryThreshold`.
2. Older messages (before the threshold) are sent to the LLM for summarization.
3. The summary is stored and replaces the older messages.
4. Recent messages are kept intact for immediate context.
5. Future requests include the summary plus the recent messages.
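The steps above can be sketched as a small state transition. The service performs this server-side; the `keepRecent` parameter and the exact split between "older" and "recent" messages are assumptions for illustration:

```typescript
// Sketch of the summarization trigger described above. Names like
// `keepRecent` and `SessionState` are hypothetical, for illustration only.
type Msg = { role: string; content: string };

interface SessionState {
  summary: string | null; // condensed form of older messages, if any
  messages: Msg[];        // messages kept verbatim
}

// When the message count reaches the threshold, summarize everything except
// the most recent messages; otherwise leave the state untouched.
function maybeSummarize(
  state: SessionState,
  threshold: number,
  keepRecent: number,
  summarizeFn: (older: Msg[]) => string // stands in for an LLM call
): SessionState {
  if (state.messages.length < threshold) return state;
  const older = state.messages.slice(0, state.messages.length - keepRecent);
  const recent = state.messages.slice(-keepRecent);
  return { summary: summarizeFn(older), messages: recent };
}
```

On each later request, the stored `summary` is prepended to `messages` to form the context window.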
## Example Flow

```typescript
// summarization-flow.ts
// Messages 1-20: Normal conversation
// ...
// Message 21: Threshold reached, summarization triggers
//   Old messages (1-15)     → Summarized
//   Recent messages (16-21) → Kept intact
// Message 22+: Context includes:
//   - Summary of messages 1-15
//   - Full messages 16-22
// The AI maintains context without exceeding token limits!
```

## Strategy Comparison
| Strategy | Context Preservation | Speed | Best For |
|---|---|---|---|
| truncate | Low | Fast | Simple chats, cost-sensitive |
| summarize | High | Medium | Long conversations, complex topics |
| sliding_window | Medium | Fast | Recent context only matters |
## Best Practices

- Use `summarize` for customer support and long-form conversations.
- Use `sliding_window` for quick Q&A where only recent context matters.
- Set `summaryThreshold` based on your typical conversation length.
- Higher thresholds mean more context is retained before summarization, but also higher token usage.
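One illustrative way to derive a threshold from your typical conversation length. This heuristic is not part of the API; only the minimum of 10 comes from the documentation above:

```typescript
// Hypothetical helper: trigger summarization around half the typical
// conversation length, clamped to the documented minimum of 10.
function suggestThreshold(typicalConversationLength: number): number {
  return Math.max(10, Math.round(typicalConversationLength / 2));
}
```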