Request Headers
Fine-tune optimization behavior Control how your requests are optimized with HTTP headers.
Authentication Headers​
x-bf-vk (Required)​
Your Virtual Key for authentication and governance.
x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY
Optimization Control Headers​
X-Savings-Level (Tier 5)​
Control the maximum context window size.
| Level | Max Tokens | Savings | Best For |
|---|---|---|---|
extreme | 16,384 | 89% | Maximum compression |
max | 32,768 | 78% | Balanced cost/quality |
med | 65,536 | 56% | High fidelity |
min | 131,072 | 12% | Near-original quality |
off | Unlimited | 0% | No optimization |
X-Savings-Level: extreme
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-5-20250929",
messages=[...],
extra_headers={
"X-Savings-Level": "extreme"
}
)
X-Vanishing-Context (Tier 2)​
Enable document-aware optimization for large texts.
X-Vanishing-Context: true
Best for:
- Document QA (>100 pages)
- Legal contract analysis
- Technical manual queries
- Research paper synthesis
X-Korad-RLM (Tier 3)​
Enable recursive summarization for complex reasoning.
X-Korad-RLM: true
Best for:
- Multi-step document analysis
- Codebase analysis across files
- Complex reasoning tasks
- Evolving conversation threads
Combining Headers​
You can combine headers for specific scenarios:
# Document QA with cost control
headers = {
"X-Vanishing-Context": "true",
"X-Savings-Level": "med" # Fallback cap
}
# Complex reasoning with hard cap
headers = {
"X-Korad-RLM": "true",
"X-Savings-Level": "max" # Prevent runaway costs
}
Header Priority​
Headers are checked in this order:
X-Savings-Level- Tier 5 (highest priority)X-Vanishing-Context- Tier 2X-Korad-RLM- Tier 3- (No header) - Tier 4 (default)
Response Headers​
Every response includes detailed cost attribution:
Token Information​
X-Korad-Original-Tokens: 150006
X-Korad-Optimized-Tokens: 32743
X-Korad-Tokens-Saved: 117263
X-Korad-Reduction-Pct: 78%
Cost Information​
X-Korad-Savings-USD: $0.029214
X-Korad-Theoretical-Cost: $0.037502
X-Korad-Actual-Cost: $0.008288
X-Korad-Billed-Amount: $0.056253
X-Korad-Profit-Margin: 1.50x
Strategy Information​
X-Korad-Strategy: Savings-Slider (max: 32768 tokens)
Strategy values:
Passthrough- No optimization neededSavings-Slider (level: N tokens)- Hard cap appliedVanishing-Context (redis peek+fetch)- Document optimizationRecursive-RLM (controller: model)- Recursive summarizationFamily-Lock (model -> model)- Provider family summarizationSemantic-Cache- Cached response
Control your optimization strategy with headers.