Skip to main content

Request Headers

Fine-tune optimization behavior Control how your requests are optimized with HTTP headers.

Authentication Headers​

x-bf-vk (Required)​

Your Virtual Key for authentication and governance.

x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY

Optimization Control Headers​

X-Savings-Level (Tier 5)​

Control the maximum context window size.

LevelMax TokensSavingsBest For
extreme16,38489%Maximum compression
max32,76878%Balanced cost/quality
med65,53656%High fidelity
min131,07212%Near-original quality
offUnlimited0%No optimization
X-Savings-Level: extreme
response = client.chat.completions.create(
model="anthropic/claude-sonnet-4-5-20250929",
messages=[...],
extra_headers={
"X-Savings-Level": "extreme"
}
)

X-Vanishing-Context (Tier 2)​

Enable document-aware optimization for large texts.

X-Vanishing-Context: true

Best for:

  • Document QA (>100 pages)
  • Legal contract analysis
  • Technical manual queries
  • Research paper synthesis

X-Korad-RLM (Tier 3)​

Enable recursive summarization for complex reasoning.

X-Korad-RLM: true

Best for:

  • Multi-step document analysis
  • Codebase analysis across files
  • Complex reasoning tasks
  • Evolving conversation threads

Combining Headers​

You can combine headers for specific scenarios:

# Document QA with cost control
headers = {
"X-Vanishing-Context": "true",
"X-Savings-Level": "med" # Fallback cap
}

# Complex reasoning with hard cap
headers = {
"X-Korad-RLM": "true",
"X-Savings-Level": "max" # Prevent runaway costs
}

Header Priority​

Headers are checked in this order:

  1. X-Savings-Level - Tier 5 (highest priority)
  2. X-Vanishing-Context - Tier 2
  3. X-Korad-RLM - Tier 3
  4. (No header) - Tier 4 (default)

Response Headers​

Every response includes detailed cost attribution:

Token Information​

X-Korad-Original-Tokens: 150006
X-Korad-Optimized-Tokens: 32743
X-Korad-Tokens-Saved: 117263
X-Korad-Reduction-Pct: 78%

Cost Information​

X-Korad-Savings-USD: $0.029214
X-Korad-Theoretical-Cost: $0.037502
X-Korad-Actual-Cost: $0.008288
X-Korad-Billed-Amount: $0.056253
X-Korad-Profit-Margin: 1.50x

Strategy Information​

X-Korad-Strategy: Savings-Slider (max: 32768 tokens)

Strategy values:

  • Passthrough - No optimization needed
  • Savings-Slider (level: N tokens) - Hard cap applied
  • Vanishing-Context (redis peek+fetch) - Document optimization
  • Recursive-RLM (controller: model) - Recursive summarization
  • Family-Lock (model -> model) - Provider family summarization
  • Semantic-Cache - Cached response

Control your optimization strategy with headers.