# Quick Start

Get up and running with Korad.AI Gen 3 in minutes.
## Overview

Korad.AI Gen 3 is an LLM Optimization Gateway that automatically reduces your AI costs by up to 99% while maintaining output quality. It sits between your application and LLM providers (Anthropic, OpenAI, DeepSeek, etc.), applying intelligent optimization strategies.
## How It Works

```
┌─────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  Your App   │ ───> │  Korad.AI Gen 3  │ ───> │  LLM Providers  │
└─────────────┘      │   Optimization   │      │  (Claude, GPT)  │
                     │     Gateway      │      └─────────────────┘
                     └──────────────────┘
                              │
                              ▼
                     Up to 99% Savings
```
## Prerequisites

Before you begin, ensure you have:

- Docker and Docker Compose installed
- API keys for at least one LLM provider (Anthropic, OpenAI, DeepSeek, etc.)
## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/koradai/bifrost.git
cd bifrost
```
### 2. Configure Environment

```bash
# Copy the example environment file
cp .env.example .env

# Edit .env and add your API keys
nano .env
```
Add at least one provider's API key:

```bash
# Anthropic (Claude models)
ANTHROPIC_API_KEY=sk-ant-your-key-here

# OpenAI (GPT models)
OPENAI_API_KEY=sk-proj-your-key-here

# DeepSeek
DEEPSEEK_API_KEY=sk-your-key-here
```
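If you want to sanity-check your configuration before starting the stack, a small helper like the following can confirm that at least one provider key is set. This is a hypothetical convenience script, not part of the repository; the variable names match the `.env` entries above.

```python
import os

# The provider key names used in .env above.
PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "DEEPSEEK_API_KEY")

def configured_providers(env=None):
    """Return the names of provider keys that are set and non-empty."""
    if env is None:
        env = os.environ
    return [k for k in PROVIDER_KEYS if env.get(k)]

# Example with a sample environment dict:
sample = {"ANTHROPIC_API_KEY": "sk-ant-your-key-here"}
print(configured_providers(sample))  # ['ANTHROPIC_API_KEY']
```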
### 3. Start the Stack

```bash
# Start all services with Docker Compose
docker-compose up -d
```
This starts:
- Optimizer (port 8084) - Main API endpoint with optimization tiers
- Bifrost Gateway (port 8081) - Virtual keys, governance, MCP tools
- Redis (port 6379) - State store for Vanishing Context
- MCP Tools Server - Built-in tools (web search, code execution)
### 4. Verify Deployment

```bash
# Run the verification script
python golden_run.py
```
## Your First Request

### Get Your Virtual Key

The platform uses Virtual Keys for authentication and governance. Get your key:

```bash
curl http://localhost:8081/api/governance/virtual-keys
```

Copy a virtual key (format: `sk-bf-xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`).
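If you are wiring keys into scripts, a loose format check can catch copy-paste mistakes early. The sketch below is hypothetical and only mirrors the placeholder shown above (an `sk-bf-` prefix followed by hyphen-separated alphanumeric groups); actual key segment lengths may differ.

```python
import re

# Loose check derived from the placeholder sk-bf-xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:
# the sk-bf- prefix followed by five hyphen-separated alphanumeric groups.
VK_PATTERN = re.compile(r"sk-bf-[A-Za-z0-9]+(-[A-Za-z0-9]+){4}")

def looks_like_virtual_key(key: str) -> bool:
    """Return True if `key` matches the documented placeholder shape."""
    return VK_PATTERN.fullmatch(key) is not None

print(looks_like_virtual_key("sk-bf-abc12-de34-fa56-bc78-0123456789ab"))  # True
```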
### Make a Request

```bash
curl -X POST http://localhost:8084/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-vk: YOUR_VIRTUAL_KEY_HERE" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5-20250929",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ],
    "max_tokens": 100
  }'
```
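The same request can be built from Python. This is a minimal sketch using only the standard library; the endpoint, body, and `x-bf-vk` header come from the curl example above, and the key value is a placeholder you must replace.

```python
import json
import urllib.request

url = "http://localhost:8084/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "x-bf-vk": "YOUR_VIRTUAL_KEY_HERE",  # placeholder: substitute your virtual key
}
payload = {
    "model": "anthropic/claude-sonnet-4-5-20250929",
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "max_tokens": 100,
}
body = json.dumps(payload).encode("utf-8")

# To send it (requires the stack to be running):
# req = urllib.request.Request(url, data=body, headers=headers, method="POST")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```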
### Check Your Savings

Every response includes cost attribution headers:

```http
HTTP/1.1 200 OK
Content-Type: application/json
X-Korad-Original-Tokens: 5000
X-Korad-Optimized-Tokens: 5000
X-Korad-Savings-USD: $0.000000
X-Korad-Strategy: Passthrough (no optimization needed)
X-Korad-Billed-Amount: $0.001250
X-Korad-Profit-Margin: 1.50x
```
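If you want to track savings programmatically, the headers parse into numbers with a small helper. This is a hypothetical utility based only on the example response above: dollar values carry a `$` prefix and the margin carries an `x` suffix.

```python
def parse_savings(headers: dict) -> dict:
    """Convert the X-Korad-* cost attribution headers to numeric values."""
    return {
        "original_tokens": int(headers["X-Korad-Original-Tokens"]),
        "optimized_tokens": int(headers["X-Korad-Optimized-Tokens"]),
        "savings_usd": float(headers["X-Korad-Savings-USD"].lstrip("$")),
        "billed_usd": float(headers["X-Korad-Billed-Amount"].lstrip("$")),
        "profit_margin": float(headers["X-Korad-Profit-Margin"].rstrip("x")),
    }

# The example response above:
example = {
    "X-Korad-Original-Tokens": "5000",
    "X-Korad-Optimized-Tokens": "5000",
    "X-Korad-Savings-USD": "$0.000000",
    "X-Korad-Billed-Amount": "$0.001250",
    "X-Korad-Profit-Margin": "1.50x",
}
print(parse_savings(example)["billed_usd"])  # 0.00125
```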
## Optimization Tiers

The platform automatically applies optimizations based on your request:

| Tier | Name | Savings | Trigger |
|------|------|---------|---------|
| 1 | Semantic Cache | 100% | Duplicate requests |
| 2 | Vanishing Context | 99% | `X-Vanishing-Context: true` |
| 3 | Recursive RLM | 97% | `X-Korad-RLM: true` |
| 4 | Family-Locked Summary | 30% | Contexts > 20k tokens |
| 5 | Savings Slider | 89% | `X-Savings-Level: extreme` |
Learn more in the Savings Waterfall guide.
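As a back-of-the-envelope illustration of what those tier percentages mean, the arithmetic below estimates the token count left after a tier's advertised savings rate is applied. This is purely illustrative; actual savings depend on request content, and the tier names here are shorthand for the table rows above.

```python
# Advertised savings rates from the tier table above.
TIER_SAVINGS = {
    "semantic_cache": 1.00,
    "vanishing_context": 0.99,
    "recursive_rlm": 0.97,
    "family_locked_summary": 0.30,
    "savings_slider_extreme": 0.89,
}

def optimized_tokens(original: int, tier: str) -> int:
    """Estimate tokens remaining after the tier's savings rate is applied."""
    return round(original * (1 - TIER_SAVINGS[tier]))

# A 20k-token context through Vanishing Context (99% savings):
print(optimized_tokens(20_000, "vanishing_context"))  # 200
```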
## Using with Claude Code

Configure Claude Code to use Korad.AI:

### Option 1: Settings File

Add to `~/.claude/settings.json`:

```json
{
  "apiUrl": "http://localhost:8084/v1",
  "apiKey": "sk-bf-YOUR_VIRTUAL_KEY"
}
```
### Option 2: Environment Variables

```bash
export OPENAI_API_BASE="http://localhost:8084/v1"
export OPENAI_API_KEY="sk-bf-YOUR_VIRTUAL_KEY"
```
## Next Steps

- 📖 Read the Savings Waterfall to understand all optimization tiers
- 🔧 Explore API Reference for all available options
- 🚀 Check out Integrations for LangChain, LlamaIndex, and more
- 📦 Learn about Deployment for production
## Support

- 📚 Documentation
- 💬 Discord
- 🐛 Issues
Start optimizing your AI costs today.