# Quick Start

Get up and running with Korad.AI Gen 3 in minutes.
## Overview

Korad.AI Gen 3 is an LLM Optimization Gateway that automatically reduces your AI costs by up to 99% while maintaining output quality. It sits between your application and LLM providers (Anthropic, OpenAI, DeepSeek, etc.), applying intelligent optimization strategies.
## How It Works

```
┌─────────────┐      ┌──────────────────┐      ┌─────────────────┐
│  Your App   │ ───> │  Korad.AI Gen 3  │ ───> │  LLM Providers  │
└─────────────┘      │   Optimization   │      │  (Claude, GPT)  │
                     │     Gateway      │      └─────────────────┘
                     └──────────────────┘
                              │
                              ▼
                     Up to 99% Savings
```
## Prerequisites

Before you begin, ensure you have:

- Docker and Docker Compose installed
- API keys for at least one LLM provider (Anthropic, OpenAI, DeepSeek, etc.)
## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/koradai/bifrost.git
cd bifrost
```
### 2. Configure Environment

```bash
# Copy the example environment file
cp .env.example .env

# Edit .env and add your API keys
nano .env
```
Add at least one provider's API key:

```bash
# Anthropic (Claude models)
ANTHROPIC_API_KEY=sk-ant-your-key-here

# OpenAI (GPT models)
OPENAI_API_KEY=sk-proj-your-key-here

# DeepSeek
DEEPSEEK_API_KEY=sk-your-key-here
```
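If you want to sanity-check your configuration before starting the stack, a small helper like the following can confirm that at least one provider key is set. This is a hypothetical convenience script, not part of the repository; the variable names match the `.env` entries above.

```python
import os

# The provider key names used in .env above.
PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "DEEPSEEK_API_KEY")

def configured_providers(env=None):
    """Return the names of provider keys that are set and non-empty."""
    if env is None:
        env = os.environ
    return [k for k in PROVIDER_KEYS if env.get(k)]

# Example with a sample environment dict:
sample = {"ANTHROPIC_API_KEY": "sk-ant-your-key-here"}
print(configured_providers(sample))  # ['ANTHROPIC_API_KEY']
```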
### 3. Start the Stack

```bash
# Start all services with Docker Compose
docker-compose up -d
```
This starts:
- Optimizer (port 8084) - Main API endpoint with optimization tiers
- Bifrost Gateway (port 8081) - Virtual keys, governance, MCP tools
- Redis (port 6379) - State store for Vanishing Context
- MCP Tools Server - Built-in tools (web search, code execution)
### 4. Verify Deployment

```bash
# Run the verification script
python golden_run.py
```
## Your First Request

### Get Your Virtual Key

The platform uses Virtual Keys for authentication and governance. Get your key:

```bash
curl http://localhost:8081/api/governance/virtual-keys
```

Copy a virtual key (format: `sk-bf-xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`).
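If you are wiring keys into scripts, a loose format check can catch copy-paste mistakes early. The sketch below is hypothetical and only mirrors the placeholder shown above (an `sk-bf-` prefix followed by hyphen-separated alphanumeric groups); actual key segment lengths may differ.

```python
import re

# Loose check derived from the placeholder sk-bf-xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:
# the sk-bf- prefix followed by five hyphen-separated alphanumeric groups.
VK_PATTERN = re.compile(r"sk-bf-[A-Za-z0-9]+(-[A-Za-z0-9]+){4}")

def looks_like_virtual_key(key: str) -> bool:
    """Return True if `key` matches the documented placeholder shape."""
    return VK_PATTERN.fullmatch(key) is not None

print(looks_like_virtual_key("sk-bf-abc12-de34-fa56-bc78-0123456789ab"))  # True
```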
### Make a Request

```bash
curl -X POST http://localhost:8084/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-vk: YOUR_VIRTUAL_KEY_HERE" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5-20250929",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ],
    "max_tokens": 100
  }'
```
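The same request can be built from Python. This is a minimal sketch using only the standard library; the endpoint, body, and `x-bf-vk` header come from the curl example above, and the key value is a placeholder you must replace.

```python
import json
import urllib.request

url = "http://localhost:8084/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "x-bf-vk": "YOUR_VIRTUAL_KEY_HERE",  # placeholder: substitute your virtual key
}
payload = {
    "model": "anthropic/claude-sonnet-4-5-20250929",
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "max_tokens": 100,
}
body = json.dumps(payload).encode("utf-8")

# To send it (requires the stack to be running):
# req = urllib.request.Request(url, data=body, headers=headers, method="POST")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```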
### Check Your Savings

Every response includes cost attribution headers:

```http
HTTP/1.1 200 OK
Content-Type: application/json
X-Korad-Original-Tokens: 5000
X-Korad-Optimized-Tokens: 5000
X-Korad-Savings-USD: $0.000000
X-Korad-Strategy: Passthrough (no optimization needed)
X-Korad-Billed-Amount: $0.001250
X-Korad-Profit-Margin: 1.50x
```
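If you want to track savings programmatically, the headers parse into numbers with a small helper. This is a hypothetical utility based only on the example response above: dollar values carry a `$` prefix and the margin carries an `x` suffix.

```python
def parse_savings(headers: dict) -> dict:
    """Convert the X-Korad-* cost attribution headers to numeric values."""
    return {
        "original_tokens": int(headers["X-Korad-Original-Tokens"]),
        "optimized_tokens": int(headers["X-Korad-Optimized-Tokens"]),
        "savings_usd": float(headers["X-Korad-Savings-USD"].lstrip("$")),
        "billed_usd": float(headers["X-Korad-Billed-Amount"].lstrip("$")),
        "profit_margin": float(headers["X-Korad-Profit-Margin"].rstrip("x")),
    }

# The example response above:
example = {
    "X-Korad-Original-Tokens": "5000",
    "X-Korad-Optimized-Tokens": "5000",
    "X-Korad-Savings-USD": "$0.000000",
    "X-Korad-Billed-Amount": "$0.001250",
    "X-Korad-Profit-Margin": "1.50x",
}
print(parse_savings(example)["billed_usd"])  # 0.00125
```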
## Optimization Tiers

The platform automatically applies optimizations based on your request:

| Tier | Name | Savings | Trigger |
|------|------|---------|---------|
| 1 | Semantic Cache | 100% | Duplicate requests |
| 2 | Vanishing Context | 99% | `X-Vanishing-Context: true` |
| 3 | Recursive RLM | 97% | `X-Korad-RLM: true` |
| 4 | Family-Locked Summary | 30% | Contexts > 20k tokens |
| 5 | Savings Slider | 89% | `X-Savings-Level: extreme` |
Learn more in the Savings Waterfall guide.
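As a back-of-the-envelope illustration of what those tier percentages mean, the arithmetic below estimates the token count left after a tier's advertised savings rate is applied. This is purely illustrative; actual savings depend on request content, and the tier names here are shorthand for the table rows above.

```python
# Advertised savings rates from the tier table above.
TIER_SAVINGS = {
    "semantic_cache": 1.00,
    "vanishing_context": 0.99,
    "recursive_rlm": 0.97,
    "family_locked_summary": 0.30,
    "savings_slider_extreme": 0.89,
}

def optimized_tokens(original: int, tier: str) -> int:
    """Estimate tokens remaining after the tier's savings rate is applied."""
    return round(original * (1 - TIER_SAVINGS[tier]))

# A 20k-token context through Vanishing Context (99% savings):
print(optimized_tokens(20_000, "vanishing_context"))  # 200
```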
## Using with Claude Code

Configure Claude Code to use Korad.AI:

### Option 1: Settings File

Add to `~/.claude/settings.json`:

```json
{
  "apiUrl": "http://localhost:8084/v1",
  "apiKey": "sk-bf-YOUR_VIRTUAL_KEY"
}
```
### Option 2: Environment Variables

```bash
export OPENAI_API_BASE="http://localhost:8084/v1"
export OPENAI_API_KEY="sk-bf-YOUR_VIRTUAL_KEY"
```
## Next Steps

- 📖 Read the Savings Waterfall to understand all optimization tiers
- 🔧 Explore API Reference for all available options
- 🚀 Check out Integrations for LangChain, LlamaIndex, and more
- 📦 Learn about Deployment for production
## Support

- 📚 Documentation
- 💬 Discord
- 🐛 Issues
Start optimizing your AI costs today.