Chat Completions

OpenAI-compatible endpoint with automatic optimization.

    POST /v1/chat/completions

Request

Headers

    Content-Type: application/json
    x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY

Body

    {
      "model": "anthropic/claude-sonnet-4-5-20250929",
      "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, world!"}
      ],
      "max_tokens": 100,
      "temperature": 0.7,
      "stream": false
    }

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ Yes | Model identifier (e.g., anthropic/claude-sonnet-4-5-20250929) |
| messages | array | ✅ Yes | Array of message objects |
| max_tokens | integer | No | Maximum tokens to generate |
| temperature | float | No | Sampling temperature (0-1) |
| stream | boolean | No | Enable streaming responses |
| top_p | float | No | Nucleus sampling threshold |
| stop | string/array | No | Stop sequences |
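
For reference, the same request can be sent without any SDK. The following is a minimal sketch using Python's requests library; it assumes the gateway is reachable at http://localhost:8084 (as in the examples below) and that sk-bf-YOUR_VIRTUAL_KEY is replaced with your own virtual key.

    import requests

    # Headers and body exactly as documented above.
    headers = {
        "Content-Type": "application/json",
        "x-bf-vk": "sk-bf-YOUR_VIRTUAL_KEY",
    }
    payload = {
        "model": "anthropic/claude-sonnet-4-5-20250929",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello, world!"},
        ],
        "max_tokens": 100,
        "temperature": 0.7,
        "stream": False,
    }

    resp = requests.post(
        "http://localhost:8084/v1/chat/completions",
        headers=headers,
        json=payload,
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])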

Response

Standard Response

    {
      "id": "chatcmpl-123",
      "object": "chat.completion",
      "created": 1677652288,
      "model": "anthropic/claude-sonnet-4-5-20250929",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! How can I help you today?"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 9,
        "total_tokens": 19
      }
    }

Response Headers

    HTTP/1.1 200 OK
    Content-Type: application/json
    X-Korad-Original-Tokens: 10
    X-Korad-Optimized-Tokens: 10
    X-Korad-Savings-USD: $0.000000
    X-Korad-Strategy: Passthrough (no optimization needed)
    X-Korad-Billed-Amount: $0.000057
    X-Korad-Profit-Margin: 1.50x
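
These headers can be inspected programmatically to track what the optimizer did. A minimal sketch with Python's requests, assuming the resp object from the raw-HTTP sketch shown after the parameter table and the header names exactly as listed above:

    # Optimization metadata is reported in the X-Korad-* response headers.
    original = resp.headers.get("X-Korad-Original-Tokens")
    optimized = resp.headers.get("X-Korad-Optimized-Tokens")
    savings = resp.headers.get("X-Korad-Savings-USD")
    strategy = resp.headers.get("X-Korad-Strategy")

    print(f"strategy={strategy}, tokens {original} -> {optimized}, saved {savings}")

    # Token accounting is also available in the response body, OpenAI-style.
    usage = resp.json()["usage"]
    print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])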

Examples

Basic Request

    curl -X POST http://localhost:8084/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-sonnet-4-5-20250929",
        "messages": [
          {"role": "user", "content": "Hello!"}
        ],
        "max_tokens": 100
      }'

With Savings Slider

    curl -X POST http://localhost:8084/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "X-Savings-Level: extreme" \
      -H "x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-sonnet-4-5-20250929",
        "messages": [{"role": "user", "content": "...large content..."}]
      }'
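
The savings level is just a request header, so it can also be set through the OpenAI SDKs. A sketch using the Python SDK's default_headers option; extreme is the level shown in the curl example above, and the gateway URL and key placeholder are assumed to match the other examples:

    from openai import OpenAI

    # Attach the X-Savings-Level header to every request made by this client.
    client = OpenAI(
        base_url="http://localhost:8084/v1",
        api_key="sk-bf-YOUR_VIRTUAL_KEY",
        default_headers={"X-Savings-Level": "extreme"},
    )

    response = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "...large content..."}],
    )
    print(response.choices[0].message.content)

The openai Python package also supports a per-call extra_headers argument if you only want to raise the savings level for specific requests.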

Streaming Response

    curl -X POST http://localhost:8084/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "x-bf-vk: sk-bf-YOUR_VIRTUAL_KEY" \
      -d '{
        "model": "anthropic/claude-sonnet-4-5-20250929",
        "messages": [{"role": "user", "content": "Count to 10"}],
        "stream": true
      }'
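
Because the endpoint is OpenAI-compatible, streaming should work with the standard SDK streaming loop. A minimal sketch with the Python SDK, assuming the same client configuration as the SDK example below:

    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8084/v1",
        api_key="sk-bf-YOUR_VIRTUAL_KEY",
    )

    # stream=True yields chunks; each chunk carries an incremental delta.
    stream = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "Count to 10"}],
        stream=True,
    )

    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()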

Python SDK Example

    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8084/v1",
        api_key="sk-bf-YOUR_VIRTUAL_KEY"
    )

    response = client.chat.completions.create(
        model="anthropic/claude-sonnet-4-5-20250929",
        messages=[
            {"role": "system", "content": "You are helpful."},
            {"role": "user", "content": "Explain quantum computing"}
        ],
        max_tokens=500
    )

    print(response.choices[0].message.content)
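
The parsed response object does not expose HTTP headers, so reading the X-Korad-* values through the SDK requires the raw response. A sketch using the openai package's with_raw_response helper; it assumes the gateway returns the headers on SDK requests exactly as shown in the Response Headers section:

    # client is the OpenAI client configured above.
    raw = client.chat.completions.with_raw_response.create(
        model="anthropic/claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "Hello!"}],
    )

    print(raw.headers.get("X-Korad-Strategy"))
    print(raw.headers.get("X-Korad-Savings-USD"))

    completion = raw.parse()  # the usual ChatCompletion object
    print(completion.choices[0].message.content)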

JavaScript SDK Example

    import OpenAI from 'openai';

    const openai = new OpenAI({
      baseURL: 'http://localhost:8084/v1',
      apiKey: 'sk-bf-YOUR_VIRTUAL_KEY',
    });

    const completion = await openai.chat.completions.create({
      model: 'anthropic/claude-sonnet-4-5-20250929',
      messages: [
        { role: 'user', content: 'Hello!' }
      ],
    });

    console.log(completion.choices[0].message.content);

Error Handling

    {
      "error": {
        "message": "Invalid model name",
        "type": "invalid_request_error",
        "param": "model",
        "code": "invalid_model"
      }
    }
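
Errors are returned in the JSON envelope shown above. A minimal handling sketch with Python's requests, reusing the resp object from the earlier raw-HTTP example and assuming error responses carry a non-2xx status code:

    if not resp.ok:
        err = resp.json().get("error", {})
        # Fields follow the envelope shown above.
        print(f"{err.get('type')} ({err.get('code')}): {err.get('message')}")
        if err.get("param"):
            print(f"offending parameter: {err['param']}")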

See Errors for more details.


100% OpenAI-compatible: just change the base URL and API key.