Error Handling
Understand and handle API errors OpenAI-compatible error responses.
Error Response Format​
All errors return in the following format:
{
"error": {
"message": "Error message describing the issue",
"type": "error_type",
"param": "parameter_name",
"code": "error_code"
}
}
Common Errors​
Authentication Errors​
401 Unauthorized​
Invalid or missing Virtual Key.
{
"error": {
"message": "Invalid Virtual Key",
"type": "invalid_request_error",
"code": "invalid_api_key"
}
}
Solution: Check that x-bf-vk header is set correctly.
Rate Limit Errors​
429 Rate Limit Exceeded​
Exceeded your rate limit or budget.
{
"error": {
"message": "Rate limit exceeded: 100 requests/hour",
"type": "rate_limit_error",
"code": "rate_limit_exceeded"
}
}
Solution:
- Implement exponential backoff
- Check your Virtual Key limits
- Consider upgrading your plan
Request Errors​
400 Bad Request​
Invalid request parameters.
{
"error": {
"message": "Invalid model name",
"type": "invalid_request_error",
"param": "model",
"code": "invalid_model"
}
}
Common causes:
- Invalid model name (use
provider/model-nameformat) - Invalid
X-Savings-Levelvalue - Missing required fields
Budget Errors​
402 Payment Required​
Budget exceeded.
{
"error": {
"message": "Budget exceeded: $50.00 limit reached",
"type": "budget_exceeded_error",
"code": "budget_exceeded"
}
}
Solution:
- Wait for budget reset (monthly/weekly)
- Increase budget limit
- Contact administrator
Server Errors​
500 Internal Server Error​
Unexpected server error.
{
"error": {
"message": "Internal server error",
"type": "server_error",
"code": "internal_error"
}
}
Solution:
- Retry with exponential backoff
- Check status page: https://status.korad.ai
- Contact support if persistent
503 Service Unavailable​
Service temporarily unavailable.
{
"error": {
"message": "Service temporarily unavailable",
"type": "server_error",
"code": "service_unavailable"
}
}
Solution:
- Retry after a delay
- Check status page
- Implement retry logic
HTTP Status Codes​
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Invalid or missing API key |
| 402 | Payment Required | Budget exceeded |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service temporarily down |
Error Types​
| Type | Description |
|---|---|
invalid_request_error | Invalid request parameters |
invalid_api_key | Invalid or missing API key |
rate_limit_error | Rate limit exceeded |
budget_exceeded_error | Budget limit reached |
server_error | Server-side error |
timeout_error | Request timeout |
Best Practices​
1. Implement Retry Logic​
import time
from openai import OpenAI
client = OpenAI(
base_url="http://localhost:8084/v1",
api_key="sk-bf-YOUR_VIRTUAL_KEY"
)
def create_completion_with_retry(messages, max_retries=3):
for attempt in range(max_retries):
try:
return client.chat.completions.create(
model="anthropic/claude-sonnet-4-5-20250929",
messages=messages,
max_tokens=500
)
except Exception as e:
if attempt < max_retries - 1:
time.sleep(2 ** attempt) # Exponential backoff
else:
raise
2. Check Error Headers​
response = client.chat.completions.create(...)
# Check rate limit headers
rate_limit_remaining = response.response_headers.get('X-RateLimit-Remaining')
rate_limit_reset = response.response_headers.get('X-RateLimit-Reset')
if int(rate_limit_remaining) < 10:
# Slow down requests
time.sleep(1)
3. Handle Specific Errors​
try:
response = client.chat.completions.create(...)
except Exception as e:
error_code = getattr(e, 'code', None)
if error_code == 'rate_limit_exceeded':
# Implement backoff
time.sleep(60)
elif error_code == 'budget_exceeded':
# Notify user
print("Budget exceeded. Please wait for reset.")
else:
raise
4. Monitor Savings Headers​
response = client.chat.completions.create(...)
# Log optimization metrics
original_tokens = response.response_headers.get('X-Korad-Original-Tokens')
optimized_tokens = response.response_headers.get('X-Korad-Optimized-Tokens')
savings = response.response_headers.get('X-Korad-Savings-USD')
print(f"Saved ${savings} ({original_tokens} -> {optimized_tokens} tokens)")
Handle errors gracefully for a better user experience.