
Error Handling

Understand and handle API errors using OpenAI-compatible error responses.

Error Response Format

All errors are returned in the following format:

{
  "error": {
    "message": "Error message describing the issue",
    "type": "error_type",
    "param": "parameter_name",
    "code": "error_code"
  }
}
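
As a minimal sketch of reading this envelope, the snippet below parses a JSON error body into a small dataclass. The `GatewayError` class and the `parse_error_body` helper are hypothetical names introduced for illustration, not part of any SDK. Every field except `message` is treated as optional, since some examples on this page omit `param`.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewayError:
    """Hypothetical container for the fields of the error envelope."""
    message: str
    type: Optional[str] = None
    param: Optional[str] = None
    code: Optional[str] = None


def parse_error_body(body: str) -> Optional[GatewayError]:
    """Return a GatewayError if the body matches the envelope, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # Not JSON at all (for example, an HTML page from a proxy)
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None  # JSON, but not in the documented error shape
    return GatewayError(
        message=str(error.get("message", "")),
        type=error.get("type"),
        param=error.get("param"),
        code=error.get("code"),
    )
```

Returning `None` for unrecognized bodies lets the caller fall back to the HTTP status code alone.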

Common Errors

Authentication Errors

401 Unauthorized

Invalid or missing Virtual Key.

{
  "error": {
    "message": "Invalid Virtual Key",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

Solution: Check that the x-bf-vk header is set correctly.
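
One way to catch an empty key before it reaches the server is to validate it while building the request. The sketch below uses only the standard library; `build_chat_request` is a hypothetical helper, and the `/chat/completions` path is an assumption based on the OpenAI-compatible base URL used elsewhere on this page.

```python
import json
import urllib.request


def build_chat_request(base_url: str, virtual_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the Virtual Key in the x-bf-vk header."""
    key = (virtual_key or "").strip()
    if not key:
        # Fail locally instead of waiting for a 401 from the server
        raise ValueError("Virtual Key is empty; set it before sending requests")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-bf-vk": key},
        method="POST",
    )
```

The request is only constructed here, not sent; pass it to `urllib.request.urlopen` to perform the call.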

Rate Limit Errors

429 Rate Limit Exceeded

You have exceeded your rate limit or budget.

{
  "error": {
    "message": "Rate limit exceeded: 100 requests/hour",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}

Solution:

  • Implement exponential backoff
  • Check your Virtual Key limits
  • Consider upgrading your plan
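
The exponential backoff mentioned in the list above can be sketched as a generator of wait times. This is an illustrative version with "full jitter" (a random wait between zero and the current ceiling); the `backoff_delays` name and the default `base` and `cap` values are assumptions, not values prescribed by the service.

```python
import random


def backoff_delays(max_retries, base=1.0, cap=60.0, rng=None):
    """Yield one wait time in seconds per retry: capped exponential backoff with full jitter."""
    rng = rng or random.Random()
    for attempt in range(max_retries):
        # Ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter spreads clients out so they do not retry in lockstep
        yield rng.uniform(0, ceiling)
```

A caller would typically loop with `for delay in backoff_delays(3): time.sleep(delay)` before each retry. Passing a seeded `random.Random` makes the schedule reproducible in tests.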

Request Errors

400 Bad Request

Invalid request parameters.

{
  "error": {
    "message": "Invalid model name",
    "type": "invalid_request_error",
    "param": "model",
    "code": "invalid_model"
  }
}

Common causes:

  • Invalid model name (use provider/model-name format)
  • Invalid X-Savings-Level value
  • Missing required fields
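
A quick client-side check can catch a malformed model name before the request is sent. The sketch below only verifies the `provider/model-name` shape described above; `split_model_name` is a hypothetical helper, and it cannot tell whether the server actually recognizes the provider or model.

```python
from typing import Tuple


def split_model_name(model: str) -> Tuple[str, str]:
    """Split 'provider/model-name' into its two parts, or raise ValueError."""
    # partition splits on the first slash only, so any later slashes stay in the model part
    provider, sep, name = model.partition("/")
    if not sep or not provider.strip() or not name.strip():
        raise ValueError(f"Expected 'provider/model-name', got {model!r}")
    return provider, name
```

For example, `split_model_name("anthropic/claude-sonnet-4-5-20250929")` returns the provider and model parts, while a bare `"claude-sonnet"` raises `ValueError`.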

Budget Errors

402 Payment Required

Budget exceeded.

{
  "error": {
    "message": "Budget exceeded: $50.00 limit reached",
    "type": "budget_exceeded_error",
    "code": "budget_exceeded"
  }
}

Solution:

  • Wait for budget reset (monthly/weekly)
  • Increase budget limit
  • Contact administrator

Server Errors

500 Internal Server Error

Unexpected server error.

{
  "error": {
    "message": "Internal server error",
    "type": "server_error",
    "code": "internal_error"
  }
}

Solution: Retry the request after a delay, following the retry pattern under Best Practices.

503 Service Unavailable

Service temporarily unavailable.

{
  "error": {
    "message": "Service temporarily unavailable",
    "type": "server_error",
    "code": "service_unavailable"
  }
}

Solution:

  • Retry after a delay
  • Check status page
  • Implement retry logic

HTTP Status Codes

| Code | Meaning | Description |
|------|---------|-------------|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Invalid or missing API key |
| 402 | Payment Required | Budget exceeded |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 503 | Service Unavailable | Service temporarily down |
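
As a rough sketch, the status codes above can be mapped to a client-side action. The grouping is an interpretation of the solutions listed on this page, not a rule published by the service: treating 500 as retryable in particular is an assumption, and `action_for_status` and its return strings are hypothetical names.

```python
def action_for_status(status_code: int) -> str:
    """Map an HTTP status code from the table above to a suggested client action."""
    if status_code == 200:
        return "ok"
    if status_code in (429, 500, 503):
        # Transient conditions: back off, then try again (500 is an assumption)
        return "retry_with_backoff"
    if status_code in (400, 401):
        # Retrying the identical request will fail the same way
        return "fix_request"
    if status_code == 402:
        # Budget must reset or be raised before requests succeed again
        return "wait_for_budget"
    return "unknown"
```

Keeping the mapping in one function makes it easy to adjust if a deployment treats a code differently.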

Error Types

| Type | Description |
|------|-------------|
| invalid_request_error | Invalid request parameters |
| invalid_api_key | Invalid or missing API key |
| rate_limit_error | Rate limit exceeded |
| budget_exceeded_error | Budget limit reached |
| server_error | Server-side error |
| timeout_error | Request timeout |

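Best Practices

1. Implement Retry Logic

import time

import openai
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8084/v1",
    api_key="sk-bf-YOUR_VIRTUAL_KEY",
)

# Only transient failures are worth retrying: 429, 5xx, and connection problems.
# Errors such as 400 or 401 fail the same way on every attempt.
RETRYABLE = (
    openai.RateLimitError,
    openai.InternalServerError,
    openai.APIConnectionError,
)

def create_completion_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="anthropic/claude-sonnet-4-5-20250929",
                messages=messages,
                max_tokens=500,
            )
        except RETRYABLE:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...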

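2. Check Error Headers

# with_raw_response exposes the HTTP headers; parse() returns the usual object
raw = client.chat.completions.with_raw_response.create(...)
response = raw.parse()

# Check rate limit headers
rate_limit_remaining = raw.headers.get('X-RateLimit-Remaining')
rate_limit_reset = raw.headers.get('X-RateLimit-Reset')

# The header may be absent, so guard before converting to int
if rate_limit_remaining is not None and int(rate_limit_remaining) < 10:
    # Slow down requests
    time.sleep(1)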
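
In case the rate limit headers are absent or malformed on a given response, a defensive reader avoids a crash in the throttling path. The sketch below works on any mapping of header names to strings; `remaining_requests` and `throttle_delay` are hypothetical helpers, and the threshold of 10 and pause of one second simply mirror the values used above.

```python
from typing import Mapping, Optional


def remaining_requests(headers: Mapping[str, str]) -> Optional[int]:
    """Return X-RateLimit-Remaining as an int, or None if missing or not a number."""
    raw = headers.get("X-RateLimit-Remaining")
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


def throttle_delay(headers: Mapping[str, str], threshold: int = 10, pause: float = 1.0) -> float:
    """Return how long to pause (seconds) before the next request; 0.0 means no pause."""
    remaining = remaining_requests(headers)
    if remaining is not None and remaining < threshold:
        return pause
    return 0.0
```

A plain `dict` is case-sensitive, so the header name must match exactly; header objects from HTTP libraries are usually case-insensitive.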

3. Handle Specific Errors

import time

import openai

try:
    response = client.chat.completions.create(...)
except openai.APIStatusError as e:
    # "code" field from the error body, if the server sent one
    error_code = getattr(e, 'code', None)

    if error_code == 'rate_limit_exceeded':
        # Back off before the caller tries again
        time.sleep(60)
    elif error_code == 'budget_exceeded':
        # Notify user
        print("Budget exceeded. Please wait for reset.")
    else:
        raise

4. Monitor Savings Headers

# with_raw_response exposes the HTTP headers; parse() returns the usual object
raw = client.chat.completions.with_raw_response.create(...)
response = raw.parse()

# Log optimization metrics
original_tokens = raw.headers.get('X-Korad-Original-Tokens')
optimized_tokens = raw.headers.get('X-Korad-Optimized-Tokens')
savings = raw.headers.get('X-Korad-Savings-USD')

print(f"Saved ${savings} ({original_tokens} -> {optimized_tokens} tokens)")
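
The two token headers can also be turned into a percentage for dashboards or logs. This is a small sketch over any mapping of header names to strings; `token_savings_percent` is a hypothetical helper, and it assumes both headers carry plain integer token counts.

```python
from typing import Mapping, Optional


def token_savings_percent(headers: Mapping[str, str]) -> Optional[float]:
    """Return the percentage of tokens saved, or None if the headers are unusable."""
    try:
        original = int(headers["X-Korad-Original-Tokens"])
        optimized = int(headers["X-Korad-Optimized-Tokens"])
    except (KeyError, ValueError, TypeError):
        return None  # Header missing or not an integer
    if original <= 0:
        return None  # Avoid dividing by zero
    return 100.0 * (original - optimized) / original
```

Returning `None` instead of raising keeps logging code from failing on responses that were not optimized.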

Handle errors gracefully for a better user experience.