Provider Setup Guide (BYOK)
Behest supports Bring Your Own Key (BYOK) — you connect your own provider API keys, and Behest routes your inference requests using those keys. Behest never marks up your LLM costs; you pay the provider directly at their published rates.
BYOK is available on Pro, Business+, and Enterprise plans. Free-plan accounts use the platform's shared Gemini key.
Supported Providers
| Provider | Auth method | Key format | LiteLLM prefix |
|---|---|---|---|
| OpenAI | Bearer API key | sk-... or sk-proj-... | none (native) |
| Anthropic | x-api-key header | sk-ant-... | anthropic/ |
| Google Gemini | Query-param API key | AIza... (39 chars) | gemini/ |
| Mistral | Bearer API key | 10+ chars (no fixed prefix) | mistral/ |
| Cohere | Bearer API key | 10+ chars (no fixed prefix) | cohere/ |
| OpenRouter | Bearer API key | sk-or-v1-[64 hex chars] | openrouter/ |
AWS Bedrock and Azure OpenAI are on the integration roadmap. Contact support if you need early access.
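Per the table above, routing reduces to prepending the provider prefix to the model ID before handing the request to LiteLLM. A minimal sketch (the mapping mirrors the table; names are illustrative, not Behest's internals):

```python
# Prefix per provider, as listed in the table above (OpenAI is native, no prefix).
LITELLM_PREFIX = {
    "openai": "",
    "anthropic": "anthropic/",
    "gemini": "gemini/",
    "mistral": "mistral/",
    "cohere": "cohere/",
    "openrouter": "openrouter/",
}

def litellm_model(provider_type: str, model_id: str) -> str:
    """Build the model string LiteLLM expects for a given provider."""
    return LITELLM_PREFIX[provider_type] + model_id
```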
How Key Storage Works
Every provider key is encrypted with AES-256-GCM before it touches the database. The ciphertext format is:
{iv_hex}:{ciphertext_hex}:{auth_tag_hex}
- IV: 12 bytes (96 bits), randomly generated per key using the OS CSPRNG
- Auth tag: 16 bytes (128 bits, GCM default) — any tampering is detected at decryption time
- Encryption key: loaded from the `PROVIDER_ENCRYPTION_KEY` environment variable (32 bytes / 64 hex chars) at service startup; fails closed if missing or malformed
The plaintext key is never logged, never returned in any API response, and never stored. The only time it exists in memory is during the encryption write path (milliseconds) and during decryption in custom_auth.py for a single request. After decryption, the plaintext reference is immediately deleted.
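The sealed format can be reproduced with any AES-256-GCM implementation. A sketch using the third-party `cryptography` package (function names are illustrative, not Behest's internals):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def seal_key(plaintext: str, master_key_hex: str) -> str:
    """Encrypt an API key into the {iv_hex}:{ciphertext_hex}:{auth_tag_hex} format."""
    key = bytes.fromhex(master_key_hex)      # 32 bytes / 64 hex chars
    iv = os.urandom(12)                      # 96-bit IV from the OS CSPRNG
    sealed = AESGCM(key).encrypt(iv, plaintext.encode(), None)
    ct, tag = sealed[:-16], sealed[-16:]     # the library appends the 16-byte GCM tag
    return f"{iv.hex()}:{ct.hex()}:{tag.hex()}"

def unseal_key(blob: str, master_key_hex: str) -> str:
    """Decrypt; raises InvalidTag if any part of the blob was tampered with."""
    iv_hex, ct_hex, tag_hex = blob.split(":")
    sealed = bytes.fromhex(ct_hex) + bytes.fromhex(tag_hex)
    return AESGCM(bytes.fromhex(master_key_hex)).decrypt(
        bytes.fromhex(iv_hex), sealed, None
    ).decode()
```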
Keys are stored in the tenant_provider_keys table:
- One row per provider per tenant (`UNIQUE(tenant_id, provider_type)`)
- `api_key_enc` — the AES-256-GCM ciphertext (never returned in API responses)
- `key_last4` — last 4 characters of the plaintext at write time, for display only (e.g. `...xK9z`)
- `key_set_at` — timestamp of the most recent save or rotation
Redis stores a copy of the ciphertext at provider:{tenantId}:{providerType}:api_key_enc with a 24-hour TTL, refreshed hourly by the redis-sync-worker.
Adding a Provider Key
Dashboard
- Go to Account Settings → Provider Keys
- Select the provider from the dropdown
- Paste your API key into the field
- Click Save — Behest validates the key against the provider API before storing it
API
```
PUT /v1/tenants/:tenantId/providers/:providerType
Authorization: Bearer <service-JWT>
Content-Type: application/json

{
  "api_key": "sk-proj-..."
}
```

Response (200):

```json
{
  "configured": true,
  "provider_type": "openai",
  "key_last4": "k9Zq",
  "key_set_at": "2026-03-30T12:00:00.000Z"
}
```

The `api_key` field is never echoed back; the response confirms only the last 4 characters.
Error codes:
| HTTP | Code | Meaning |
|---|---|---|
| 400 | INVALID_KEY_FORMAT | Key does not match the provider's expected format |
| 403 | BYOK_REQUIRES_PRO | Your plan does not include BYOK |
| 422 | KEY_VALIDATION_FAILED | Key was rejected by the provider API |
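A client-side pre-check mirroring `INVALID_KEY_FORMAT` can be sketched from the formats documented in this guide (illustrative patterns, not the server's actual validation):

```python
import re

# Shapes as documented per provider; Gemini keys also contain '-' and '_' in practice.
KEY_FORMAT = {
    "openai":     re.compile(r"^sk-(proj-|svcacct-)?\S{20,}$"),
    "anthropic":  re.compile(r"^sk-ant-\S{20,}$"),
    "gemini":     re.compile(r"^AIza[A-Za-z0-9_-]{35}$"),
    "mistral":    re.compile(r"^.{10,}$"),
    "cohere":     re.compile(r"^.{10,}$"),
    "openrouter": re.compile(r"^sk-or-v1-[0-9a-f]{64}$"),
}

def key_format_ok(provider_type: str, api_key: str) -> bool:
    """Cheap shape check before the live validation call against the provider."""
    return bool(KEY_FORMAT[provider_type].fullmatch(api_key))
```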
Validation behavior: Behest performs a live validation call to the provider API (5-second timeout). If the provider API is unreachable or times out, the key is treated as valid and stored (fail-open). This prevents a provider outage from blocking your key rotation. A 401 from the provider is the only signal that definitively marks a key as invalid.
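The fail-open rule amounts to: only a definitive 401 rejects the key. A sketch of the decision (hypothetical helper, not Behest's code):

```python
from typing import Optional

def key_is_valid(status: Optional[int]) -> bool:
    """Interpret a live-validation result; status=None means the provider
    API was unreachable or the 5-second timeout fired."""
    if status is None:
        return True          # fail open: a provider outage must not block rotation
    return status != 401     # 403/429/529 still count as valid-but-restricted
```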
Removing a Provider Key
Dashboard
Go to Account Settings → Provider Keys → click Remove next to the provider.
API
```
DELETE /v1/tenants/:tenantId/providers/:providerType
Authorization: Bearer <service-JWT>
```

Response: 204 No Content
Removal takes effect immediately — the Redis entry is deleted atomically with the database row. Projects that had this provider's models configured will silently fall back to the platform default (Gemini 2.5 Flash) on the next request. Their provider_model setting is preserved in the database (intent is kept); it just has no key to activate against.
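The fallback decision on the request path can be sketched as follows (the default model is from this guide; the helper name is hypothetical):

```python
from typing import Optional

PLATFORM_DEFAULT = "gemini/gemini-2.5-flash"   # platform key, per this guide

def effective_model(provider_model: Optional[str], key_configured: bool) -> str:
    """The preserved provider_model only activates while the tenant still has
    a key for that provider; otherwise fall back silently to the default."""
    if provider_model and key_configured:
        return provider_model
    return PLATFORM_DEFAULT
```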
Listing Configured Keys
```
GET /v1/tenants/:tenantId/providers
Authorization: Bearer <service-JWT>
```

Response:

```json
{
  "providers": [
    {
      "provider_type": "openai",
      "key_last4": "k9Zq",
      "key_set_at": "2026-03-30T12:00:00.000Z",
      "projects_using_count": 3
    }
  ]
}
```

Provider-Specific Notes
OpenAI
Get a key: platform.openai.com/api-keys
Key formats accepted:
- `sk-[20+ chars]` — legacy user keys
- `sk-proj-[20+ chars]` — project-scoped keys (recommended)
- `sk-svcacct-[20+ chars]` — service account keys
Validation endpoint: GET https://api.openai.com/v1/models with Authorization: Bearer {key}. A 401 means invalid; 403 and 429 mean the key is valid but rate-limited or restricted.
Supported models (selected):
| Model ID | Context | Streaming | Vision | Tool use |
|---|---|---|---|---|
| gpt-4o | 128K | Yes | Yes | Yes |
| gpt-4o-mini | 128K | Yes | Yes | Yes |
| gpt-4.1 | 1M | Yes | Yes | Yes |
| gpt-4.1-mini | 1M | Yes | No | Yes |
| o3 | 200K | Yes | No | No |
| o3-mini | 200K | Yes | No | No |
| o4-mini | 200K | Yes | No | Yes |
Rate limits depend on your OpenAI tier (Tier 1 through Tier 5, based on cumulative spend). New keys start at Tier 1 (500 RPM, 200K TPM). See platform.openai.com/docs/guides/rate-limits.
Anthropic
Get a key: console.anthropic.com/settings/keys
Key format: sk-ant-[20+ chars]
Validation endpoint: GET https://api.anthropic.com/v1/models with x-api-key: {key} and anthropic-version: 2023-06-01. A 401 is invalid; 403 and 529 mean valid but restricted.
Supported models:
| Model ID | Context | Streaming | Vision | Tool use |
|---|---|---|---|---|
| claude-opus-4-20250514 | 200K | Yes | Yes | Yes |
| claude-sonnet-4-20250514 | 200K | Yes | Yes | Yes |
| claude-haiku-4-5-20251001 | 200K | Yes | Yes | Yes |
| claude-3-5-haiku-20241022 | 200K | Yes | Yes | Yes |
LiteLLM prepends anthropic/ to the model ID before forwarding (e.g., claude-sonnet-4-20250514 becomes anthropic/claude-sonnet-4-20250514).
Note: Anthropic does not offer embedding models. Requests for embeddings must use a different provider.
Google Gemini
Get a key: aistudio.google.com/app/apikey
Key format: AIza[35 alphanumeric chars] (exactly 39 characters)
Validation endpoint: GET https://generativelanguage.googleapis.com/v1beta/models?key={key}. A 400 or 403 means invalid; 429 means valid but rate-limited.
Supported models:
| Model ID | Context | Streaming | Vision | Tool use |
|---|---|---|---|---|
| gemini-2.5-pro | 1M | Yes | Yes | Yes |
| gemini-2.5-flash | 1M | Yes | Yes | Yes |
| gemini-2.0-flash | 1M | Yes | Yes | Yes |
| gemini-1.5-pro | 1M | Yes | Yes | Yes |
LiteLLM prepends gemini/ to the model ID (e.g., gemini-2.5-flash becomes gemini/gemini-2.5-flash).
The Google AI Studio free tier offers 1,500 requests/day and 1M TPM — useful for development. Pay-as-you-go unlocks 2,000 RPM.
Note: Google is also the platform default provider. If no BYOK key is configured for a project, Behest routes to gemini-2.5-flash using the platform's own key.
Mistral
Get a key: console.mistral.ai/api-keys
Key format: Any string of 10 or more characters. Mistral does not use a predictable prefix — Behest validates only via live API call.
Validation endpoint: GET https://api.mistral.ai/v1/models with Authorization: Bearer {key}. A 401 means invalid.
Supported models:
| Model ID | Context | Streaming | Vision | Tool use |
|---|---|---|---|---|
| mistral-large-latest | 128K | Yes | Yes | Yes |
| mistral-small-latest | 128K | Yes | No | Yes |
| codestral-latest | 256K | Yes | No | Yes |
| pixtral-large-latest | 128K | Yes | Yes | Yes |
LiteLLM prepends mistral/ to the model ID. Mistral is a European provider (Paris); EU-based tenants may prefer it for data residency considerations.
Cohere
Get a key: dashboard.cohere.com/api-keys
Key format: Any string of 10 or more characters. Cohere keys are validated only via live API call.
Validation endpoint: GET https://api.cohere.com/v1/models with Authorization: Bearer {key}. A 401 or 403 means invalid.
Supported models:
| Model ID | Context | Streaming | Vision | Tool use |
|---|---|---|---|---|
| command-r-plus-08-2024 | 128K | Yes | No | Yes |
| command-r-08-2024 | 128K | Yes | No | Yes |
| command-light | 4K | Yes | No | No |
LiteLLM prepends cohere/ to the model ID. Note: Cohere does not support vision (image) inputs.
OpenRouter
Get a key: openrouter.ai/keys
Key format: sk-or-v1-[64 lowercase hex chars] (73 characters total: a 9-character prefix plus 64 hex digits)
Validation endpoint: GET https://openrouter.ai/api/v1/auth/key with Authorization: Bearer {key}.
OpenRouter is a meta-router that provides access to hundreds of models from multiple providers under a single key. When using OpenRouter, specify model IDs in org/model format (e.g., openai/gpt-4o, anthropic/claude-sonnet-4-20250514).
LiteLLM prepends openrouter/ to the model ID. Behest adds the required HTTP-Referer: https://behest.ai and X-Title: Behest headers automatically.
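The attribution headers can be sketched as follows (values as documented above; the helper name is hypothetical):

```python
def openrouter_headers(api_key: str) -> dict:
    """Headers for an OpenRouter request: bearer auth plus the attribution
    headers Behest sends automatically."""
    return {
        "Authorization": f"Bearer {api_key}",
        "HTTP-Referer": "https://behest.ai",
        "X-Title": "Behest",
    }
```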
Selected OpenRouter models:
| Model ID | Provider |
|---|---|
| openai/gpt-4o | OpenAI via OpenRouter |
| anthropic/claude-sonnet-4-20250514 | Anthropic via OpenRouter |
| google/gemini-2.5-flash | Google via OpenRouter |
| deepseek/deepseek-r1 | DeepSeek |
| x-ai/grok-3 | xAI |
| meta-llama/llama-3.3-70b-instruct | Meta via OpenRouter |
Key Rotation
To rotate a key, simply PUT a new key to the same endpoint. Behest performs an upsert (ON CONFLICT DO UPDATE) — the existing ciphertext is replaced atomically. The Redis entry is updated immediately. There is no downtime window; requests continue using the old key until the Redis write completes (typically under 1ms).
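The rotation upsert can be illustrated with SQLite's equivalent `ON CONFLICT DO UPDATE` clause (schema reduced to the columns named in this guide; a sketch, not Behest's actual migration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tenant_provider_keys (
        tenant_id     TEXT NOT NULL,
        provider_type TEXT NOT NULL,
        api_key_enc   TEXT NOT NULL,
        key_last4     TEXT NOT NULL,
        key_set_at    TEXT NOT NULL,
        UNIQUE (tenant_id, provider_type)
    )
""")

def upsert_key(tenant_id, provider_type, api_key_enc, key_last4, key_set_at):
    """Rotation is a plain upsert: the existing ciphertext row is replaced atomically."""
    conn.execute("""
        INSERT INTO tenant_provider_keys VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (tenant_id, provider_type) DO UPDATE SET
            api_key_enc = excluded.api_key_enc,
            key_last4   = excluded.key_last4,
            key_set_at  = excluded.key_set_at
    """, (tenant_id, provider_type, api_key_enc, key_last4, key_set_at))
```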
Testing a Key Before Deploying
Use the test-token endpoint to issue a short-lived (5-minute) JWT that exercises the BYOK path without touching your production deploy:
```
POST /v1/projects/:projectId/settings/test-token
Authorization: Bearer <service-JWT>
```

This writes draft config keys (`draft:config:{pid}:*`) to Redis with a 300-second TTL. Requests using the draft token read from these keys, keeping test traffic isolated from production settings.