Bring Your Own Key (BYOK)
What Is BYOK?
BYOK (Bring Your Own Key) means you supply your own API keys for OpenAI, Anthropic, Google, Mistral, Cohere, or OpenRouter. When your application makes an inference request through Behest, your key is used to call the provider directly. Behest never acts as an intermediary on the token billing side — you pay the provider at their published rates.
Why BYOK matters:
- Zero LLM markup — Behest charges a flat subscription fee. Your LLM tokens cost exactly what the provider charges, nothing more.
- Your rate limits — requests run against your provider account's rate limits, not shared pool limits.
- Your billing relationship — usage appears in your provider account for cost attribution, chargebacks, and auditing.
- Model flexibility — access any model in your provider account, not just a curated subset.
- Data handling terms — your data is subject to your provider agreement, not Behest's.
BYOK is available on Pro, Business+, and Enterprise plans.
Architectural Overview
Your App
│
│ POST /v1/chat/completions
│ Authorization: Bearer <Behest JWT>
│ Body: { model: "gpt-4o", messages: [...] }
│
▼
Kong Gateway
│ Validates Behest JWT (RS256)
│ Enforces RPM rate limits (Redis INCR)
│ Checks kill switches
│ Injects X-Tenant-Id, X-Project-Id headers
│
▼
LiteLLM (custom_auth.py — auth_from_headers)
│
│ 1. Read config:{pid}:provider_model from Redis
│      → "gpt-4o"
│
│ 2. Infer provider from model ID
│      → "openai" (from routing:model_to_provider Redis hash
│         or fallback hardcoded mapping)
│
│ 3. Read provider:{tid}:{providerType}:api_key_enc from Redis
│      → AES-256-GCM ciphertext
│
│ 4. decrypt_provider_key(ciphertext)
│      → plaintext key (exists only in this narrow scope)
│
│ 5. Inject { api_key: plaintext } into LiteLLM params
│      del plaintext ← immediately deleted
│
▼
OpenAI API (or Anthropic, Google, etc.)
Authorization: Bearer <your-key>
The plaintext variable exists only for the duration of the auth_from_headers call. The key is handed to LiteLLM's pre-call hook via metadata["_byoak_inject"], and the hook pops that entry before the metadata is persisted to usage logs. The key is never written to a log file, database, span attribute, or response body.
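The five steps in the diagram, plus the pop in the pre-call hook, can be sketched roughly as follows. This is an illustrative outline, not the contents of custom_auth.py: a plain dict stands in for Redis, the decrypt function is passed in by the caller, and the function names, the fallback mapping, and the shape of the metadata dict are all assumptions.

```python
from typing import Callable, Optional

# Hypothetical fallback, used only when the routing hash has no entry.
FALLBACK_MODEL_TO_PROVIDER = {"gpt-4o": "openai"}


def resolve_provider_key(
    store: dict,
    tenant_id: str,
    project_id: str,
    decrypt: Callable[[str], str],
) -> dict:
    """Follow steps 1-5 of the diagram against a dict that stands in for Redis."""
    # 1. Deployed model for the project.
    model = store[f"config:{project_id}:provider_model"]

    # 2. Provider from the routing hash, else the fallback mapping.
    routing = store.get("routing:model_to_provider", {})
    provider = routing.get(model) or FALLBACK_MODEL_TO_PROVIDER[model]

    # 3. Tenant-scoped ciphertext.
    ciphertext = store[f"provider:{tenant_id}:{provider}:api_key_enc"]

    # 4-5. Decrypt, place the key in the inject slot, drop the local name.
    plaintext = decrypt(ciphertext)
    metadata = {"model": model, "_byoak_inject": {"api_key": plaintext}}
    del plaintext
    return metadata


def pop_injected_key(metadata: dict) -> Optional[dict]:
    """Pre-call hook side: remove the inject slot so it is never persisted."""
    return metadata.pop("_byoak_inject", None)
```

After pop_injected_key runs, the metadata dict that would go to usage logs no longer contains the key, which is the property the paragraph above describes.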
Security Model
Encryption
Keys are encrypted with AES-256-GCM using a 32-byte key loaded from the PROVIDER_ENCRYPTION_KEY environment variable. The service fails closed at startup if this variable is missing, empty, or not a valid 64-character hex string.
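A fail-closed check of this kind might look like the sketch below. The function name and the error type are assumptions; only the variable name and the 64-hex-character rule come from the text.

```python
import os
import re

_HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def load_encryption_key(env=os.environ) -> bytes:
    """Return the 32-byte AES key, or raise so the service refuses to start."""
    raw = env.get("PROVIDER_ENCRYPTION_KEY", "")
    if not _HEX_64.fullmatch(raw):
        # Covers missing, empty, wrong-length, and non-hex values alike.
        raise RuntimeError("PROVIDER_ENCRYPTION_KEY must be a 64-character hex string")
    return bytes.fromhex(raw)
```

Calling this once during startup, before any request handler is registered, means a misconfigured deployment stops at boot instead of failing on the first decrypt.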
Ciphertext format:
{iv_hex}:{ciphertext_hex}:{auth_tag_hex}
- IV: 12 bytes (96 bits), randomly generated per key using the OS CSPRNG (crypto.randomBytes). IVs are never reused.
- Auth tag: 16 bytes (128 bits). GCM's authentication tag detects any ciphertext tampering; decryption throws if the tag does not match.
- Key length: 32 bytes (256-bit AES). Loaded from PROVIDER_ENCRYPTION_KEY (64 hex chars).
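To make the stored format concrete, here is a small parser that splits the three hex fields and checks the sizes listed above. It is a sketch using only the standard library; the names EncryptedKey and parse_encrypted_key are hypothetical, and the actual decryption step is left out.

```python
from typing import NamedTuple

IV_BYTES = 12   # 96-bit IV
TAG_BYTES = 16  # 128-bit GCM auth tag


class EncryptedKey(NamedTuple):
    iv: bytes
    ciphertext: bytes
    auth_tag: bytes


def parse_encrypted_key(value: str) -> EncryptedKey:
    """Split '{iv_hex}:{ciphertext_hex}:{auth_tag_hex}' and check field sizes."""
    parts = value.split(":")
    if len(parts) != 3:
        raise ValueError("expected three colon-separated hex fields")
    # bytes.fromhex raises ValueError if a field contains non-hex characters.
    iv, ciphertext, auth_tag = (bytes.fromhex(p) for p in parts)
    if len(iv) != IV_BYTES:
        raise ValueError("IV must be 12 bytes")
    if len(auth_tag) != TAG_BYTES:
        raise ValueError("auth tag must be 16 bytes")
    if not ciphertext:
        raise ValueError("ciphertext is empty")
    return EncryptedKey(iv, ciphertext, auth_tag)
```

The parsed fields would then go to an AES-256-GCM implementation. Note that some libraries expect the tag appended to the ciphertext, not passed as a separate argument, so the fields may need to be concatenated first.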
In production (GCP/GKE), PROVIDER_ENCRYPTION_KEY is stored in GCP Secret Manager and injected as a Kubernetes secret. It is never in source code or in the container image.
Tenant Isolation
Redis key format: provider:{tenantId}:{providerType}:api_key_enc
The tenantId is a UUID validated by a strict regex (/^[0-9a-f]{8}-[0-9a-f]{4}-...$/) before any Redis lookup. A crafted non-UUID value is rejected with HTTP 400 before Redis is touched, preventing cross-tenant key injection.
Kong validates the Behest JWT (RS256) before the request reaches LiteLLM, extracting a verified tid claim. Client-supplied X-Tenant-Id headers are stripped by Kong — only Kong-injected headers are trusted.
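The validate-then-build order can be illustrated as below. The pattern quoted in the text is abbreviated, so the regex here is an assumed generic 8-4-4-4-12 lowercase-hex layout and not a copy of the real one; BadTenantId and provider_key_name are hypothetical names.

```python
import re

# Assumed pattern: lowercase hex in the usual 8-4-4-4-12 UUID layout.
_TENANT_ID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


class BadTenantId(ValueError):
    """Hypothetical error that a handler would map to HTTP 400."""


def provider_key_name(tenant_id: str, provider_type: str) -> str:
    """Build the Redis key only after the tenant ID passes validation."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise BadTenantId("tenantId is not a UUID")
    return f"provider:{tenant_id}:{provider_type}:api_key_enc"
```

Because the check uses a full match, a value that smuggles in extra colons or wildcard characters never reaches the key template.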
Key Visibility
- The api_key_enc column is never returned in any API response (enforced by explicit field selection: api_key_enc is excluded from every SELECT).
- Only key_last4 (last 4 characters) is returned to the UI for identification.
- The PUT request body field api_key is immediately deleted from the parsed body object after extraction to prevent it from appearing in request logs.
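These three rules might translate into helpers like the following. This is a sketch with assumed names; the provider_type field in PUBLIC_FIELDS is a hypothetical column, while key_last4, api_key_enc, and api_key come from the text.

```python
# Allow-list of fields that may leave the service. provider_type is a
# hypothetical column name; key_last4 is the one named in the text.
PUBLIC_FIELDS = ("provider_type", "key_last4")


def extract_api_key(body: dict) -> str:
    """Take api_key out of the parsed body so later logging cannot see it."""
    return body.pop("api_key")


def last4(api_key: str) -> str:
    """Value stored as key_last4 for display in the UI."""
    return api_key[-4:]


def public_view(row: dict) -> dict:
    """Copy only allow-listed fields; api_key_enc is never selected."""
    return {name: row[name] for name in PUBLIC_FIELDS}
```

An allow-list is used here in place of deleting api_key_enc from a full row, so a newly added sensitive column stays hidden by default.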
Production Secret Management
In GKE production, the following secrets are managed via GCP Secret Manager and mounted as Kubernetes Secrets:
- PROVIDER_ENCRYPTION_KEY: AES-256-GCM key for provider key encryption
- Provider keys themselves never appear in environment variables. They are stored encrypted in PostgreSQL and cached encrypted in Redis.
Provider Key Lifecycle
Adding a Key
When you submit a key via PUT /v1/tenants/:tenantId/providers/:providerType:
- Format validation: the key is checked against the provider's regex (e.g., OpenAI keys must match sk-(proj-|svcacct-)?[A-Za-z0-9_-]{20,}).
- Live validation: a real API call to the provider (5s timeout) confirms the key is active.
- Encryption: encryptProviderKey() generates a fresh IV and encrypts with AES-256-GCM.
- Database upsert: stored in tenant_provider_keys with ON CONFLICT DO UPDATE (rotation is the same operation as the initial add).
- Redis write: provider:{tid}:{providerType}:api_key_enc is set with a 24-hour TTL (refreshed hourly by redis-sync-worker).
- Redis membership: SADD provider:{tid}:enabled_providers {providerType}
- Model discovery: a fire-and-forget background job calls the provider's model list endpoint and populates tenant_provider_models.
The key takes effect immediately (no deploy step required). Provider keys are tenant-level, not project-level.
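The two Redis steps can be written out as data to show the exact key names and TTL. The sketch returns the commands as tuples without executing them, so it makes no assumption about which Redis client is used; the function name is hypothetical.

```python
KEY_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL from the text


def redis_writes(tenant_id: str, provider_type: str, ciphertext: str) -> list:
    """Return the Redis write and membership commands as tuples."""
    return [
        # Cache the ciphertext (never the plaintext) with an expiry.
        ("SET", f"provider:{tenant_id}:{provider_type}:api_key_enc",
         ciphertext, "EX", KEY_TTL_SECONDS),
        # Mark the provider as enabled for this tenant.
        ("SADD", f"provider:{tenant_id}:enabled_providers", provider_type),
    ]
```

Both keys are scoped by tenant ID only, which matches the statement that provider keys are tenant-level, not project-level.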
Rotating a Key
Send a new PUT with the new key. The upsert replaces the existing ciphertext and key_last4. Redis is updated atomically. The old key is gone from Behest's storage the moment the write completes.
Revoking a Key
DELETE /v1/tenants/:tenantId/providers/:providerType removes the database row and deletes both Redis keys (api_key_enc and api_base) atomically in a single pipeline. It takes effect immediately.
Projects configured to use this provider's models silently fall back to gemini-2.5-flash (platform default) on the next request. Their provider_model database setting is preserved.
What Happens When a Key Expires or Is Invalid
At request time, if custom_auth.py loads the ciphertext from Redis and decryption fails (tampered ciphertext, wrong key), it returns HTTP 500 to the client without revealing any key material. The error is logged with the project ID and provider type.
If the provider rejects the key (e.g., you revoked it at the provider's dashboard), LiteLLM receives a 401 from the provider API and returns an appropriate error to your application. The stale key remains in Behest's storage until you rotate or remove it.
If Redis has no ciphertext (e.g., Redis restarted before the hourly redis-sync-worker refresh), custom_auth.py falls back to the platform default instead of returning an error. The redis-sync-worker restores all provider key ciphertexts from PostgreSQL within its next cycle (up to 1 hour + 0–10% jitter).
Clear error for misconfigured projects: If a project is configured for a model but no key exists for that provider, custom_auth.py raises:
{ "detail": "Configure your openai API key in Provider Settings to use gpt-4o" }
The response is HTTP 400; there is never a silent fallback in this case.
Testing Before Deploying
The test-token flow lets you validate BYOK end-to-end before affecting production:
POST /v1/projects/:projectId/settings/test-token
Authorization: Bearer <service-JWT>
This returns a short-lived JWT (5-minute TTL). Requests using this JWT include X-Behest-Draft-Mode: 1, which tells custom_auth.py to read draft:config:{pid}:provider_model (your draft model selection) rather than config:{pid}:provider_model (the deployed model).
Draft keys (the draft:config:... Redis entries, not provider API keys) are written with a 300-second TTL, so they expire automatically after 5 minutes, and they never overwrite production keys. Use this to confirm:
- Your provider key decrypts successfully
- The correct model is selected
- Responses look as expected before deploying
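The draft-versus-deployed switch comes down to which Redis key is read. A minimal sketch, assuming the header arrives in a plain dict with exactly this casing (real HTTP header lookups are case-insensitive) and using a hypothetical function name:

```python
def model_config_key(project_id: str, headers: dict) -> str:
    """Pick the draft or deployed model key based on the draft-mode header."""
    draft = headers.get("X-Behest-Draft-Mode") == "1"
    prefix = "draft:config" if draft else "config"
    return f"{prefix}:{project_id}:provider_model"
```

Because the draft key has its own prefix, a test request can never read or replace the deployed model selection.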
Supported Key Formats Per Provider
| Provider | Format | Example prefix | Length | Live validation endpoint |
|---|---|---|---|---|
| OpenAI | sk-(proj-\|svcacct-)?[A-Za-z0-9_-]{20,} | sk-proj-... | variable | GET api.openai.com/v1/models |
| Anthropic | sk-ant-[A-Za-z0-9_-]{20,} | sk-ant-api03-... | variable | GET api.anthropic.com/v1/models |
| Google | AIza[A-Za-z0-9_-]{35} | AIzaSy... | exactly 39 chars | GET generativelanguage.googleapis.com/v1beta/models?key=... |
| OpenRouter | sk-or-v1-[a-f0-9]{64} | sk-or-v1-... | exactly 73 chars | GET openrouter.ai/api/v1/auth/key |
| Mistral | any 10+ chars | varies | variable | GET api.mistral.ai/v1/models |
| Cohere | any 10+ chars | varies | variable | GET api.cohere.com/v1/models |
Format validation is performed as a regex pre-check before the live API call. If the format doesn't match, Behest returns 400 INVALID_KEY_FORMAT without making a network request to the provider.
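The pre-check can be approximated with the patterns from the table. The sketch below is an illustration, not Behest's implementation: the lowercase provider identifiers other than "openai", the whole-string matching, and the function name are assumptions.

```python
import re
from typing import Optional

# Patterns copied from the table; Mistral and Cohere accept any 10+ chars.
KEY_PATTERNS = {
    "openai": re.compile(r"sk-(proj-|svcacct-)?[A-Za-z0-9_-]{20,}"),
    "anthropic": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "google": re.compile(r"AIza[A-Za-z0-9_-]{35}"),
    "openrouter": re.compile(r"sk-or-v1-[a-f0-9]{64}"),
    "mistral": re.compile(r".{10,}"),
    "cohere": re.compile(r".{10,}"),
}


def check_key_format(provider_type: str, api_key: str) -> Optional[str]:
    """Return None if the key looks valid, else the error code from the text."""
    pattern = KEY_PATTERNS[provider_type]
    # fullmatch requires the whole string to match, not just a prefix.
    return None if pattern.fullmatch(api_key) else "INVALID_KEY_FORMAT"
```

A caller would run this before the live validation call and return HTTP 400 with the error code when the result is not None, which avoids a network round trip for obviously malformed keys.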