Authentication Deep Dive

Behest uses a two-token authentication model: a long-lived API key that lives on your server, and short-lived RS256 JWTs that travel with each inference request.

Overview

Client App (server-side)
    │
    │  POST /auth/v1/auth/mint
    │  Authorization: Bearer behest_sk_live_...
    │  Body: { user_id, ttl }
    ▼
behest-auth
    • SHA-256 lookup → fetch row → Argon2id verify
    • Sign RS256 JWT with claims: tid, pid, uid, role, tier
    ▼
    { access_token, expires_in }
    │
Client App
    │
    │  POST /v1/chat/completions
    │  Authorization: Bearer eyJhbGci...
    │  Host: {slug}.behest.app
    ▼
Kong (behest-tenant-auth plugin)
    • Parse slug from Host → Redis lookup → resolve tid/pid
    • Fetch JWKS from /.well-known/jwks.json (5-min cache)
    • Verify RS256 signature
    • Validate exp/nbf/iat claims
    • Check kill switches
    • Enforce rate limits
    • Inject headers: X-Tenant-Id, X-Project-Id, X-End-User-Id, X-Role, X-Tier
    ▼
LiteLLM / upstream service

API Key Format and Storage

API keys follow the format behest_sk_live_{32 hex chars}. They are generated using crypto.randomUUID() with hyphens stripped.

Two-phase lookup for performance:

A SHA-256 hash of the plaintext key is stored as key_lookup — used for O(1) database index lookup without Argon2id cost
An Argon2id hash of the plaintext key is stored as key_hash — verified after the index lookup to prevent preimage attacks

findByApiKey(apiKey):
  lookup = SHA-256(apiKey)
  row = SELECT WHERE key_lookup = lookup
  if not row:
    return null        // unknown key: no Argon2id work is done
  valid = argon2id.verify(apiKey, row.key_hash)
  return valid ? row : null

The result is a cheap indexed lookup (SHA-256) on every request, while brute-forcing the stored key_hash values remains computationally infeasible (Argon2id).
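The key format and the two-phase lookup can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not Behest's actual implementation: `generateApiKey`, `lookupHash`, the `KeyRow` shape, and the in-memory `Map` are hypothetical stand-ins for the real table, and the slow-hash verifier is passed in as a function so that any Argon2id binding can be plugged in.

```typescript
import { createHash, randomUUID } from "node:crypto";

// Hypothetical row shape; only key_lookup and key_hash come from the text above.
interface KeyRow {
  key_lookup: string; // hex SHA-256 of the plaintext key (indexed column)
  key_hash: string;   // Argon2id hash of the plaintext key
  projectId: string;  // illustrative extra column
}

// Slow-hash verifier, injected so any Argon2id binding can be used.
type VerifyFn = (plaintext: string, storedHash: string) => Promise<boolean>;

// behest_sk_live_ followed by 32 hex chars (a UUID with its hyphens stripped).
function generateApiKey(): string {
  return "behest_sk_live_" + randomUUID().replace(/-/g, "");
}

// Cheap, deterministic hash used only to find the candidate row.
function lookupHash(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Phase 1: index lookup by SHA-256. Phase 2: verify against the slow hash.
async function findByApiKey(
  index: Map<string, KeyRow>, // stand-in for the indexed key_lookup column
  apiKey: string,
  verify: VerifyFn,
): Promise<KeyRow | null> {
  const row = index.get(lookupHash(apiKey));
  if (!row) return null; // unknown key: no slow-hash work is done
  return (await verify(apiKey, row.key_hash)) ? row : null;
}
```

Because the slow verification only runs after an index hit, requests carrying unknown keys never pay the Argon2id cost.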

JWT Minting Flow

The mint endpoint is POST /auth/v1/auth/mint. It has no auth middleware: it is a public endpoint protected only by the API key in the Authorization header.
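A server-side call to the mint endpoint might look like the sketch below. The base URL is an assumption (it reuses the host of the default issuer), `buildMintRequest` and `requestMint` are hypothetical helper names, and the `{ access_token, expires_in }` response shape is taken from the overview diagram.

```typescript
// Assumption: behest-auth is reachable at this base URL; adjust for your deployment.
const AUTH_BASE_URL = "https://api.behest.app";

interface MintResponse {
  access_token: string;
  expires_in: number; // seconds
}

// Builds the HTTP request without sending it, which keeps it easy to inspect.
function buildMintRequest(apiKey: string, userId: string, ttl = 3600) {
  return {
    url: `${AUTH_BASE_URL}/auth/v1/auth/mint`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ user_id: userId, ttl }),
    },
  };
}

// Server-side only: the API key must never reach a browser or mobile client.
async function requestMint(
  apiKey: string,
  userId: string,
  ttl = 3600,
): Promise<MintResponse> {
  const { url, init } = buildMintRequest(apiKey, userId, ttl);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`mint failed with HTTP ${res.status}`);
  return (await res.json()) as MintResponse;
}
```

Only the returned `access_token` should be handed to the client; the API key stays on the server.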

Request validation:

user_id is required, must be 1–255 characters, and cannot be a reserved value (dashboard-service, admin, system, internal, service, litellm, kong, behest) or start with the prefix svc:
role must be one of: user, dashboard-service, admin
ttl must be an integer between 60 and 86400 seconds
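
The rules above could be expressed as a small validator like the one below. This is a sketch, not the service's actual code: `validateMintRequest` is a hypothetical name, and it assumes that role and ttl are optional and that reserved values are matched case-sensitively, neither of which is stated above.

```typescript
// Reserved identifiers, allowed roles, and limits are taken from the rules above.
const RESERVED_USER_IDS = new Set([
  "dashboard-service", "admin", "system", "internal",
  "service", "litellm", "kong", "behest",
]);
const ALLOWED_ROLES = new Set(["user", "dashboard-service", "admin"]);

interface MintRequestBody {
  user_id?: unknown;
  role?: unknown;
  ttl?: unknown;
}

// Returns a list of validation errors; an empty list means the body is acceptable.
function validateMintRequest(body: MintRequestBody): string[] {
  const errors: string[] = [];
  const { user_id, role, ttl } = body;

  if (typeof user_id !== "string" || user_id.length < 1 || user_id.length > 255) {
    errors.push("user_id must be a string of 1-255 characters");
  } else if (RESERVED_USER_IDS.has(user_id) || user_id.startsWith("svc:")) {
    errors.push("user_id is reserved");
  }

  // Assumption: role and ttl are optional and only checked when present.
  if (role !== undefined && (typeof role !== "string" || !ALLOWED_ROLES.has(role))) {
    errors.push("role must be one of: user, dashboard-service, admin");
  }
  if (
    ttl !== undefined &&
    (!Number.isInteger(ttl) || (ttl as number) < 60 || (ttl as number) > 86400)
  ) {
    errors.push("ttl must be an integer between 60 and 86400");
  }
  return errors;
}
```

Running the same checks in your own backend before calling the mint endpoint surfaces mistakes such as a reserved user_id without a network round trip.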

After key verification:

Project status is checked — suspended projects return 401
The slug→project Redis mapping is refreshed (permanent, no TTL)
If the project has been deployed at least once, all project settings are synced to Redis with the JWT's TTL as expiry — this ensures Kong always has up-to-date config for the token's lifetime

Signed payload:

typescript

{
  tid: string,          // tenantId
  pid: string,          // projectId
  uid: string,          // user_id from request
  role: string,         // "user" | "dashboard-service" | "admin"
  scp: string[],        // scopes (always [] currently)
  iss: string,          // process.env.JWT_ISSUER || "https://api.behest.app"
  aud: string,          // process.env.JWT_AUDIENCE || "behest"
  iat: number,          // Unix timestamp (seconds)
  nbf: number,          // same as iat
  exp: number,          // iat + ttl
  jti: string,          // UUIDv4 (unique token ID)
  tier?: string,        // only present if tier was specified in the request
}

The JWT is signed with RS256 using the private key from the JWT_PRIVATE_KEY environment variable. The kid header field is set from the JWT_KEY_ID environment variable (default: "default").
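
To make the timestamp arithmetic in the payload concrete, the sketch below assembles the claims before signing. `buildClaims` and `ClaimInput` are hypothetical names, the issuer and audience are passed in with the documented defaults rather than read from the environment, and signing itself is left out.

```typescript
import { randomUUID } from "node:crypto";

interface BehestClaims {
  tid: string;
  pid: string;
  uid: string;
  role: string;
  scp: string[];
  iss: string;
  aud: string;
  iat: number;
  nbf: number;
  exp: number;
  jti: string;
  tier?: string;
}

// Hypothetical input shape for this sketch.
interface ClaimInput {
  tenantId: string;
  projectId: string;
  userId: string;
  role: string;
  ttl: number; // seconds, already validated to 60..86400
  tier?: string;
}

function buildClaims(
  input: ClaimInput,
  nowMs: number = Date.now(),
  issuer = "https://api.behest.app",
  audience = "behest",
): BehestClaims {
  const iat = Math.floor(nowMs / 1000); // JWT timestamps are whole seconds
  const claims: BehestClaims = {
    tid: input.tenantId,
    pid: input.projectId,
    uid: input.userId,
    role: input.role,
    scp: [], // scopes are always empty currently
    iss: issuer,
    aud: audience,
    iat,
    nbf: iat, // usable immediately
    exp: iat + input.ttl,
    jti: randomUUID(), // unique token ID (UUIDv4)
  };
  // tier is only present when it was specified in the request.
  if (input.tier !== undefined) claims.tier = input.tier;
  return claims;
}
```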

JWKS Endpoint

GET /.well-known/jwks.json

No authentication required. Returns the RSA public key in JWK Set format so that Kong and any external verifier can validate Behest JWTs.

Response shape:

json

{
  "keys": [
    {
      "kty": "RSA",
      "n": "...",
      "e": "AQAB",
      "kid": "default",
      "use": "sig",
      "alg": "RS256"
    }
  ]
}

Kong caches this response for 5 minutes (configurable via JWKS_CACHE_TTL in the plugin). On cache miss or fetch failure, Kong uses a stale cached value for up to 1 hour before failing hard.
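
The caching policy described above can be modelled as follows. This is a TypeScript illustration of the policy only, not the plugin's code: `JwksCache` is a hypothetical class, the fetcher and clock are injected, and measuring the one-hour stale window from the last successful fetch is an assumption.

```typescript
interface Jwks {
  keys: Array<Record<string, unknown>>;
}

const FRESH_TTL_MS = 5 * 60 * 1000;    // 5-minute cache
const STALE_LIMIT_MS = 60 * 60 * 1000; // serve stale for up to 1 hour

class JwksCache {
  private value: Jwks | null = null;
  private fetchedAt = 0;
  private fetchJwks: () => Promise<Jwks>;
  private now: () => number;

  constructor(fetchJwks: () => Promise<Jwks>, now: () => number = Date.now) {
    this.fetchJwks = fetchJwks;
    this.now = now;
  }

  async get(): Promise<Jwks> {
    const age = this.now() - this.fetchedAt;
    if (this.value && age < FRESH_TTL_MS) return this.value; // fresh hit

    try {
      this.value = await this.fetchJwks();
      this.fetchedAt = this.now();
      return this.value;
    } catch (err) {
      // Fetch failed: fall back to the stale copy while it is within the limit.
      if (this.value && age < STALE_LIMIT_MS) return this.value;
      throw err; // nothing usable is cached: fail hard
    }
  }
}
```

Serving a stale key set keeps verification working through a short outage of the auth service, at the cost of a delay before a rotated key is picked up.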

To verify a Behest JWT externally:

typescript

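import { createRemoteJWKSet, jwtVerify } from "jose";

// Fetches the public keys on demand and caches them between calls
const JWKS = createRemoteJWKSet(
  new URL("https://api.behest.app/.well-known/jwks.json")
);

// Rejects tokens with a bad signature, a wrong issuer or audience, or failed exp/nbf checks
const { payload } = await jwtVerify(token, JWKS, {
  issuer: "https://api.behest.app",
  audience: "behest",
  algorithms: ["RS256"], // accept only the algorithm Behest signs with
});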

Kong Plugin: Header Injection

After successful JWT verification, the behest-tenant-auth Kong plugin injects the following headers before forwarding the request upstream:

| Header | Source | Example |
| --- | --- | --- |
| `X-Tenant-Id` | JWT `tid` claim | `550e8400-...` |
| `X-Project-Id` | JWT `pid` claim | `663e8400-...` |
| `X-End-User-Id` | JWT `uid` claim | `user-123` |
| `X-Role` | JWT `role` claim | `user` |
| `X-Tier` | JWT `tier` claim | `premium` (empty if not set) |
| `X-Scopes` | JWT `scp` claim (JSON-encoded) | `[]` |
| `X-Auth-Provider` | Auth path taken | `behest` or `supabase` |
| `x-request-id` | Generated per-request UUID | `abc12345-...` |
| `X-Session-Id` | Passed through from client | (if provided) |
| `X-Thread-Id` | Passed through from client | (if provided) |

Important: Client-supplied X-Tenant-Id and X-Project-Id headers are stripped by the plugin and replaced with values derived from the verified JWT claims or slug resolution. Clients cannot impersonate other tenants or projects.
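
The strip-then-inject behaviour can be illustrated with a plain function over a header map. This is a sketch of the logic only: the plugin itself runs inside Kong and its code is not shown here, and `injectHeaders` and `VerifiedIdentity` are hypothetical names.

```typescript
// Identity resolved from the verified JWT (or from the slug for Supabase tokens).
interface VerifiedIdentity {
  tid: string;
  pid: string;
  uid: string;
  role: string;
  tier?: string;
  scp: string[];
  provider: "behest" | "supabase";
}

function injectHeaders(
  clientHeaders: Record<string, string>,
  id: VerifiedIdentity,
  requestId: string,
): Record<string, string> {
  const out: Record<string, string> = {};
  // Header names are case-insensitive, so normalise them to lower case first.
  for (const [name, value] of Object.entries(clientHeaders)) {
    out[name.toLowerCase()] = value;
  }

  // Discard client-supplied identity headers before setting trusted values.
  delete out["x-tenant-id"];
  delete out["x-project-id"];

  out["x-tenant-id"] = id.tid;
  out["x-project-id"] = id.pid;
  out["x-end-user-id"] = id.uid;
  out["x-role"] = id.role;
  out["x-tier"] = id.tier ?? ""; // empty if not set
  out["x-scopes"] = JSON.stringify(id.scp);
  out["x-auth-provider"] = id.provider;
  out["x-request-id"] = requestId;

  // X-Session-Id and X-Thread-Id pass through untouched if the client sent them.
  return out;
}
```

Normalising names before deleting matters: a client sending `X-TENANT-ID` or `x-tenant-id` must be treated the same way.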

Supabase JWT Support

If you are building on Supabase and want to use Supabase-issued JWTs directly (without minting Behest JWTs), Behest supports this via behest-supabase-sync.

How it works:

behest-supabase-sync periodically fetches JWKS from each Supabase project's JWKS endpoint and stores them in Redis at jwks:supabase:{tenantId}:{projectId}
When the Kong plugin receives a JWT, it checks whether the kid matches a Behest JWKS key first; if not, it falls back to the Supabase JWKS for the resolved project
Supabase JWTs do not have tid/pid claims — these are resolved from the slug/hostname lookup instead

This path is transparent to LiteLLM. The injected headers look the same regardless of whether the JWT was Behest-issued or Supabase-issued.
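
The key-selection order described above can be sketched as a pure function. `selectVerificationKey` and `supabaseJwksKey` are hypothetical names, and matching by kid inside the Supabase key set (rather than trying every key) is an assumption; only the Redis key format comes from the text.

```typescript
interface Jwk {
  kid?: string;
  kty: string;
  [field: string]: unknown;
}

type Provider = "behest" | "supabase";

// Redis key under which behest-supabase-sync stores a project's JWKS.
function supabaseJwksKey(tenantId: string, projectId: string): string {
  return `jwks:supabase:${tenantId}:${projectId}`;
}

// Try the Behest key set first, then fall back to the project's Supabase keys.
function selectVerificationKey(
  kid: string | undefined,
  behestKeys: Jwk[],
  supabaseKeys: Jwk[],
): { provider: Provider; key: Jwk } | null {
  if (kid === undefined) return null; // assumption: tokens without a kid are rejected

  const behest = behestKeys.find((k) => k.kid === kid);
  if (behest) return { provider: "behest", key: behest };

  const supabase = supabaseKeys.find((k) => k.kid === kid);
  return supabase ? { provider: "supabase", key: supabase } : null;
}
```

The returned provider is what would end up in the X-Auth-Provider header, while tid and pid for the Supabase path come from the slug lookup rather than from the token.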

Token Expiry and Refresh

JWTs expire at the exp claim timestamp. Once expired, Kong rejects the token with 401 Unauthorized.

Recommended pattern: mint a new JWT before the current one expires, not after it is rejected. This avoids latency spikes caused by a user request hitting an expired token.

typescript

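class TokenManager {
  private token: string | null = null;
  private userId: string | null = null;
  private expiresAt = 0; // Unix time in seconds

  // mintToken(userId) is assumed to wrap your server-side call to the mint endpoint
  async getToken(userId: string): Promise<string> {
    const now = Date.now() / 1000;
    // Refresh 60 seconds before expiry, or when the cached token belongs to another user
    if (!this.token || this.userId !== userId || now > this.expiresAt - 60) {
      const result = await mintToken(userId);
      this.token = result.access_token;
      this.userId = userId;
      this.expiresAt = Date.now() / 1000 + result.expires_in;
    }
    return this.token;
  }
}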

For server-to-server use cases where you are minting tokens for many users, cache per userId.
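
A per-user cache might look like the sketch below. `PerUserTokenCache` is a hypothetical class, the mint function is injected, and sharing one in-flight mint call between concurrent requests for the same user is a design choice of this example rather than something Behest requires.

```typescript
interface MintResult {
  access_token: string;
  expires_in: number; // seconds
}

type MintFn = (userId: string) => Promise<MintResult>;

class PerUserTokenCache {
  private entries = new Map<string, { token: string; expiresAt: number }>();
  private inFlight = new Map<string, Promise<string>>();
  private mint: MintFn;
  private marginSeconds: number;

  constructor(mint: MintFn, marginSeconds = 60) {
    this.mint = mint;
    this.marginSeconds = marginSeconds;
  }

  async getToken(userId: string): Promise<string> {
    const now = Date.now() / 1000;
    const entry = this.entries.get(userId);
    // Reuse the cached token until it is within the refresh margin of expiry.
    if (entry && now < entry.expiresAt - this.marginSeconds) return entry.token;

    // Share one mint call between concurrent requests for the same user.
    let pending = this.inFlight.get(userId);
    if (!pending) {
      pending = this.mint(userId)
        .then((result) => {
          this.entries.set(userId, {
            token: result.access_token,
            expiresAt: Date.now() / 1000 + result.expires_in,
          });
          return result.access_token;
        })
        .finally(() => this.inFlight.delete(userId));
      this.inFlight.set(userId, pending);
    }
    return pending;
  }
}
```

Two caveats: the map is never pruned, so a long-running process serving many users would need eviction, and with a TTL near the 60-second minimum a 60-second margin causes a re-mint on every call, so the margin should be set well below the TTL.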

Dashboard Service Authentication

The Behest dashboard (Next.js BFF) uses a special service key (DASHBOARD_SERVICE_API_KEY) to authenticate its own API calls to behest-auth. This bypasses the reserved user_id check so the dashboard can use identifiers like "dashboard-service".

Regular API key holders cannot use reserved user IDs. Attempting to mint a JWT with user_id: "admin" returns a 400 validation error.

Security Best Practices

Never expose your API key client-side. The API key should only exist in your server environment. The JWT is what you pass to the browser or mobile client.
Set a user_id per end user. This enables per-user rate limiting and token budget enforcement. Without a user_id, per-user controls do not apply.
Use short TTLs for high-risk operations. The minimum TTL is 60 seconds; the default is 3600 seconds (1 hour). Reduce the TTL for sensitive workloads.
Rotate API keys periodically. Use POST /auth/v1/projects/:projectId/api-keys/:keyId/rotate to atomically replace a key. The old key is revoked the instant the new one is issued.
Revoke keys when decommissioning a project or team member. Revoked keys cannot be un-revoked; create a new key if access needs to be restored.
Do not log JWTs. They contain tenant and user identifiers. Treat them like passwords in your logging pipeline.
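
One way to keep tokens and keys out of logs is to redact them before a message is written. The sketch below is an illustration only: `redactSecrets` is a hypothetical helper, the JWT pattern relies on compact JWTs starting with `eyJ` (the base64url encoding of `{"`), and the API key pattern follows the format given earlier.

```typescript
// Three base64url segments separated by dots, the first starting with "eyJ".
const JWT_PATTERN = /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g;

// behest_sk_live_ followed by exactly 32 hex characters.
const API_KEY_PATTERN = /\bbehest_sk_live_[0-9a-f]{32}\b/g;

// Replaces anything that looks like a Behest JWT or API key with a placeholder.
function redactSecrets(message: string): string {
  return message
    .replace(JWT_PATTERN, "[REDACTED_JWT]")
    .replace(API_KEY_PATTERN, "[REDACTED_API_KEY]");
}
```

Pattern-based redaction is a safety net, not a guarantee; not logging Authorization headers in the first place remains the primary control.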