    Authentication Deep Dive

    Behest uses a two-token authentication model: a long-lived API key that lives on your server, and short-lived RS256 JWTs that travel with each inference request.


    Overview

    Client App (server-side)
        │
        │  POST /auth/v1/auth/mint
        │  Authorization: Bearer behest_sk_live_...
        │  Body: { user_id, ttl }
        ▼
    behest-auth
        • SHA-256 lookup → fetch row → Argon2id verify
        • Sign RS256 JWT with claims: tid, pid, uid, role, tier
        ▼
        { access_token, expires_in }
        │
    Client App
        │
        │  POST /v1/chat/completions
        │  Authorization: Bearer eyJhbGci...
        │  Host: {slug}.behest.app
        ▼
    Kong (behest-tenant-auth plugin)
        • Parse slug from Host → Redis lookup → resolve tid/pid
        • Fetch JWKS from /.well-known/jwks.json (5-min cache)
        • Verify RS256 signature
        • Validate exp/nbf/iat claims
        • Check kill switches
        • Enforce rate limits
        • Inject headers: X-Tenant-Id, X-Project-Id, X-End-User-Id, X-Role, X-Tier
        ▼
    LiteLLM / upstream service
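
The first Kong step in the diagram, parsing the slug from the Host header, might look roughly like the sketch below. It is written in TypeScript for illustration only and is not the plugin's actual code; the helper name `parseSlug` and the allowed slug characters are assumptions.

```typescript
// Hypothetical helper: extract the project slug from a Host header of the
// form "{slug}.behest.app". The allowed slug characters are an assumption.
const BASE_DOMAIN = "behest.app";

function parseSlug(host: string): string | null {
  // Drop an optional port and normalize case (hostnames are case-insensitive).
  const hostname = host.split(":")[0].toLowerCase();
  const suffix = "." + BASE_DOMAIN;
  if (!hostname.endsWith(suffix)) return null;

  const slug = hostname.slice(0, -suffix.length);
  // Reject empty slugs and nested subdomains such as "a.b.behest.app".
  return /^[a-z0-9-]+$/.test(slug) ? slug : null;
}
```

Returning null for anything that is not a single-label subdomain lets the caller reject the request before any Redis lookup is attempted.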
    

    API Key Format and Storage

    API keys follow the format behest_sk_live_{32 hex chars}. They are generated using crypto.randomUUID() with hyphens stripped.

    Two-phase lookup for performance:

    1. A SHA-256 hash of the plaintext key is stored as key_lookup. It is used for an O(1) database index lookup that avoids the Argon2id cost.
    2. An Argon2id hash of the plaintext key is stored as key_hash. It is verified after the index lookup to prevent preimage attacks.

    findByApiKey(apiKey):
      lookup = SHA-256(apiKey)
      row = SELECT WHERE key_lookup = lookup
      if row:
        valid = argon2id.verify(apiKey, row.key_hash)
        return valid ? row : null
      return null
    

    This means the database lookup is cheap (one SHA-256 hash plus an index hit), but brute-forcing stored hashes remains computationally infeasible (Argon2id).
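
A minimal TypeScript sketch of the key format and the two-phase lookup follows. The row shape and the injected `findRow` and `verifyHash` functions are assumptions standing in for your database query and Argon2id library; only the key format and the SHA-256 step come from the description above.

```typescript
import { createHash, randomUUID } from "node:crypto";

// Generate a key in the documented format: behest_sk_live_{32 hex chars}.
function generateApiKey(): string {
  return "behest_sk_live_" + randomUUID().replace(/-/g, "");
}

// SHA-256 hex digest used as the indexed key_lookup column.
function keyLookup(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Hypothetical row shape and injected dependencies (assumptions of this sketch).
interface ApiKeyRow {
  key_lookup: string;
  key_hash: string;
}
type FindRow = (lookup: string) => Promise<ApiKeyRow | null>;
type VerifyHash = (hash: string, plaintext: string) => Promise<boolean>;

async function findByApiKey(
  apiKey: string,
  findRow: FindRow, // e.g. SELECT ... WHERE key_lookup = $1
  verifyHash: VerifyHash, // e.g. an Argon2id verify function
): Promise<ApiKeyRow | null> {
  // Phase 1: cheap indexed lookup by SHA-256 digest.
  const row = await findRow(keyLookup(apiKey));
  if (!row) return null;
  // Phase 2: slow Argon2id verification, run only when a row was found.
  return (await verifyHash(row.key_hash, apiKey)) ? row : null;
}
```

Passing the query and the verifier in as parameters keeps the sketch independent of any particular database driver or Argon2 package.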


    JWT Minting Flow

    The mint endpoint is POST /auth/v1/auth/mint. It has no auth middleware: it is a public endpoint protected only by the API key in the Authorization header.
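
A server-side call to this endpoint could be sketched as below. The base URL, the JSON Content-Type header, and the helper names `buildMintRequest` and `mintJwt` are assumptions made for illustration; the path, the Bearer header, and the body fields come from this page.

```typescript
// Assumption: the API is served from this base URL.
const BEHEST_BASE_URL = "https://api.behest.app";

interface MintResponse {
  access_token: string;
  expires_in: number;
}

// Build the request separately so it can be inspected without a network call.
// 3600 seconds is the documented default TTL.
function buildMintRequest(apiKey: string, userId: string, ttl = 3600) {
  return {
    url: `${BEHEST_BASE_URL}/auth/v1/auth/mint`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ user_id: userId, ttl }),
    },
  };
}

// Run this on your server only; the API key must never reach the client.
async function mintJwt(apiKey: string, userId: string, ttl?: number): Promise<MintResponse> {
  const { url, init } = buildMintRequest(apiKey, userId, ttl);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`mint failed with HTTP ${res.status}`);
  return (await res.json()) as MintResponse;
}
```

The returned access_token is what you hand to the browser or mobile client; the API key stays in the server environment.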

    Request validation:

    • user_id is required, must be 1–255 characters, cannot be a reserved value (dashboard-service, admin, system, internal, service, litellm, kong, behest) or start with svc:
    • role must be one of: user, dashboard-service, admin
    • ttl must be an integer between 60 and 86400 seconds (the default is 3600)
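
The rules above can be expressed as a small validator. This is a sketch, not the service's implementation: the function name, the error messages, the exact-match (case-sensitive) reserved check, and treating role and ttl as optional are all assumptions.

```typescript
const RESERVED_USER_IDS = new Set([
  "dashboard-service", "admin", "system", "internal",
  "service", "litellm", "kong", "behest",
]);
const ROLES = new Set(["user", "dashboard-service", "admin"]);

interface MintBody {
  user_id?: unknown;
  role?: unknown;
  ttl?: unknown;
}

// Returns a list of validation errors; an empty list means the body is valid.
function validateMintBody(body: MintBody): string[] {
  const errors: string[] = [];
  const { user_id, role, ttl } = body;

  if (typeof user_id !== "string" || user_id.length < 1 || user_id.length > 255) {
    errors.push("user_id must be a string of 1-255 characters");
  } else if (RESERVED_USER_IDS.has(user_id) || user_id.startsWith("svc:")) {
    errors.push("user_id is reserved");
  }
  // Assumption: role and ttl may be omitted, in which case defaults apply.
  if (role !== undefined && (typeof role !== "string" || !ROLES.has(role))) {
    errors.push("role must be one of: user, dashboard-service, admin");
  }
  if (
    ttl !== undefined &&
    (typeof ttl !== "number" || !Number.isInteger(ttl) || ttl < 60 || ttl > 86400)
  ) {
    errors.push("ttl must be an integer between 60 and 86400");
  }
  return errors;
}
```

For example, `validateMintBody({ user_id: "svc:billing" })` reports the reserved prefix, while `validateMintBody({ user_id: "user-123", ttl: 900 })` returns an empty list.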

    After key verification:

    1. Project status is checked. Minting for a suspended project returns 401.
    2. The slug→project Redis mapping is refreshed (permanent, no TTL).
    3. If the project has been deployed at least once, all project settings are synced to Redis with the JWT's TTL as expiry, so Kong has up-to-date config for the token's lifetime.

    Signed payload:

    typescript
    {
      tid: string,          // tenantId
      pid: string,          // projectId
      uid: string,          // user_id from request
      role: string,         // "user" | "dashboard-service" | "admin"
      scp: string[],        // scopes (always [] currently)
      iss: string,          // process.env.JWT_ISSUER || "https://api.behest.app"
      aud: string,          // process.env.JWT_AUDIENCE || "behest"
      iat: number,          // Unix timestamp (seconds)
      nbf: number,          // same as iat
      exp: number,          // iat + ttl
      jti: string,          // UUIDv4 (unique token ID)
      tier?: string,        // only present if tier was specified in the request
    }

    The JWT is signed with RS256 using the private key from the JWT_PRIVATE_KEY environment variable. The kid header field is set to the value of the JWT_KEY_ID environment variable (default: "default").
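
To make the time-based claims concrete, here is a sketch of how the payload could be assembled before signing. The `buildClaims` function and its input shape are hypothetical; the claim names, the issuer and audience defaults, and the `exp = iat + ttl` rule come from the payload listing above. Signing is left out; any RS256-capable JWT library could take this object as its payload.

```typescript
import { randomUUID } from "node:crypto";

interface MintClaims {
  tid: string; pid: string; uid: string; role: string; scp: string[];
  iss: string; aud: string; iat: number; nbf: number; exp: number; jti: string;
  tier?: string;
}

// Hypothetical input shape for this sketch.
interface ClaimInput {
  tenantId: string; projectId: string; userId: string;
  role: string; ttl: number; tier?: string;
}

function buildClaims(
  input: ClaimInput,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): MintClaims {
  const claims: MintClaims = {
    tid: input.tenantId,
    pid: input.projectId,
    uid: input.userId,
    role: input.role,
    scp: [], // scopes are always empty currently
    iss: process.env.JWT_ISSUER || "https://api.behest.app",
    aud: process.env.JWT_AUDIENCE || "behest",
    iat: nowSeconds,
    nbf: nowSeconds, // same as iat
    exp: nowSeconds + input.ttl,
    jti: randomUUID(), // UUIDv4
  };
  // tier is only present when it was specified in the request.
  if (input.tier !== undefined) claims.tier = input.tier;
  return claims;
}
```

With `ttl: 600` and a clock reading of 1700000000, the sketch produces `iat` and `nbf` of 1700000000 and `exp` of 1700000600.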


    JWKS Endpoint

    GET /.well-known/jwks.json
    

    No authentication required. Returns the RSA public key in JWK Set format so that Kong and any external verifier can validate Behest JWTs.

    Response shape:

    json
    {
      "keys": [
        {
          "kty": "RSA",
          "n": "...",
          "e": "AQAB",
          "kid": "default",
          "use": "sig",
          "alg": "RS256"
        }
      ]
    }

    Kong caches this response for 5 minutes (configurable via JWKS_CACHE_TTL in the plugin). If the cached copy has expired and the refetch fails, Kong keeps using the stale value for up to 1 hour before failing hard.
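
The caching behavior can be sketched as two pure functions. This is an illustration in TypeScript, not the plugin's code; the function names are hypothetical, and measuring the 1-hour stale window from the end of the 5-minute fresh window is an assumption.

```typescript
// TTLs in seconds: 5-minute fresh window, then up to 1 hour of stale use.
const FRESH_TTL = 5 * 60;
const STALE_TTL = 60 * 60;

interface CacheEntry<T> {
  value: T;
  fetchedAt: number; // Unix seconds of the last successful fetch
}

// True while the cached JWKS can be used without refetching.
function isFresh<T>(entry: CacheEntry<T> | null, now: number): boolean {
  return entry !== null && now - entry.fetchedAt < FRESH_TTL;
}

// Called after a refetch attempt; `fetched` is null when the fetch failed.
function resolveJwks<T>(
  entry: CacheEntry<T> | null,
  fetched: T | null,
  now: number,
): CacheEntry<T> {
  // A successful fetch always replaces the cache entry.
  if (fetched !== null) return { value: fetched, fetchedAt: now };
  // Fetch failed: fall back to the stale copy while it is within the stale window.
  if (entry !== null && now - entry.fetchedAt < FRESH_TTL + STALE_TTL) return entry;
  throw new Error("JWKS unavailable: fetch failed and no usable cached copy");
}
```

A caller would check `isFresh` first, attempt a fetch only when it returns false, and pass the result (or null on failure) to `resolveJwks`.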

    To verify a Behest JWT externally:

    typescript
    import { createRemoteJWKSet, jwtVerify } from "jose";
     
    const JWKS = createRemoteJWKSet(
      new URL("https://api.behest.app/.well-known/jwks.json")
    );
     
    const { payload } = await jwtVerify(token, JWKS, {
      issuer: "https://api.behest.app",
      audience: "behest",
    });

    Kong Plugin: Header Injection

    After successful JWT verification, the behest-tenant-auth Kong plugin injects the following headers before forwarding the request upstream:

    Header            Source                          Example
    X-Tenant-Id       JWT tid claim                   550e8400-...
    X-Project-Id      JWT pid claim                   663e8400-...
    X-End-User-Id     JWT uid claim                   user-123
    X-Role            JWT role claim                  user
    X-Tier            JWT tier claim                  premium (empty if not set)
    X-Scopes          JWT scp claim (JSON-encoded)    []
    X-Auth-Provider   Auth path taken                 behest or supabase
    x-request-id      Generated per-request UUID      abc12345-...
    X-Session-Id      Passed through from client      (if provided)
    X-Thread-Id       Passed through from client      (if provided)

    Important: Client-supplied X-Tenant-Id and X-Project-Id headers are stripped by the plugin and replaced with values derived from the verified JWT claims or slug resolution. Clients cannot impersonate other tenants or projects.
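
The overwrite-on-inject behavior can be illustrated with the TypeScript sketch below. It is not the plugin's code; the `VerifiedIdentity` shape and `buildUpstreamHeaders` name are hypothetical, and lowercasing header names is a simplification used here so that client-supplied variants cannot survive alongside the injected values.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shape of the identity resolved from the verified JWT and slug.
interface VerifiedIdentity {
  tid: string; pid: string; uid: string; role: string;
  tier?: string; scp: string[]; provider: "behest" | "supabase";
}

function buildUpstreamHeaders(
  incoming: Record<string, string>,
  id: VerifiedIdentity,
): Record<string, string> {
  const out: Record<string, string> = {};
  // Normalize names to lowercase so "X-Tenant-Id" and "x-tenant-id" collide.
  for (const [name, value] of Object.entries(incoming)) {
    out[name.toLowerCase()] = value;
  }
  // Overwrite anything the client sent with values from the verified identity.
  out["x-tenant-id"] = id.tid;
  out["x-project-id"] = id.pid;
  out["x-end-user-id"] = id.uid;
  out["x-role"] = id.role;
  out["x-tier"] = id.tier ?? ""; // empty if not set
  out["x-scopes"] = JSON.stringify(id.scp);
  out["x-auth-provider"] = id.provider;
  out["x-request-id"] = randomUUID();
  // x-session-id and x-thread-id pass through unchanged if the client sent them.
  return out;
}
```

Because identity headers are assigned after the client's headers are copied, a spoofed X-Tenant-Id is always replaced rather than merged.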


    Supabase JWT Support

    If you are building on Supabase and want to use Supabase-issued JWTs directly (without minting Behest JWTs), Behest supports this via behest-supabase-sync.

    How it works:

    1. behest-supabase-sync periodically fetches JWKS from each Supabase project's JWKS endpoint and stores them in Redis at jwks:supabase:{tenantId}:{projectId}
    2. When the Kong plugin receives a JWT, it checks whether the kid matches a Behest JWKS key first; if not, it falls back to the Supabase JWKS for the resolved project
    3. Supabase JWTs do not have tid/pid claims — these are resolved from the slug/hostname lookup instead

    This path is transparent to LiteLLM. The injected headers look the same regardless of whether the JWT was Behest-issued or Supabase-issued.
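
The key-selection order in step 2 might be sketched as follows. The function names and the minimal `Jwk` shape are hypothetical, and matching the Supabase key set by kid as well is an assumption; the Redis key format and the Behest-first order come from the steps above.

```typescript
type Provider = "behest" | "supabase";

// Only the field this sketch needs; real JWKs carry more members.
interface Jwk {
  kid: string;
}

// Redis key under which behest-supabase-sync stores a project's Supabase JWKS.
function supabaseJwksKey(tenantId: string, projectId: string): string {
  return `jwks:supabase:${tenantId}:${projectId}`;
}

// Decide which key set should verify a token, trying Behest keys first.
function selectProvider(
  kid: string,
  behestKeys: Jwk[],
  supabaseKeys: Jwk[],
): Provider | null {
  if (behestKeys.some((k) => k.kid === kid)) return "behest";
  if (supabaseKeys.some((k) => k.kid === kid)) return "supabase";
  return null; // unknown kid: the token cannot be verified
}
```

The selected provider is also what would populate the X-Auth-Provider header.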


    Token Expiry and Refresh

    JWTs expire at the exp claim timestamp. Once expired, Kong rejects the token with 401 Unauthorized.

    Recommended pattern: mint a new JWT before the current one expires, not after it is rejected. This avoids latency spikes caused by a user request hitting an expired token.

    typescript
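    // mintToken is your own server-side wrapper around POST /auth/v1/auth/mint.
    declare function mintToken(
      userId: string
    ): Promise<{ access_token: string; expires_in: number }>;

    class TokenManager {
      // One cache slot per user, so a token minted for one user is never returned to another
      private cache = new Map<string, { token: string; expiresAt: number }>();

      async getToken(userId: string): Promise<string> {
        const now = Date.now() / 1000;
        const cached = this.cache.get(userId);
        // Reuse the cached token until 60 seconds before expiry
        if (cached && now < cached.expiresAt - 60) return cached.token;

        const result = await mintToken(userId);
        this.cache.set(userId, {
          token: result.access_token,
          expiresAt: now + result.expires_in,
        });
        return result.access_token;
      }
    }

    Keying the cache by userId matters for server-to-server use cases where one process mints tokens for many users: a single shared slot would hand one user's JWT to another.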


    Dashboard Service Authentication

    The Behest dashboard (Next.js BFF) uses a special service key (DASHBOARD_SERVICE_API_KEY) to authenticate its own API calls to behest-auth. This bypasses the reserved user_id check so the dashboard can use identifiers like "dashboard-service".

    Regular API key holders cannot use reserved user IDs. Attempting to mint a JWT with user_id: "admin" returns a 400 validation error.


    Security Best Practices

    • Never expose your API key client-side. The API key should only exist in your server environment. The JWT is what you pass to the browser or mobile client.
    • Set a distinct user_id per end user. This enables per-user rate limiting and token budget enforcement. If every request shares one user_id, per-user controls cannot tell your end users apart.
    • Use short TTLs for high-risk operations. The minimum TTL is 60 seconds; the default is 3600. Reduce TTL for sensitive workloads.
    • Rotate API keys periodically. Use POST /auth/v1/projects/:projectId/api-keys/:keyId/rotate to atomically replace a key. The old key is revoked the instant the new one is issued.
    • Revoke keys when decommissioning a project or team member. Revoked keys cannot be un-revoked; create a new key if access needs to be restored.
    • Do not log JWTs. They contain tenant and user identifiers. Treat them like passwords in your logging pipeline.
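
For the last point, one possible approach is to scrub JWT-shaped strings before a message reaches your logger. The sketch below is an assumption about how you might do this, not a Behest feature; the pattern and the `redactJwts` name are illustrative.

```typescript
// JWT headers are base64url-encoded JSON objects, so compact JWTs typically
// start with "eyJ" and consist of three dot-separated base64url segments.
const JWT_PATTERN = /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g;

// Replace every JWT-shaped substring with a fixed placeholder.
function redactJwts(message: string): string {
  return message.replace(JWT_PATTERN, "[REDACTED_JWT]");
}
```

Running this over header dumps and error messages before logging keeps tenant and user identifiers out of log storage; it complements, rather than replaces, not logging Authorization headers in the first place.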
