Documentation
Everything you need to build with Behest: the unified AI backend with Token FinOps and Enterprise AI Governance.
Quickstart
TypeScript / JavaScript
```shell
npm install openai
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.behest.ai/v1",
  apiKey: "your-api-key",
  defaultHeaders: {
    // Attributes usage to a specific end user.
    "X-End-User-Id": userId,
    // Uniquely identifies a conversation thread for per-session cost attribution.
    "X-Session-Id": `user-${userId}-conv-${conversationId}`,
  },
});

const completion = await client.chat.completions.create({
  model: "gemini-2.5-flash",
  messages: [{ role: "user", content: userMessage }],
});

console.log(completion.choices[0].message.content);
```

Python
```shell
pip install openai
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.behest.ai/v1",
    api_key="your-api-key",
    default_headers={
        # Attributes usage to a specific end user.
        "X-End-User-Id": user_id,
        # Uniquely identifies a conversation thread for per-session cost attribution.
        "X-Session-Id": f"user-{user_id}-conv-{conversation_id}",
    },
)

completion = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": user_message}],
)

print(completion.choices[0].message.content)
```

cURL
POST /v1/chat/completions

```shell
curl -X POST https://api.behest.ai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -H "X-End-User-Id: user-12345" \
  -H "X-Session-Id: user-12345-conv-abc" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Summarize this document"}
    ]
  }'
```

Enterprise Controls
Framework Quickstarts
React + Vite
SPA with streaming chat and per-user JWT. Token vending from your backend, streaming from the browser.
Next.js App Router
Route-handler mint + client streaming. Works with NextAuth, Clerk, Supabase, Auth0.
Vercel (Edge + AI SDK)
Edge Runtime token route and optional useChat proxy.
Supabase Edge Functions
Standalone Deno Edge Function for any frontend.
Lovable + Supabase
Lovable.dev app with Supabase auth + Edge Function token mint.
Modelence
Modelence module + built-in auth, one mint mutation.
Python FastAPI
FastAPI backend using the Python SDK.
Node + Express
Node service mint + streaming.
API Reference
Chat Completions
POST /v1/chat/completions — request body, response format, headers, and streaming.
Authentication
Bearer tokens, API key generation, and Argon2id hashing.
Error Reference
HTTP status codes, error response format, and troubleshooting.
Models & Routing
GET /v1/models — available models and smart routing.
Developer Reference
Authentication
API keys, per-user JWTs, token minting, JWKS, and dual-mode signing.
API Reference
OpenAI-compatible endpoints, Behest headers, streaming, session + thread routes.
Models
Available models, routing, and provider mapping.
Errors
Error codes, retry behavior, and rate-limit handling.
Rate Limiting & Budgets
Per-project, per-user, and Token FinOps budgets. Handling 429s gracefully.
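Graceful 429 handling can be sketched as exponential backoff with a cap. The base delay, cap, and retry count here are illustrative assumptions, not Behest's documented policy — see the Rate Limiting guide for the real limits and headers.

```typescript
// Sketch: exponential backoff with a cap (values are assumptions).
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry a request only on rate-limit errors, up to maxAttempts.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err?.status !== 429 || attempt + 1 >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
    }
  }
}
```

In production you would also honor any retry hint the server sends back rather than relying on the client-side schedule alone.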
BYOK
Bring your own provider keys — OpenAI, Anthropic, Gemini, Vertex.
TypeScript SDK
@behest/client-ts — token minting, session helpers, thread APIs.
Python SDK
behest-ai — async-first Python SDK, OpenAI-compatible.
Guardrails
Enterprise AI Governance: PII redaction, Sentinel, and safety controls.
Guides
CORS from Browsers
Call Behest directly from React, Next.js, or any frontend. No backend proxy required.
PII Protection
Automatically scrub sensitive data before it reaches any LLM. Configure modes, actions, and per-entity settings.
Streaming UI
Cancel, reconnect, typewriter effect. React, vanilla JS, and Python examples.
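The streaming examples in this guide build on OpenAI-style server-sent events. As a minimal sketch, extracting the text deltas from a raw SSE buffer looks like this — the `choices[0].delta.content` shape and `[DONE]` sentinel are assumptions based on the OpenAI-compatible wire format:

```typescript
// Sketch: pull text deltas out of an OpenAI-style SSE stream buffer.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blanks and comments
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const content = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof content === "string") deltas.push(content);
  }
  return deltas;
}
```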
Migration from OpenAI
Switch from OpenAI direct to Behest in one line. Same SDK, new base URL.
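Shown as plain client options (a sketch — the option names match the OpenAI SDK constructor used in the Quickstart), the migration really is one line:

```typescript
// OpenAI direct:
const openaiDirect = {
  apiKey: "your-api-key",
};

// Via Behest: same SDK, same options, plus one line.
const viaBehest = {
  apiKey: "your-api-key",
  baseURL: "https://api.behest.ai/v1", // the one line that changes
};
```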
Multi-Tenant Auth
Project keys, user tokens, and tenant isolation. Per-user rate limiting and token budgets.
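Per-user attribution hangs off the two headers from the Quickstart. A hypothetical helper (the function name is illustrative, not part of any Behest SDK) that builds them consistently:

```typescript
// Hypothetical helper mirroring the Quickstart's attribution headers.
function behestHeaders(userId: string, conversationId: string) {
  return {
    // Attributes usage to a specific end user.
    "X-End-User-Id": userId,
    // Per-session cost attribution: one id per conversation thread.
    "X-Session-Id": `user-${userId}-conv-${conversationId}`,
  };
}
```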
Rate Limiting
5-layer rate limiting: project, user, IP, token budget, and aggregate. Graceful handling with headers.
SDKs & Tools
@behest/client-ts
behest-ai
OpenAPI Spec
Full OpenAPI 3.1 specification for the Behest API. Import into Postman, Insomnia, or generate your own client.