Documentation
Everything you need to build with Behest: the unified AI backend with Token FinOps and Enterprise AI Governance.
Quickstart
TypeScript / JavaScript
```shell
npm install openai
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.behest.ai/v1",
  apiKey: "your-api-key",
  defaultHeaders: {
    // Attributes usage to a specific end user.
    "X-End-User-Id": userId,
    // Uniquely identifies a conversation thread for per-session cost attribution.
    "X-Session-Id": `user-${userId}-conv-${conversationId}`,
  },
});

const completion = await client.chat.completions.create({
  model: "gemini-2.5-flash",
  messages: [{ role: "user", content: userMessage }],
});

console.log(completion.choices[0].message.content);
```

Python
```shell
pip install openai
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.behest.ai/v1",
    api_key="your-api-key",
    default_headers={
        # Attributes usage to a specific end user.
        "X-End-User-Id": user_id,
        # Uniquely identifies a conversation thread for per-session cost attribution.
        "X-Session-Id": f"user-{user_id}-conv-{conversation_id}",
    },
)

completion = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": user_message}],
)

print(completion.choices[0].message.content)
```

cURL
POST /v1/chat/completions

```shell
curl -X POST https://api.behest.ai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -H "X-End-User-Id: user-12345" \
  -H "X-Session-Id: user-12345-conv-abc" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Summarize this document"}
    ]
  }'
```

Enterprise Controls
Framework Quickstarts
React + Vite
SPA with streaming chat and per-user JWT. Token vending from your backend, streaming from the browser.
Next.js App Router
Route-handler mint + client streaming. Works with NextAuth, Clerk, Supabase, Auth0.
Vercel (Edge + AI SDK)
Edge Runtime token route and optional useChat proxy.
Supabase Edge Functions
Standalone Deno Edge Function for any frontend.
Lovable + Supabase
Lovable.dev app with Supabase auth + Edge Function token mint.
Modelence
Modelence module + built-in auth, one mint mutation.
Python FastAPI
FastAPI backend using the Python SDK.
Node + Express
Node service mint + streaming.
API Reference
Chat Completions
POST /v1/chat/completions — request body, response format, headers, and streaming.
Authentication
Bearer tokens, API key generation, and Argon2id hashing.
Error Reference
HTTP status codes, error response format, and troubleshooting.
Models & Routing
GET /v1/models — available models and smart routing.
Developer Reference
Authentication
API keys, per-user JWTs, token minting, JWKS, and dual-mode signing.
API Reference
OpenAI-compatible endpoints, Behest headers, streaming, session + thread routes.
Models
Available models, routing, and provider mapping.
Errors
Error codes, retry behavior, and rate-limit handling.
Rate Limiting & Budgets
Per-project, per-user, and Token FinOps budgets. Handling 429s gracefully.
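Graceful 429 handling can be sketched as exponential backoff with a cap. The base delay, cap, and retry count here are illustrative assumptions, not Behest's documented policy — see the Rate Limiting guide for the real limits and headers.

```typescript
// Sketch: exponential backoff with a cap (values are assumptions).
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry a request only on rate-limit errors, up to maxAttempts.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err?.status !== 429 || attempt + 1 >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
    }
  }
}
```

In production you would also honor any retry hint the server sends back rather than relying on the client-side schedule alone.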
BYOK
Bring your own provider keys — OpenAI, Anthropic, Gemini, Vertex.
TypeScript SDK
@behest/client-ts — token minting, session helpers, thread APIs.
Python SDK
behest-ai — async-first Python SDK, OpenAI-compatible.
Guardrails
Enterprise AI Governance: PII redaction, Sentinel, and safety controls.
Guides
CORS from Browsers
Call Behest directly from React, Next.js, or any frontend. No backend proxy required.
PII Protection
Automatically scrub sensitive data before it reaches any LLM. Configure modes, actions, and per-entity settings.
Streaming UI
Cancel, reconnect, typewriter effect. React, vanilla JS, and Python examples.
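The streaming examples in this guide build on OpenAI-style server-sent events. As a minimal sketch, extracting the text deltas from a raw SSE buffer looks like this — the `choices[0].delta.content` shape and `[DONE]` sentinel are assumptions based on the OpenAI-compatible wire format:

```typescript
// Sketch: pull text deltas out of an OpenAI-style SSE stream buffer.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blanks and comments
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const content = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof content === "string") deltas.push(content);
  }
  return deltas;
}
```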
Migration from OpenAI
Switch from OpenAI direct to Behest in one line. Same SDK, new base URL.
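Shown as plain client options (a sketch — the option names match the OpenAI SDK constructor used in the Quickstart), the migration really is one line:

```typescript
// OpenAI direct:
const openaiDirect = {
  apiKey: "your-api-key",
};

// Via Behest: same SDK, same options, plus one line.
const viaBehest = {
  apiKey: "your-api-key",
  baseURL: "https://api.behest.ai/v1", // the one line that changes
};
```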
Multi-Tenant Auth
Project keys, user tokens, and tenant isolation. Per-user rate limiting and token budgets.
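Per-user attribution hangs off the two headers from the Quickstart. A hypothetical helper (the function name is illustrative, not part of any Behest SDK) that builds them consistently:

```typescript
// Hypothetical helper mirroring the Quickstart's attribution headers.
function behestHeaders(userId: string, conversationId: string) {
  return {
    // Attributes usage to a specific end user.
    "X-End-User-Id": userId,
    // Per-session cost attribution: one id per conversation thread.
    "X-Session-Id": `user-${userId}-conv-${conversationId}`,
  };
}
```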
Rate Limiting
5-layer rate limiting: project, user, IP, token budget, and aggregate. Graceful handling with headers.
SDKs & Tools
@behest/client-ts
behest-ai
OpenAPI Spec
Full OpenAPI 3.1 specification for the Behest API. Import into Postman, Insomnia, or generate your own client.