Changelog

What we shipped this week. Follow along as we build the AI backend you don't have to.

April 26, 2026

Usage Tiers go live

Custom per-project usage tiers are now generally available. Author tiers via REST API, with each tier overriding rate, spend, security, memory, and routing controls. The tier name is signed into every end-user JWT for chargeback and per-tier monetization.

REST API: GET / POST / PUT / DELETE /v1/projects/{id}/tiers and GET /tiers/{id}/resolved
Sparse override fields per tier: rpm_limit, tokens_per_day, tokens_per_month, pii_mode, pii_entities, sentinel_mode, sentinel_blocklist, memory_enabled, memory_window, retention_days, store_tool_calls, provider_model, system_prompt
Tier name signed into the end-user JWT (`tier` claim) and emitted as the X-Tier header on proxied requests
Up to 3 custom tiers per project; Pro+ plan required to author custom tiers
Tier sync to Redis for sub-millisecond lookup at the gateway
Tier-aware feature gating for BYOAK and advanced PII / Sentinel modes

April 13, 2026

BYOAK v2 + Stripe Billing

Bring Your Own API Keys (BYOAK v2) ships with 8 LLM providers, AES-256-GCM encryption at rest, and per-tenant routing. Self-serve Stripe billing with three platform plans (Free / Pro / Max) and token metering goes live alongside.

BYOAK v2: tenant-scoped keys for OpenAI, Anthropic, Gemini, Vertex AI, Bedrock, Mistral, Cohere, OpenRouter
AES-256-GCM encryption at rest; per-project routing through your own provider accounts
Stripe self-serve checkout, upgrade / downgrade / cancel; monthly or annual (15% discount)
Token-based metering: Free 5M / Pro 50M / Max 500M tokens per month
Tenant signing keys: per-tenant RSA key CRUD + JWKS, dashboard management UI

March 7, 2026

Public Beta Launch

Shipped the initial Behest AI platform with the core feature set for multi-tenant AI backend infrastructure.

Auth & tenant isolation with JWT support and API key management
Three-tier rate limiting (per-IP, per-project, per-user)
CORS-ready API with per-project origin configuration
PII Shield with mask, redact, and block modes (Presidio-powered)
Sentinel prompt injection defense with pattern detection and custom blocklists
Conversation memory with configurable session windows
Token budgets with per-user and per-project daily limits
Full observability stack (OpenTelemetry, Grafana, Loki, Tempo)
Self-hosted deployment via Helm charts on GKE

March 1, 2026

PII Shield & Sentinel

Added enterprise-grade security features to protect sensitive data and block prompt injection attacks.

PII Shield: three enforcement modes (disabled, shadow, enforce) with reversible masking
Sentinel: multiple detection patterns for common jailbreak techniques
Custom blocklist support per project
Shadow mode for monitoring without blocking — test before enforcing

February 20, 2026

Token Budgets

Per-user and per-project daily token budget enforcement to prevent runaway AI costs.

Configurable daily token budgets per user (default 1M) and per project (default 10M)
Pre-check at gateway with post-request reconciliation
Actual token counts from LLM provider responses for accurate tracking
Budget exceeded responses with clear error messages

February 10, 2026

CORS & Self-Hosted Deployment

Per-project CORS configuration and production-ready Helm chart deployment for Kubernetes.

Per-project CORS origin allowlists with preflight handling
Helm chart for GKE Autopilot deployment
Docker Compose for local development
Cloud SQL Auth Proxy integration for production databases