
    Multi-Conversation Chat: Sessions + Threads

    Behest gives you two complementary memory primitives. Use threads for persistent, user-visible conversations with full message history (think: ChatGPT sidebar). Use sessions for ephemeral in-memory context (think: a single checkout assistant that resets after).

    Most apps want threads. Keep reading.


    Quick mental model

    |           | Threads                                               | Sessions                            |
    |-----------|-------------------------------------------------------|-------------------------------------|
    | Storage   | PostgreSQL (behest-memory)                            | Redis sorted set                    |
    | Lifetime  | Forever (until DELETE)                                | Until TTL / process restart         |
    | Keyed by  | thread_id (client-chosen)                             | {pid}:{uid}:{sid}                   |
    | Header    | X-Thread-Id                                           | X-Session-Id                        |
    | API       | GET/DELETE /v1/threads, GET /v1/threads/{id}/messages | None — opaque to clients            |
    | Per-user? | Yes (rows joined on user_id)                          | Yes (sid scoped by uid)             |
    | Good for  | Chat apps with history UI                             | Wizards, forms, one-shot assistants |

    You can use both at once — a thread for persisted history + a session for in-memory scratchpad.


    Threads end-to-end

    All thread APIs require a Behest JWT. That means you'll usually call them from the server with the v1.5 SDK, though you can also call them from the browser with a minted JWT. The examples below use the server pattern, since most apps fetch threads alongside their own auth.

    1. Create (implicit on first message)

    Threads are created lazily. First call with a new X-Thread-Id creates the row. From the browser (after fetching a JWT from your /api/behest/token endpoint):

    ts
    import OpenAI from "openai";
    const { token, sessionId } = await fetchBehestToken();
    const openai = new OpenAI({
      apiKey: token,
      baseURL: `${BEHEST_BASE_URL}/v1`,
      dangerouslyAllowBrowser: true,
      defaultHeaders: {
        "X-Session-Id": sessionId,
        "X-Thread-Id": "thread_2026_04_12_abc",
      },
    });
     
    const stream = await openai.chat.completions.create({
      messages: [{ role: "user", content: "Plan a trip to Lisbon" }],
      stream: true,
    });

    Behest persists the user message and the assistant's streamed response keyed by (pid, uid, thread_id).
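The fetchBehestToken helper used above isn't part of the SDK; it's a small wrapper you write around your own /api/behest/token route. A minimal sketch, assuming the route authenticates the user via your app's cookies and responds with { token, sessionId }:

```typescript
// Minimal sketch of the token helper (your own code, not SDK-provided).
// Assumes your /api/behest/token route responds with { token, sessionId }.
export async function fetchBehestToken(): Promise<{ token: string; sessionId: string }> {
  const res = await fetch("/api/behest/token", { method: "POST" });
  if (!res.ok) throw new Error(`Behest token mint failed: ${res.status}`);
  return res.json();
}
```

Tokens are short-lived, so call this right before constructing the client rather than caching the result globally.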

    2. Continue

    Next turn — just include the same X-Thread-Id. You do not need to resend prior messages; Behest loads them server-side:

    ts
    await openai.chat.completions.create({
      messages: [{ role: "user", content: "Add a day in Sintra" }],
      // X-Thread-Id already in defaultHeaders
    });

    If you do pass prior messages in messages, Behest merges them with the stored history instead of duplicating them, so you can optimistically render the full list client-side without worrying about dupes.

    3. List (server-side, via SDK)

    ts
    // server: app/api/behest/threads/route.ts
    import { Behest } from "@behest/client-ts";
    const behest = new Behest();
     
    export async function GET() {
      const userId = await currentUserId(); // from your session
      await behest.auth.mint({ user_id: userId });
      const threads = await behest.threads.list();
      return Response.json(threads);
    }

    The SDK returns Thread[] directly — each has id plus any server-side fields (title, timestamps, etc.). Scope is automatic: the JWT's uid filters rows.

    4. Read messages (server-side, via SDK)

    ts
    const messages = await behest.threads.messages("thread_2026_04_12_abc");
    // ThreadMessage[]: [{ role, content, ... }]

    Returns an array directly. Use this to hydrate a chat UI when the user clicks an old conversation in the sidebar.

    5. Delete

    ts
    await behest.threads.delete("thread_2026_04_12_abc");

    Deletes the thread and all its messages. Resolves with no body (204).
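If you aren't using the SDK, the same call can be issued against the REST API directly. A sketch, assuming the DELETE /v1/threads/{id} route from the table above and that the minted JWT is sent as a Bearer token (the same scheme the OpenAI client uses for apiKey):

```typescript
// Hypothetical raw-REST equivalent of behest.threads.delete().
// Assumes Bearer-token auth with a minted Behest JWT.
export async function deleteThread(baseUrl: string, token: string, threadId: string): Promise<void> {
  const res = await fetch(`${baseUrl}/v1/threads/${encodeURIComponent(threadId)}`, {
    method: "DELETE",
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`thread delete failed: ${res.status}`);
}
```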


    Sessions end-to-end

    Sessions are Redis-backed short-term memory. They expire based on PROJECT_SESSION_TTL (default 1 hour of inactivity).
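The variable name PROJECT_SESSION_TTL comes from the text above; where and how it is set depends on your Behest deployment, so treat this as an illustrative config sketch (the seconds unit and the 1800 value are assumptions):

```shell
# Illustrative only: deployment mechanism and seconds unit are assumptions.
# Evict sessions idle for 30 minutes instead of the 1-hour default.
PROJECT_SESSION_TTL=1800
```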

    ⚠️ Session-hijacking caveat (until PLAN §7.2 ships): Kong currently does not validate that X-Session-Id is scoped to the caller's uid. Any authenticated user in the same project who knows another user's session id can read that session's ephemeral context. Mitigations:

    • Use unguessable UUIDs (crypto.randomUUID()) — never predictable ids like checkout_${userId}_${ts}.
    • Prefer a mint-time session_id (below) so the browser can't set the header at all.
    • Do not base access decisions on X-Session-Id — treat it as a scoping hint, not an authenticated claim.

    Threads are not affected: they are scoped by uid server-side.

    ts
    // Override session id per tab / per flow. sessionId should be a UUID (unguessable).
    const checkoutSid = crypto.randomUUID();
     
    const openai = new OpenAI({
      apiKey: token,
      baseURL: `${BEHEST_BASE_URL}/v1`,
      dangerouslyAllowBrowser: true,
      defaultHeaders: { "X-Session-Id": checkoutSid },
    });
     
    await openai.chat.completions.create({
      messages: [{ role: "user", content: "I want to buy the blue one" }],
    });
     
    // Later, in the same session
    await openai.chat.completions.create({
      messages: [{ role: "user", content: "Ship it to the default address" }],
    }); // assistant remembers "blue one"

    No X-Session-Id header → all requests share a single "default" session per user. Fine for simple one-shot chatbots, bad for multi-tab apps.

    The v1.5 SDK's auth.mint({ session_id }) embeds the sid in the JWT. Your /api/behest/token endpoint returns sessionId in its response, and the browser pins it once:

    ts
    // Server (token route)
    const { token, sessionId } = await behest.auth.mint({ user_id: userId });
    // session_id was auto-generated; or pass your own: behest.auth.mint({ user_id, session_id: "..." })
    return Response.json({ token, sessionId, ... });
     
    // Browser
    const { token, sessionId } = await fetchBehestToken();
    const openai = new OpenAI({
      apiKey: token,
      baseURL: `${BEHEST_BASE_URL}/v1`,
      defaultHeaders: { "X-Session-Id": sessionId },
    });

    Kong reads the sid claim and injects X-Session-Id automatically when the header is absent, so the session still binds correctly even if the caller never sets the header.

    Multi-tab apps should mint a fresh JWT per tab (each with its own session_id) to get independent contexts.
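One way to implement that: cache the minted token in sessionStorage, which browsers scope per tab, so each tab mints once and then reuses its own JWT + session id. A sketch (the helper name, storage key, and injected mint function are illustrative; token expiry is ignored for brevity):

```typescript
// Hypothetical per-tab token cache. sessionStorage is per-tab, so every tab
// gets its own JWT + session id, and therefore its own Behest context.
type Minted = { token: string; sessionId: string };
type KV = { getItem(key: string): string | null; setItem(key: string, value: string): void };

export async function tabScopedToken(storage: KV, mint: () => Promise<Minted>): Promise<Minted> {
  const cached = storage.getItem("behest_tab_token");
  if (cached) return JSON.parse(cached) as Minted; // reuse this tab's token (expiry ignored for brevity)
  const minted = await mint();
  storage.setItem("behest_tab_token", JSON.stringify(minted));
  return minted;
}

// Browser usage: const { token, sessionId } = await tabScopedToken(sessionStorage, fetchBehestToken);
```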


    Combining threads + sessions

    Common pattern: a persistent thread (full history) + an ephemeral session for RAG context or tool-call state:

    ts
    const openai = new OpenAI({
      apiKey: token,
      baseURL: `${BEHEST_BASE_URL}/v1`,
      dangerouslyAllowBrowser: true,
      defaultHeaders: {
        "X-Thread-Id": threadId, // persistent, user-visible
        "X-Session-Id": `${sessionIdFromMint}:tools`, // ephemeral scratchpad
      },
    });

    Example: ChatGPT-style sidebar

    tsx
    import { useEffect, useMemo, useState } from "react";
    import OpenAI from "openai";
     
    type Thread = { id: string; title?: string; last_message_at?: string };
    type Msg = { role: "user" | "assistant"; content: string };
     
    function ChatApp() {
      const [threads, setThreads] = useState<Thread[]>([]);
      const [activeId, setActiveId] = useState<string | null>(null);
      const [messages, setMessages] = useState<Msg[]>([]);
     
      // Server routes: /api/behest/threads (GET) and /api/behest/threads/:id/messages (GET)
      // both use the v1.5 SDK server-side (see section 3/4 above).
     
      useEffect(() => {
        fetch("/api/behest/threads")
          .then((r) => r.json())
          .then(setThreads);
      }, []);
     
      useEffect(() => {
        if (!activeId) return;
        fetch(`/api/behest/threads/${activeId}/messages`)
          .then((r) => r.json())
          .then(setMessages);
      }, [activeId]);
     
      async function send(text: string) {
        const threadId = activeId ?? crypto.randomUUID();
        if (!activeId) {
          setActiveId(threadId);
          setThreads((t) => [{ id: threadId, title: text.slice(0, 40) }, ...t]);
        }
        setMessages((m) => [
          ...m,
          { role: "user", content: text },
          { role: "assistant", content: "" },
        ]);
     
        const { token, sessionId } = await (
          await fetch("/api/behest/token", { method: "POST" })
        ).json();
        const openai = new OpenAI({
          apiKey: token,
          baseURL: `${import.meta.env.VITE_BEHEST_BASE_URL}/v1`,
          dangerouslyAllowBrowser: true,
          defaultHeaders: { "X-Session-Id": sessionId, "X-Thread-Id": threadId },
        });
     
        const stream = await openai.chat.completions.create({
          messages: [{ role: "user", content: text }],
          stream: true,
        });
        for await (const chunk of stream) {
          const delta = chunk.choices[0]?.delta?.content ?? "";
          setMessages((m) => {
            const copy = [...m];
            copy[copy.length - 1] = {
              ...copy[copy.length - 1],
              content: copy[copy.length - 1].content + delta,
            };
            return copy;
          });
        }
      }
     
      return (
        <div className="flex">
          <aside>
            <button
              onClick={() => {
                setActiveId(null);
                setMessages([]);
              }}
            >
              + New
            </button>
            {threads.map((t) => (
              <div key={t.id} onClick={() => setActiveId(t.id)}>
                {t.title ?? t.id}
              </div>
            ))}
          </aside>
          <main>
            {messages.map((m, i) => (
              <div key={i}>
                {m.role}: {m.content}
              </div>
            ))}
          </main>
        </div>
      );
    }

    See also

    Enterprise Token FinOps: Enforce hard budgets and attribute costs per session.
