Real-time

Streaming responses over WebSockets

Show responses as they're generated for a snappy, real-time experience. No waiting for the full response.

WebSocket channels · Live status events · Token-by-token · Logged completion

Visual

Stream journey

WebSocket session emits partials, tool calls, and the final message.

Tokens · partials · close

Client → WebSocket channel → ModelRiver stream (backpressure-aware)

  • Tokens · streamed partials
  • Tool calls · functions + args
  • Final message · completion + metrics
  • Logs + close · request log + channel end

Channel example
channel_id: "ws_9f2b..."
websocket_url: "wss://api.modelriver.com/ws"
status: "pending"
stream: true
events:
  - type: "tokens"
    data: "Once upon..."
  - type: "final"
    latency_ms: 1240
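For client code, those shapes might be modeled in TypeScript like so. The field types are inferred from the example above; the optional fields and the full set of event types are assumptions, not a documented contract.

// Shapes inferred from the channel example above; treat the exact
// contract (optional fields, extra event types) as an assumption.
interface ChannelEvent {
  type: "tokens" | "status" | "final";
  data?: string;        // partial text on "tokens" frames
  latency_ms?: number;  // present on the "final" frame
}

interface Channel {
  channel_id: string;     // e.g. "ws_9f2b..."
  websocket_url: string;  // e.g. "wss://api.modelriver.com/ws"
  status: string;         // "pending" until the stream starts
  stream: boolean;
  events: ChannelEvent[];
}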
1 · Instant UX

Render partial text while the model runs so users feel progress immediately.

2 · Single source of truth

Streamed partials and final responses are tracked in request logs with token counts and timing.

3 · Fallback aware

If a provider fails mid-stream, failover continues the response on a healthy model.

UX speed

<200ms

Typical time to first token after connect (see the measurement sketch below).

Event types

tokens · status

Progress you can pipe straight to your UI.

Observability

Logs on finish

Final response and metrics captured.
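To check time to first token yourself, here is a minimal sketch. It assumes the first "tokens" frame marks first output and omits the channel and auth query parameters shown in the connect step below.

// Stamp at connect, compare on the first "tokens" frame.
const ws = new WebSocket("wss://api.modelriver.com/ws"); // plus channel/auth params
const connectedAt = performance.now();
let seenFirstToken = false;

ws.onmessage = (msg: MessageEvent<string>) => {
  if (!seenFirstToken && JSON.parse(msg.data).type === "tokens") {
    seenFirstToken = true;
    const ttft = Math.round(performance.now() - connectedAt);
    console.log(`time to first token: ${ttft} ms`);
  }
};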

Scroll the stream

01 · Connect

Open the WebSocket with channel_id and auth.
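A minimal connect sketch. Passing channel_id and a token as query parameters is an assumption, not a documented contract, and the credential below is a hypothetical placeholder.

// Open the channel socket. Query-parameter auth is an assumption;
// API_TOKEN is a hypothetical placeholder credential.
const API_TOKEN = "sk_live_...";
const channelId = "ws_9f2b...";  // from the channel payload above

const ws = new WebSocket(
  `wss://api.modelriver.com/ws?channel_id=${channelId}&token=${API_TOKEN}`
);
ws.onopen = () => console.log("channel open, waiting for tokens");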

02 · Receive

Render tokens and status events as they arrive.
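A dispatch sketch, assuming frames arrive as JSON with the shapes modeled earlier.

// Render partials as they arrive; surface status frames separately.
function attachStream(ws: WebSocket, render: (text: string) => void) {
  ws.onmessage = (msg: MessageEvent<string>) => {
    const event = JSON.parse(msg.data);
    if (event.type === "tokens") {
      render(event.data);            // append partial text to the UI
    } else if (event.type === "status") {
      console.log("status:", event); // e.g. update a progress indicator
    }
  };
}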

03 · Finish

Capture final payload and metrics in logs.
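A sketch of the finish step. latency_ms comes from the example payload above; any other metric fields are assumptions.

// Accumulate partials, then record metrics from the "final" frame.
let fullText = "";

function onEvent(ws: WebSocket, event: { type: string; data?: string; latency_ms?: number }) {
  if (event.type === "tokens") {
    fullText += event.data ?? "";
  } else if (event.type === "final") {
    console.log({ fullText, latency_ms: event.latency_ms }); // mirrored in request logs
    ws.close();
  }
}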

04 · Recover

If a provider fails, failover continues the stream.
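Failover runs server-side, so the client keeps reading the same socket. If you want to surface the switch in the UI, a status frame is a natural hook; the payload shape below is an assumption.

// Hypothetical status payload announcing a mid-stream provider switch.
function onStatus(event: { type: "status"; data?: string }) {
  if (event.data?.includes("failover")) {
    showBanner("Provider switched; the stream continues on a healthy model.");
  }
}

function showBanner(text: string) {
  console.log(text); // swap in your real notification UI
}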

Use cases

  • Chat UIs that need token-by-token updates.
  • Dashboards that monitor long-running tasks.
  • Any flow where perceived latency matters.

What’s unique

  • The same channel_id also appears in request logs.
  • Plays nicely with structured outputs and webhooks.
  • Built-in failover if a stream errors mid-flight.

Delight users in real time

Pair streaming with structured outputs and webhooks for reliable, verifiable completions.