
API endpoints & request patterns

Send synchronous or asynchronous AI requests through the unified ModelRiver endpoint. Get structured responses with full metadata.

Base configuration

Setting             Value
Base URL            https://api.modelriver.com
Primary endpoint    POST /v1/ai
Async endpoint      POST /v1/ai/async
Authentication      Authorization: Bearer mr_live_...
Content-Type        application/json

All requests require a valid project API key. Create and manage keys in your project settings.
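
If you call the API from a script, it can help to pin the base URL and auth header once and reuse them for both endpoints. The sketch below is illustrative only, not an official client; the MODELRIVER_API_KEY environment variable and the call_ai helper are assumed names.

PYTHON
import os
import requests

# Illustrative helper, not an official SDK. Assumes the project API key is
# exported as MODELRIVER_API_KEY.
BASE_URL = "https://api.modelriver.com"

session = requests.Session()
session.headers.update({
    "Authorization": f"Bearer {os.environ['MODELRIVER_API_KEY']}",
    "Content-Type": "application/json",
})

def call_ai(payload: dict, async_mode: bool = False) -> dict:
    """POST to /v1/ai or /v1/ai/async and return the parsed JSON body."""
    path = "/v1/ai/async" if async_mode else "/v1/ai"
    response = session.post(f"{BASE_URL}{path}", json=payload)
    response.raise_for_status()  # surfaces transport/auth errors (401, 403, 429, 5xx)
    return response.json()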


Synchronous requests

Standard requests return the AI response immediately. Use this for real-time interactions like chatbots, search, or any use case where latency matters.

Endpoint: POST /v1/ai

Bash
curl -X POST https://api.modelriver.com/v1/ai \
  -H "Authorization: Bearer mr_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "workflow": "marketing-summary",
    "messages": [
      {"role": "user", "content": "Summarise this week'\''s launch."}
    ],
    "metadata": {
      "audience": "enterprise customers"
    }
  }'

Python example

PYTHON
import requests

response = requests.post(
    "https://api.modelriver.com/v1/ai",
    headers={
        "Authorization": "Bearer mr_live_your_key",
        "Content-Type": "application/json"
    },
    json={
        "workflow": "marketing-summary",
        "messages": [
            {"role": "user", "content": "Summarise this week's launch."}
        ]
    }
)

data = response.json()
print(data["data"])

Node.js example

JAVASCRIPT
const response = await fetch("https://api.modelriver.com/v1/ai", {
  method: "POST",
  headers: {
    "Authorization": "Bearer mr_live_your_key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    workflow: "marketing-summary",
    messages: [
      { role: "user", content: "Summarise this week's launch." }
    ],
  }),
});

const data = await response.json();
console.log(data.data);

Asynchronous requests

For long-running tasks (batch processing, complex workflows, event-driven pipelines), use the async endpoint. It returns a job ID immediately, and you receive the result via WebSocket or webhook.

Endpoint: POST /v1/ai/async

Bash
curl -X POST https://api.modelriver.com/v1/ai/async \
  -H "Authorization: Bearer mr_live_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "workflow": "batch-processor",
    "messages": [
      {"role": "user", "content": "Process this large document..."}
    ]
  }'

Async response (immediate)

JSON
{
  "message": "success",
  "status": "pending",
  "channel_id": "550e8400-e29b-41d4-a716-446655440000",
  "websocket_url": "ws://api.modelriver.com/socket",
  "websocket_channel": "ai_response:550e8400-e29b-41d4-a716-446655440000"
}

WebSocket flow

  1. Connect to the returned websocket_url, authenticating with your ws_token
  2. Join the channel returned in the websocket_channel field
  3. Listen for response events containing the final payload

Tip: Use the official Client SDK to handle WebSocket connections, reconnection, and progress tracking automatically.
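
The sketch below shows that flow with the Python websockets package. The auth query string and the join/response message shapes (phx_join, the response event) are assumptions inferred from the channel naming above, not a documented protocol, which is another reason to prefer the SDK.

PYTHON
import asyncio
import json

import websockets  # pip install websockets

async def wait_for_result(websocket_url: str, channel: str, ws_token: str) -> dict:
    # Assumed auth query parameter and message shapes; the real protocol may differ.
    async with websockets.connect(f"{websocket_url}?token={ws_token}") as ws:
        # Step 2: join the channel returned in websocket_channel
        await ws.send(json.dumps(
            {"topic": channel, "event": "phx_join", "payload": {}, "ref": "1"}
        ))
        # Step 3: wait for the response event carrying the final payload
        while True:
            message = json.loads(await ws.recv())
            if message.get("event") == "response":
                return message["payload"]

# job = call_ai({...}, async_mode=True)
# result = asyncio.run(wait_for_result(job["websocket_url"], job["websocket_channel"], ws_token))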


Request fields

Field                       Description
workflow                    Name of a saved workflow. Overrides provider and model
provider                    Provider name (required when not using a workflow)
model                       Model name (required when not using a workflow)
messages                    Chat-style payload: array of {role, content} objects
temperature                 Sampling temperature (0–2), passed to the provider
max_tokens                  Maximum tokens in the response
top_p                       Nucleus sampling parameter
stream                      Set to true for SSE streaming (see Streaming)
format                      "raw" (default) or "wrapped" (see Response formats)
response_format             JSON schema for structured output (or use the workflow's schema)
structured_output_schema    Pass a JSON schema directly without creating a workflow
tools                       Array of tool definitions for function calling
tool_choice                 How the model should use tools (auto, none, etc.)
inputs                      Free-form fields your workflow can access
metadata                    Free-form metadata (echoed via cache fields)
context                     Additional context passed to the provider
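
As an example, a request that skips workflows and sets the provider, model, and a structured output schema directly could look like the following. The provider and model values mirror the response example below; the schema itself is illustrative.

PYTHON
import requests

# Hypothetical direct request (no workflow); adjust the values for your account.
payload = {
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarise this week's launch."}],
    "temperature": 0.2,
    "max_tokens": 400,
    "format": "wrapped",
    "structured_output_schema": {
        "type": "object",
        "properties": {"summary": {"type": "string"}},
        "required": ["summary"],
    },
    "metadata": {"audience": "enterprise customers"},
}

response = requests.post(
    "https://api.modelriver.com/v1/ai",
    headers={"Authorization": "Bearer mr_live_your_key"},
    json=payload,
)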

Response payload

Synchronous response (wrapped format)

JSON
{
  "data": { "summary": "..." },
  "customer_data": { "metadata.audience": "enterprise customers" },
  "meta": {
    "status": "success",
    "http_status": 200,
    "workflow": "marketing-summary",
    "requested_provider": "openai",
    "requested_model": "gpt-4o-mini",
    "used_provider": "openai",
    "used_model": "gpt-4o-mini",
    "duration_ms": 1420,
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 45,
      "total_tokens": 168
    },
    "structured_output": true,
    "attempts": [
      {"provider": "openai", "model": "gpt-4o-mini", "status": "success"}
    ]
  },
  "backups": [
    {"position": 1, "provider": "anthropic", "model": "claude-3-5-sonnet"}
  ]
}

Response fields explained

Field                Description
data                 The AI-generated response content
customer_data        Cached fields echoed from your request metadata
meta.status          "success" or "error"
meta.workflow        The workflow that processed this request
meta.used_provider   The provider that actually served the response
meta.duration_ms     End-to-end request duration in milliseconds
meta.usage           Token consumption breakdown
meta.attempts        Array of all provider attempts (including failovers)
backups              Configured fallback providers for this workflow
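
In practice, meta is what you typically log and monitor. A short sketch of reading the fields above from a wrapped response:

PYTHON
body = response.json()

print(body["data"])  # the AI-generated content

meta = body["meta"]
print(meta["used_provider"], meta["used_model"])
print(meta["duration_ms"], "ms,", meta["usage"]["total_tokens"], "tokens")

# Every provider attempt, including failovers
for attempt in meta["attempts"]:
    print(attempt["provider"], attempt["model"], attempt["status"])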

Error responses

ModelRiver returns 200 with an error object when the request was valid but the provider failed. Transport/authentication problems return standard HTTP status codes (401, 403, 429, 5xx).

JSON
{
  "data": null,
  "customer_data": {},
  "error": {
    "message": "Provider request failed",
    "details": {"status": 504, "message": "Upstream timeout"}
  },
  "meta": {
    "status": "error",
    "http_status": 502,
    "workflow": "marketing-summary",
    "attempts": [
      {"provider": "openai", "model": "gpt-4o-mini", "status": "error", "reason": "timeout"}
    ]
  }
}

For complete error handling details, see Error handling.
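
A minimal way to branch on both failure modes described above (field names follow the example response; the retry policy is yours to choose):

PYTHON
if response.status_code != 200:
    # Transport or authentication problem (401, 403, 429, 5xx)
    raise RuntimeError(f"HTTP {response.status_code}: {response.text}")

body = response.json()
if body["meta"]["status"] == "error":
    # Valid request, but the provider failed
    print("Provider failure:", body["error"]["message"], body["error"].get("details"))
else:
    print(body["data"])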


Tips for production use

  • Use workflows so you can change providers and prompts without redeploying your applications
  • Use cache fields to echo request metadata back in responses, which is especially helpful for tracing user IDs or experiment variants
  • Inspect meta.attempts (alongside the configured backups list) if you need to know which fallback actually served the response
  • Respect rate limits: if you see 429, implement exponential backoff (see the sketch after this list) or reach out to have your limits raised
  • Store responses if you need historical context: ModelRiver retains Request Logs, but you can export them or stream them elsewhere
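
A simple exponential backoff loop for 429s, reusing the session helper sketched under Base configuration (delays and retry counts are illustrative):

PYTHON
import time

def post_with_backoff(payload: dict, max_retries: int = 5) -> dict:
    delay = 1.0
    for _ in range(max_retries):
        response = session.post(f"{BASE_URL}/v1/ai", json=payload)
        if response.status_code != 429:
            return response.json()
        time.sleep(delay)  # wait before retrying
        delay *= 2         # double the wait each time
    raise RuntimeError("Still rate limited after retries")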

Next steps