Response Formats
ModelRiver supports two response formats to accommodate different use cases. By default, responses are returned in raw format for maximum compatibility with existing OpenAI/Anthropic clients.
Response Format Options
1. Raw Format (Default)
Returns the exact provider response, maintaining full compatibility with OpenAI/Anthropic SDKs.
Request:
{
"workflow": "my-chat-workflow",
"messages": [
{"role": "user", "content": "Hello!"}
]
}
Response:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
}
}
Metadata in headers:
- `X-ModelRiver-Provider`: The provider used (e.g., `openai`, `anthropic`)
- `X-ModelRiver-Model`: The model used (e.g., `gpt-4o`)
- `X-ModelRiver-Workflow`: The workflow name
- `X-ModelRiver-Duration-Ms`: Request duration in milliseconds
- `X-ModelRiver-Attempts`: Number of provider attempts (includes fallbacks)
- `X-ModelRiver-Cached-Data`: Cached fields (JSON string)
- `X-ModelRiver-Offline-Fallback`: Set to `"true"` if offline fallback was used
2. Wrapped Format
Returns the provider response wrapped with additional ModelRiver metadata, logs, and cached data.
Request:
{
"workflow": "my-chat-workflow",
"format": "wrapped",
"messages": [
{"role": "user", "content": "Hello!"}
]
}
Response:
{
"data": {
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
}
},
"customer_data": {
"user_id": "550e8400-e29b-41d4-a716-446655440000"
},
"meta": {
"status": "success",
"http_status": 200,
"workflow": "my-chat-workflow",
"requested_provider": "openai",
"requested_model": "gpt-4o",
"used_provider": "openai",
"used_model": "gpt-4o",
"duration_ms": 1250,
"usage": {
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
},
"structured_output": false,
"attempts": [
{
"provider": "openai",
"model": "gpt-4o",
"status": "success"
}
],
"customer_fields": ["user_id"]
},
"logs": {
"columns": [...],
"rows": [...]
},
"backups": [
{
"position": 1,
"provider": "anthropic",
"model": "claude-3-5-sonnet"
}
]
}
Choosing the Right Format
Use Raw Format when:
- ✅ Drop-in replacement: You want to replace OpenAI/Anthropic API calls without changing client code
- ✅ SDK compatibility: You're using existing OpenAI/Anthropic SDKs
- ✅ Minimal overhead: You want the fastest response with minimal data transfer
- ✅ Standard integration: Your application expects the standard OpenAI/Anthropic response format
Example use cases:
- Migrating from OpenAI to ModelRiver
- Using with LangChain, LlamaIndex, or other LLM frameworks
- Mobile apps with bandwidth constraints
- High-throughput production systems
Use Wrapped Format when:
- ✅ Debugging: You need detailed request logs and metadata
- ✅ Monitoring: You want to track provider fallbacks and attempts
- ✅ Analytics: You need cached data for analysis
- ✅ Internal tools: You're building dashboards or admin interfaces
- ✅ Audit trails: Compliance requirements need detailed logging
Example use cases:
- Development and testing
- Admin dashboards
- Analytics pipelines
- Debugging production issues
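For monitoring and debugging, the `meta.attempts` array in the wrapped response records every provider tried, which makes fallback detection straightforward. A minimal sketch, using only the `meta` fields shown in the wrapped response above:

```javascript
// Detect whether a wrapped ModelRiver response was served by a fallback
// provider. Assumes the documented meta shape: requested_provider,
// used_provider/used_model, and one attempts entry per provider tried.
function describeFallback(wrapped) {
  const { meta } = wrapped;
  const fellBack =
    meta.used_provider !== meta.requested_provider || meta.attempts.length > 1;
  return {
    fellBack,
    attempts: meta.attempts.length,
    servedBy: `${meta.used_provider}/${meta.used_model}`,
  };
}

// Example: a request that failed over from OpenAI to Anthropic
const info = describeFallback({
  meta: {
    requested_provider: "openai",
    used_provider: "anthropic",
    used_model: "claude-3-5-sonnet",
    attempts: [
      { provider: "openai", model: "gpt-4o", status: "error" },
      { provider: "anthropic", model: "claude-3-5-sonnet", status: "success" },
    ],
  },
});
console.log(info.fellBack, info.attempts); // true 2
```

This kind of check is only possible in wrapped format; with raw format the equivalent signal is the `X-ModelRiver-Attempts` header.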
Usage Examples
JavaScript (Raw Format - Default)
const response = await fetch('https://api.modelriver.com/api/v1/ai', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
workflow: 'my-chat-workflow',
messages: [
{ role: 'user', content: 'Hello!' }
]
})
});
const data = await response.json();
// data is exactly like OpenAI's response
console.log(data.choices[0].message.content);
// Check headers for metadata
const provider = response.headers.get('X-ModelRiver-Provider');
const duration = response.headers.get('X-ModelRiver-Duration-Ms');
console.log(`Completed in ${duration}ms using ${provider}`);
Python (Raw Format with OpenAI SDK)
from openai import OpenAI

# Point the OpenAI SDK (v1+) at ModelRiver
client = OpenAI(
    base_url="https://api.modelriver.com/api/v1",
    api_key="YOUR_API_KEY",
)

# Use exactly like OpenAI
response = client.chat.completions.create(
    model="workflow:my-chat-workflow",  # Prefix with "workflow:"
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)
cURL (Wrapped Format)
curl -X POST https://api.modelriver.com/api/v1/ai \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"workflow": "my-chat-workflow",
"format": "wrapped",
"messages": [
{"role": "user", "content": "Hello!"}
]
}'
Response:
{
"data": { /* OpenAI response */ },
"meta": { /* ModelRiver metadata */ },
"customer_data": { /* Cached fields */ },
"logs": { /* Request logs */ },
"backups": [ /* Configured fallbacks */ ]
}
TypeScript (Type Definitions)
// Raw Format (OpenAI-compatible)
interface RawResponse {
id: string;
object: string;
created: number;
model: string;
choices: Array<{
index: number;
message: {
role: string;
content: string;
};
finish_reason: string;
}>;
usage: {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
};
}
// Wrapped Format
interface WrappedResponse {
data: RawResponse;
customer_data: Record<string, any>;
meta: {
status: string;
http_status: number;
workflow: string;
requested_provider: string;
requested_model: string;
used_provider: string;
used_model: string;
duration_ms: number;
usage: {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
};
structured_output: boolean;
attempts: Array<{
provider: string;
model: string;
status: string;
reason?: any;
}>;
customer_fields: string[];
};
logs: {
columns: Array<{key: string; label: string}>;
rows: any[];
};
backups: Array<{
position: number;
provider: string;
model: string;
}>;
}
Error Handling
Errors are always returned in wrapped format, regardless of the format parameter, since there's no standard provider error format.
Error Response:
{
"data": null,
"error": {
"message": "Provider request failed",
"details": { /* error details */ },
"attempts": [ /* all attempts */ ]
},
"meta": {
"status": "error",
"http_status": 502,
"workflow": "my-chat-workflow",
"duration_ms": 2500,
...
},
"customer_data": {},
"logs": { ... },
"backups": [ ... ]
}
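Because errors always arrive wrapped, a client can branch on `meta.status` (or the presence of `error`) instead of guessing at the payload shape. A hedged sketch, relying only on the fields documented above (`unwrap` is an illustrative helper, not part of any SDK):

```javascript
// Normalize a wrapped ModelRiver response: return the raw provider data
// on success, throw on error. Uses only the documented fields:
// meta.status, meta.http_status, error.message, error.attempts, data.
function unwrap(wrapped) {
  if (wrapped.meta && wrapped.meta.status === "error") {
    const err = new Error(wrapped.error?.message ?? "ModelRiver request failed");
    err.httpStatus = wrapped.meta.http_status;
    err.attempts = wrapped.error?.attempts ?? [];
    throw err;
  }
  return wrapped.data; // the raw provider response
}

// Success case: unwrap returns the provider payload
const ok = unwrap({
  data: { id: "chatcmpl-123" },
  meta: { status: "success", http_status: 200 },
});

// Error case: unwrap throws with the HTTP status attached
let caught;
try {
  unwrap({
    data: null,
    error: { message: "Provider request failed", attempts: [] },
    meta: { status: "error", http_status: 502 },
  });
} catch (e) {
  caught = e;
}
console.log(ok.id, caught.httpStatus); // chatcmpl-123 502
```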
Migration Guide
Migrating from OpenAI
Before (OpenAI):
const response = await openai.chat.completions.create({
model: "gpt-4o",
messages: [{role: "user", content: "Hello"}]
});
After (ModelRiver - Raw Format):
const response = await fetch('https://api.modelriver.com/api/v1/ai', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_MODELRIVER_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
workflow: 'my-gpt4-workflow', // Configure in dashboard
messages: [{role: 'user', content: 'Hello'}]
})
});
const data = await response.json();
// data structure is identical to OpenAI's response
Migrating from Anthropic
Before (Anthropic):
const response = await anthropic.messages.create({
model: "claude-3-5-sonnet-20241022",
max_tokens: 1024,
messages: [{role: "user", content: "Hello"}]
});
After (ModelRiver - Raw Format):
const response = await fetch('https://api.modelriver.com/api/v1/ai', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_MODELRIVER_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
workflow: 'my-claude-workflow', // Configure in dashboard
max_tokens: 1024,
messages: [{role: 'user', content: 'Hello'}]
})
});
const data = await response.json();
// data structure is identical to Anthropic's response
Best Practices
- Default to raw format for production applications
- Use wrapped format during development and debugging
- Check response headers for metadata without payload overhead
- Cache workflow names to avoid repeated dashboard lookups
- Monitor X-ModelRiver-Attempts to detect fallback patterns
- Log X-ModelRiver-Offline-Fallback to identify connectivity issues
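The header-based practices above can be rolled into one small helper. A sketch, assuming the header names documented earlier (a plain object stands in for the fetch `Headers` interface here):

```javascript
// Summarize ModelRiver metadata headers for monitoring/alerting,
// without switching to wrapped format. Header names are those
// documented in the "Metadata in headers" list above.
function summarizeHeaders(headers) {
  const attempts = Number(headers["X-ModelRiver-Attempts"] ?? "1");
  return {
    provider: headers["X-ModelRiver-Provider"],
    durationMs: Number(headers["X-ModelRiver-Duration-Ms"]),
    usedFallback: attempts > 1, // more than one attempt means a fallback fired
    offlineFallback: headers["X-ModelRiver-Offline-Fallback"] === "true",
  };
}

const summary = summarizeHeaders({
  "X-ModelRiver-Provider": "anthropic",
  "X-ModelRiver-Duration-Ms": "1250",
  "X-ModelRiver-Attempts": "2",
});
console.log(summary.usedFallback); // true
```

With a real `fetch` response, pass `Object.fromEntries(response.headers)` or read each header via `response.headers.get(...)`.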
Summary
| Aspect | Raw Format | Wrapped Format |
|--------|------------|----------------|
| Default | ✅ Yes | ❌ No |
| Provider compatibility | ✅ Full | ❌ Custom |
| Metadata location | Headers | Response body |
| Response size | Smaller | Larger |
| Use case | Production | Debugging/Analytics |
| SDK support | ✅ Yes | ❌ No |
For questions or feature requests, please open an issue on GitHub.