What are Request Logs?
Request Logs are ModelRiver's comprehensive audit trail for every AI request made through your API. Each log entry captures the complete lifecycle of a request, from initial provider attempts through successful responses or failures, including webhook deliveries and backend callbacks.
Why Request Logs exist
Request Logs serve multiple critical purposes:
- Debugging production issues – Inspect exact request/response payloads to identify why a request failed or returned unexpected results
- Cost analysis and optimization – Track token usage and pricing per request to understand spending patterns and optimize model selection
- Performance monitoring – Monitor request latency, identify slow providers, and track performance trends over time
- Provider reliability tracking – See which providers fail most often, how often fallbacks trigger, and make data-driven decisions about provider selection
- Billing reconciliation – Match API usage to invoices and verify pricing accuracy
- Compliance and audit trails – Maintain complete records of AI interactions for regulatory compliance or internal audits
- Testing and validation – Review test mode and playground requests separately from production to validate workflows before deployment
Where to find Request Logs
Navigate to Request Logs in your project console using the clipboard sheet icon in the sidebar. The logs view shows all requests for your active project, with powerful filtering options to focus on specific request types.
Request Log List View
The Request Logs list provides a high-level overview of all requests in your project, with essential information visible at a glance.
Table columns
| Column | Description | Why it matters |
|---|---|---|
| Provider | AI provider used (OpenAI, Anthropic, etc.) with provider icon | Quick visual identification of which vendor handled the request |
| Model | Specific model used (e.g., gpt-4o, claude-3-5-sonnet) | Track model usage patterns and performance differences |
| Input tokens | Prompt tokens consumed | Cost calculation (input tokens typically cost less than output tokens) |
| Output tokens | Completion tokens generated | Cost calculation and understanding response size |
| Duration | Request latency in milliseconds | Performance monitoring and SLA tracking |
| Status | Success (green) or Error (red) with visual indicator | Immediate health check—identify failing requests instantly |
| Time | Relative time (e.g., "5m ago", "2h ago") | Context for when issues occurred |
Filtering options
The filter dropdown lets you focus on specific request types, essential for separating production traffic from testing and debugging:
- All requests – Every request in the project, regardless of source
  - Use when: You want a complete overview or are searching broadly
- Live mode – Production API requests (`seed_batch` is `null`)
  - Use when: Debugging production issues, analyzing real user traffic, or monitoring live system health
  - Why: These are actual API calls from your applications, not test runs
- Test mode – Workflow test mode requests (`seed_batch` starts with `"test_mode:"`)
  - Use when: Reviewing integration tests, CI/CD pipeline runs, or development environment requests
  - Why: Test mode requests use sample data from structured outputs, so they don't consume provider credits but still create logs for validation
- Playground (Production) – Console testing with production workflows (`seed_batch` starts with `"pg:"`)
  - Use when: Reviewing ad-hoc tests you ran in the console using production workflows
  - Why: These are real provider calls made from the console, useful for validating workflow changes before deploying
- Playground (Test mode) – Console testing with test mode workflows (`seed_batch` starts with `"pg_test_mode:"`)
  - Use when: Reviewing console tests that used test mode workflows
  - Why: These don't consume credits but help validate workflow logic
- All Playground – Both playground types combined
  - Use when: Reviewing all console testing activity regardless of workflow mode
Why filtering matters: Separating production from testing prevents test noise from obscuring real issues. Filtering by seed_batch patterns ensures you're analyzing the right requests for your current task.
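If you export or post-process log data in your own tooling, the same seed_batch patterns can drive your filtering. Below is a minimal TypeScript sketch; the function and type names are illustrative rather than part of a ModelRiver SDK, and the console filters above already apply this logic for you.

```ts
// Hypothetical helper: maps a log's seed_batch value to the filter categories described above.
type LogSource =
  | "live"
  | "test_mode"
  | "playground_production"
  | "playground_test_mode"
  | "callback";

function classifySeedBatch(seedBatch: string | null): LogSource {
  if (seedBatch === null) return "live";
  if (seedBatch.startsWith("test_mode:")) return "test_mode";
  if (seedBatch.startsWith("pg_test_mode:")) return "playground_test_mode";
  if (seedBatch.startsWith("callback:") || seedBatch.startsWith("pg_callback:")) return "callback";
  if (seedBatch.startsWith("pg:")) return "playground_production";
  return "live"; // unknown values are treated as live traffic in this sketch
}
```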
Additional features
- Refresh button – Manually reload the latest logs without navigating away
  - Why: Useful when monitoring real-time issues or after making API calls
- Pagination and "See more" – Load additional pages of logs incrementally
  - Why: Efficiently browse large log volumes without loading everything at once
- Status indicators – Color-coded badges (green for success, red for error)
  - Why: Instant visual feedback on request health
- Provider icons – Visual provider logos next to provider names
  - Why: Faster recognition of which vendors are being used
- Failed models badge – Shows count of failed provider attempts before success
  - Why: Quickly identify requests that required fallbacks, indicating provider instability
- Event badge – Indicates event-driven workflow requests
  - Why: Distinguish async event-driven requests from standard sync requests
- Click to view details – Click any row to open the detailed log view
  - Why: Access complete request lifecycle information
Request Log Detail View
Click any log entry to open the detail view, which provides comprehensive information about a single request's complete lifecycle.
Accessing the detail view
- Navigate to Request Logs in your project
- Click any row in the logs table
- The detail view opens, showing the timeline and detailed information
Timeline view (always visible)
The timeline is the centerpiece of the detail view, providing a visual representation of the complete request lifecycle. It's always visible at the top of the detail page, even when viewing specific timeline item details.
Why the timeline exists: The timeline shows the complete story of a request—not just the final result, but every attempt, webhook delivery, and callback. This is essential for understanding:
- Why a request took longer than expected (multiple provider attempts)
- Whether webhooks were delivered successfully
- If backend callbacks completed for event-driven workflows
- The complete failover chain when providers fail
Timeline components
The timeline displays items in chronological order, showing the complete request journey:
1. Failover Attempts
Failed provider/model attempts that occurred before the successful request.
- Visual indicator: Amber/yellow status badge with provider icon
- Shows: Provider name, model name, duration, timestamp, and failure reason
- Why it exists: When ModelRiver's fallback system kicks in, you see exactly which providers failed and why. This helps you:
  - Identify unreliable providers
  - Understand why fallbacks triggered
  - Debug provider-specific issues
  - Make informed decisions about provider selection
- Click to view: Request body, response body, and error details for the failed attempt
2. Main Request
The successful or final request that returned the response to your application.
- Visual indicator: Green status badge (success) or red status badge (error) with provider icon
- Shows: Provider name, model name, duration, token usage, timestamp
- Why it exists: This is the request that actually served your application. It shows:
  - Which provider/model ultimately handled the request
  - Performance metrics (duration, tokens)
  - The actual request/response payloads
- Click to view: Complete request body, response body, and metadata
3. Webhook Deliveries
Webhook notifications sent for async requests or event-driven workflows.
- Visual indicator: Status badges showing delivery state:
  - Planned (gray) – Webhook is queued but not yet sent
  - Delivering (blue) – Webhook delivery is in progress
  - Success (green) – Webhook was delivered successfully
  - Error (red) – Webhook delivery failed
- Shows: Webhook URL, delivery time, duration, HTTP status code, error messages
- Why it exists: For async requests, webhooks are how your backend receives notifications. The timeline shows:
  - Whether webhooks were sent
  - Delivery success/failure
  - How long delivery took
  - Retry attempts (if any)
- Click to view: Webhook payload, response from your endpoint, retry button (when applicable)
4. Backend Callbacks
Callback responses from your backend for event-driven workflows.
- Visual indicator: Lightning bolt icon with "Backend Callback" label
- Shows: Callback status (Success/Timeout), timestamp, simulated indicator (for playground tests)
- Why it exists: For event-driven workflows, your backend processes the AI response and calls back to ModelRiver. The timeline shows:
  - Whether your callback was received
  - If the callback completed successfully or timed out
  - When the callback occurred
- Click to view: Callback payload (webhook payload sent to your backend), callback response (data your backend sent back)
Why the timeline order matters: The chronological order tells the complete story. Failover attempts show resilience, the main request shows the final result, webhook deliveries show async notifications, and callbacks show backend processing. This sequence helps you understand the full request lifecycle.
Detail View Features
When you click a timeline item, the detail view expands to show comprehensive information about that specific component.
Provider Request Details
When viewing a provider request (main request or failover attempt), you see:
Header information
- Provider icon and name – Visual identification of the AI provider
- Model name – Specific model used (e.g., `gpt-4o-mini`, `claude-3-5-sonnet-20241022`)
- Status – Success, Failed, or Error with color-coded badge
- Duration – Request latency in milliseconds
- Timestamp – When the request occurred (relative time, e.g., "5m ago")
Request Body tab
The Request Body tab shows exactly what was sent to the AI provider.
- Raw JSON view – Complete JSON payload in a formatted, syntax-highlighted code editor
  - Why: See the exact structure and content sent to the provider
  - Use when: Debugging why a request failed, verifying prompt content, or understanding request formatting
- Preview (tree view) – Interactive JSON tree viewer for easier navigation
  - Why: Explore large payloads more easily, collapse/expand sections, and focus on specific fields
  - Use when: Inspecting complex nested structures or large message arrays
- Copy functionality – One-click copy of the entire request body
  - Why: Quickly share request details with team members or use in API testing tools
Why request body inspection matters: Understanding what was sent helps you:
- Debug prompt engineering issues
- Verify workflow configuration is correct
- Identify data quality problems
- Reproduce issues in testing environments
Response Body tab
The Response Body tab shows the complete response from the AI provider.
- Raw JSON view – Full provider response in a formatted code editor
  - Why: See exactly what the provider returned, including all metadata
  - Use when: Analyzing response quality, debugging parsing issues, or verifying structured output compliance
- Preview (tree view) – Interactive JSON tree for response exploration
  - Why: Navigate large responses, focus on specific fields, and understand response structure
  - Use when: Inspecting complex structured outputs or large text completions
- Copy functionality – Copy the entire response for analysis or sharing
Why response body inspection matters: The response body shows:
- The actual AI-generated content
- Token usage breakdown
- Provider-specific metadata
- Structured output compliance
- Error messages (if the request failed)
Webhook Delivery Details
When viewing a webhook delivery, you see comprehensive delivery information:
Status information
- Status badges:
  - Planned – Webhook is queued, waiting to be sent
  - Delivering – Webhook HTTP request is in progress
  - Success – Webhook was delivered successfully (2xx response)
  - Error – Webhook delivery failed (non-2xx response, timeout, or network error)
- Callback status (for event-driven workflows) – Shows if your backend callback was received:
  - Progress – Callback is expected but not yet received
  - Success – Callback was received successfully
  - Error – Callback indicated an error or wasn't received within timeout
- Retry button (when applicable) – Manually retry a failed webhook delivery
  - Available when: `can_retry` is `true` and the delivery failed
  - Limitations: See Webhook Retry Logic for detailed rules
Delivery metadata
- Webhook URL – The endpoint that received (or should receive) the webhook
- Delivery time – When the webhook was sent (relative time)
- Duration – How long the HTTP request took (if available)
- HTTP status code – Response code from your endpoint (if available)
- Error message – Detailed error if delivery failed
Request Data tab (webhook payload)
Shows the exact payload sent to your webhook endpoint.
- Raw JSON view – Complete webhook payload
- Preview (tree view) – Interactive JSON tree
- Copy functionality – Copy payload for testing or debugging
Why webhook payload inspection matters: Understanding the payload helps you:
- Verify your webhook handler receives expected data
- Debug webhook processing issues
- Test webhook endpoints with real payloads (see the replay sketch below)
- Understand event-driven workflow data structure
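For example, after copying a payload from this tab you can replay it against a local handler to reproduce a processing issue. A minimal sketch (TypeScript on Node 18+; the endpoint URL is a placeholder and the payload is whatever you copied):

```ts
// Replay a webhook payload copied from the Request Data tab against a local endpoint.
// The URL and payload below are placeholders; paste the real copied JSON in their place.
const copiedPayload = {
  /* paste the JSON copied from the Request Data tab here */
};

async function replayWebhook(): Promise<void> {
  const res = await fetch("http://localhost:3000/webhooks/modelriver", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(copiedPayload),
  });
  console.log("handler responded with", res.status, await res.text());
}

replayWebhook().catch(console.error);
```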
Response tab (webhook response)
Shows the response from your webhook endpoint.
- Raw view – Complete HTTP response body
- Preview (tree view) – If response is JSON
- Copy functionality – Copy response for analysis
Why webhook response inspection matters: Your endpoint's response shows:
- Whether your handler processed the webhook correctly
- Any errors or validation issues
- Response timing (affects delivery duration)
Backend Callback Details
For event-driven workflows, callback details show:
- Callback payload – The webhook payload your backend received (same as Request Data tab for the webhook)
- Callback response – The data your backend sent back to ModelRiver via the callback URL
- Status – Success (callback received) or Timeout (callback not received within 5 minutes)
- Simulated indicator – For playground tests, shows that the callback was simulated
Why callback details matter: Understanding callbacks helps you:
- Verify your backend processed the AI response correctly
- Debug callback failures or timeouts
- Understand the data flow in event-driven workflows
Key Fields Explained
Request Logs capture extensive metadata about each request. Understanding these fields helps you make the most of the logging system.
Provider & Model
What it is: The AI provider (OpenAI, Anthropic, Google, etc.) and specific model (e.g., gpt-4o, claude-3-5-sonnet-20241022) used for the request.
Why it exists:
- Cost attribution – Different models have different pricing; track which models drive costs
- Performance comparison – Compare latency and quality across providers/models
- Usage patterns – Understand which providers/models your application uses most
- Provider reliability – Identify which providers fail most often
How to use it: Filter logs by provider to analyze vendor-specific issues, or compare model performance by reviewing duration and success rates.
Token Usage
What it is: Three metrics tracking AI consumption:
- Prompt tokens (input tokens) – Tokens in the request sent to the provider
- Completion tokens (output tokens) – Tokens in the response generated by the provider
- Total tokens – Sum of prompt and completion tokens
Why it exists:
- Cost calculation – Most providers charge per token; input and output tokens often have different rates
- Usage monitoring – Track token consumption trends to predict costs
- Optimization – Identify requests with unexpectedly high token usage
- Billing reconciliation – Match token counts to provider invoices
How to use it: Review token usage to identify expensive requests, optimize prompts to reduce input tokens, or set usage alerts based on token thresholds.
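As a rough illustration of how token counts turn into spend, multiply prompt and completion tokens by your model's per-token rates. The rates in this sketch are placeholders, not actual provider or ModelRiver pricing:

```ts
// Illustrative cost estimate from token usage. Rates are placeholders (USD per 1M tokens);
// substitute the actual pricing for your provider/model.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

function estimateCostUsd(
  usage: TokenUsage,
  inputRatePerMillion: number,
  outputRatePerMillion: number,
): number {
  const inputCost = (usage.promptTokens / 1_000_000) * inputRatePerMillion;
  const outputCost = (usage.completionTokens / 1_000_000) * outputRatePerMillion;
  return inputCost + outputCost;
}

// Example: 1,200 prompt tokens and 350 completion tokens at placeholder rates.
console.log(estimateCostUsd({ promptTokens: 1200, completionTokens: 350 }, 2.5, 10));
```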
Duration
What it is: Request latency in milliseconds, measured from when the request is sent to the provider until the response is received.
Why it exists:
- Performance monitoring – Track request speed and identify slow requests
- SLA tracking – Ensure requests complete within acceptable timeframes
- Provider comparison – Compare latency across different providers/models
- Optimization – Identify performance bottlenecks and optimize workflows
How to use it: Monitor duration trends to spot performance degradation, compare providers to choose faster options, or set alerts for requests exceeding thresholds.
Status
What it is: The final state of the request: success (completed successfully) or error (failed).
Why it exists:
- Health monitoring – Quickly identify failing requests
- Error tracking – Understand failure rates and patterns
- Alerting – Trigger alerts when error rates exceed thresholds
- Debugging – Filter to errors to focus on issues
How to use it: Use status filters to focus on errors, monitor success rates over time, or set up alerts for error spikes.
Seed Batch
What it is: An identifier that categorizes requests by source. Values include:
- `null` – Live production requests from your API
- `"test_mode:{channel_id}"` – Test mode workflow requests
- `"pg:{channel_id}"` – Playground requests with production workflows
- `"pg_test_mode:{channel_id}"` – Playground requests with test mode workflows
- `"callback:{channel_id}"` or `"pg_callback:{channel_id}"` – Backend callback logs
Why it exists:
- Environment separation – Distinguish production from testing traffic
- Filtering – Enable powerful filtering to focus on specific request types
- Analytics – Analyze production vs. test usage separately
- Debugging – Isolate issues to specific environments
How to use it: Use seed batch patterns to filter logs (see Filtering & Organization), or analyze test vs. production usage patterns separately.
Primary Request ID
What it is: A UUID linking failed provider attempts to the successful request that ultimately served the response.
Why it exists:
- Failover tracking – See the complete chain of provider attempts for a single logical request
- Resilience analysis – Understand how often fallbacks trigger and which providers fail
- Debugging – Trace why a request required multiple attempts
- Cost attribution – Understand the true cost of a request including failed attempts
How to use it: When viewing a failed model attempt, the primary request ID links to the successful request. This helps you understand the complete failover chain and identify unreliable providers.
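If you analyze exported log entries in your own tooling, the failover chain can be reconstructed by grouping failed attempts under their primary request. A sketch with assumed field names (`id`, `primary_req_id`, `status`, `provider`), which may not match the exact log schema:

```ts
// Group failed provider attempts under the successful request they fell back to.
// Field names are assumptions for illustration, not a documented export format.
interface LogEntry {
  id: string;
  primary_req_id: string | null;
  status: "success" | "failed" | "error";
  provider: string;
}

function buildFailoverChains(entries: LogEntry[]): Map<string, LogEntry[]> {
  const chains = new Map<string, LogEntry[]>();
  for (const entry of entries) {
    if (entry.primary_req_id) {
      const chain = chains.get(entry.primary_req_id) ?? [];
      chain.push(entry);
      chains.set(entry.primary_req_id, chain);
    }
  }
  return chains; // key: successful request ID, value: the failed attempts that preceded it
}
```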
Event Name
What it is: A custom identifier for event-driven workflows. Set when creating a workflow with an event name.
Why it exists:
- Event-driven workflow identification – Distinguish event-driven requests from standard requests
- Filtering – Filter logs to specific event types
- Workflow tracking – Track usage of specific event-driven workflows
- Debugging – Isolate issues to specific event types
How to use it: Filter logs by event name to analyze specific event-driven workflows, or use event names to track workflow usage patterns.
Channel ID
What it is: A unique identifier for async requests, linking the initial request, webhook deliveries, and callbacks together.
Why it exists:
- Async request tracking – Link all components of an async request lifecycle
- Webhook correlation – Match webhook deliveries to their originating requests
- Callback correlation – Link backend callbacks to their webhook deliveries
- Debugging – Trace the complete async request flow
How to use it: Use channel ID to find all related logs for an async request, including webhook deliveries and callbacks. This is essential for debugging event-driven workflows.
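In your own tooling, correlating an async request amounts to collecting every entry that shares the channel ID. A tiny sketch, assuming a `channel_id` field on exported entries:

```ts
// Collect every log entry that shares a channel_id to see the full async lifecycle
// (request, webhook deliveries, callbacks). The field name is an assumption for the sketch.
function relatedLogs<T extends { channel_id?: string | null }>(
  entries: T[],
  channelId: string,
): T[] {
  return entries.filter((entry) => entry.channel_id === channelId);
}
```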
Is Async
What it is: A boolean flag indicating whether the request was synchronous (false) or asynchronous (true).
Why it exists:
- Request type distinction – Understand which requests are sync vs. async
- Logging pattern differences – Sync and async requests may have different logging behaviors
- Analytics – Analyze sync vs. async usage patterns
- Debugging – Understand request flow based on type
How to use it: Filter or analyze logs based on sync vs. async to understand usage patterns or debug type-specific issues.
Price
What it is: The cost per request in your account currency, calculated based on token usage and provider pricing.
Why it exists:
- Cost tracking – Understand the actual cost of each request
- Billing reconciliation – Match logs to invoices
- Cost optimization – Identify expensive requests and optimize model selection
- Budget management – Track spending and set cost alerts
How to use it: Review price fields to identify expensive requests, compare costs across providers/models, or calculate total spending from logs.
Model ID
What it is: A UUID reference to the Model record in your account, linking the request to model metadata and pricing information.
Why it exists:
- Model metadata linking – Connect requests to model definitions and pricing
- Analytics – Analyze usage by model record
- Pricing accuracy – Verify pricing calculations against model definitions
How to use it: Use model ID to link requests to model records for detailed analytics or pricing verification.
Webhook Deliveries
For async requests and event-driven workflows, Request Logs track webhook delivery attempts, providing complete visibility into notification delivery.
Purpose
Webhook deliveries show how ModelRiver notified your backend about completed async requests. This is essential for:
- Debugging delivery issues – Understand why webhooks failed to reach your endpoint
- Monitoring delivery reliability – Track success rates and identify problematic endpoints
- Retry management – Manually retry failed deliveries when appropriate
- Audit trails – Maintain records of all delivery attempts for compliance
Status Lifecycle
Webhook deliveries progress through these states:
- Planned – Webhook is queued and scheduled for delivery
  - Why: Indicates the webhook is ready but not yet sent
  - Visual: Gray status badge
- Delivering – HTTP request to your endpoint is in progress
  - Why: Shows active delivery attempt
  - Visual: Blue status badge
- Success – Your endpoint returned a 2xx status code
  - Why: Confirms successful delivery
  - Visual: Green status badge
- Error – Delivery failed (non-2xx response, timeout, or network error)
  - Why: Indicates delivery problem requiring attention
  - Visual: Red status badge
Why status tracking matters: Understanding delivery status helps you:
- Identify endpoints that consistently fail
- Monitor delivery reliability
- Debug webhook processing issues (see the handler sketch below)
- Ensure critical notifications are received
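Keeping deliveries in the Success state mostly means accepting the POST and returning a 2xx quickly, then doing heavy work asynchronously. A minimal handler sketch (TypeScript with Express, which is an assumption about your stack; the route path is a placeholder):

```ts
import express from "express";

const app = express();
app.use(express.json());

// Acknowledge ModelRiver webhook deliveries with a 2xx as quickly as possible.
// Do heavy processing after responding so slow work doesn't turn deliveries into errors.
app.post("/webhooks/modelriver", (req, res) => {
  res.status(200).json({ received: true }); // fast 2xx => delivery shows as Success
  processWebhook(req.body).catch((err) => console.error("webhook processing failed", err));
});

async function processWebhook(payload: unknown): Promise<void> {
  // Your business logic here (store the result, notify users, etc.).
  console.log("received webhook payload", payload);
}

app.listen(3000, () => console.log("webhook handler listening on :3000"));
```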
Retry Functionality
ModelRiver automatically retries failed webhook deliveries, but you can also manually retry from the Request Logs detail view.
When retry is available
The retry button appears when:
- `can_retry` is `true` (calculated by the backend)
- The delivery failed (`success` is `false`)
- At least 30 seconds have passed since the delivery was sent
Retry limits
- Maximum 3 attempts total – Original delivery plus 2 retries
- 5-minute window – Retries are only allowed within 5 minutes of the first delivery attempt
- 30-second minimum delay – You must wait at least 30 seconds after a delivery before retrying it
- No duplicate retries – Once a delivery has been retried, it cannot be retried again
Retry rules explained
Why 3 attempts maximum: Prevents infinite retry loops and ensures failed deliveries don't consume excessive resources. Three attempts provide reasonable coverage for transient failures.
Why 5-minute window: Webhook deliveries are time-sensitive. Retrying after 5 minutes may deliver stale data to your backend. The window ensures retries happen while the data is still relevant.
Why 30-second minimum delay: Prevents rapid retry storms that could overwhelm your endpoint. The delay gives your endpoint time to recover from transient issues.
Why no duplicate retries: Once a delivery has been retried, a new delivery record is created. Retrying the original would create confusion and duplicate notifications.
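The backend-calculated `can_retry` flag is authoritative, but as an illustration the rules above amount to roughly the following check (TypeScript; the field names are assumptions made for the sketch):

```ts
// Illustrative restatement of the retry rules above; the backend's can_retry flag is what counts.
interface DeliveryInfo {
  success: boolean;
  attemptCount: number; // original delivery plus any retries
  firstAttemptAt: Date;
  lastAttemptAt: Date;
  alreadyRetried: boolean;
}

function retrySeemsAllowed(d: DeliveryInfo, now: Date = new Date()): boolean {
  const FIVE_MINUTES = 5 * 60 * 1000;
  const THIRTY_SECONDS = 30 * 1000;
  return (
    !d.success && // only failed deliveries
    !d.alreadyRetried && // no duplicate retries
    d.attemptCount < 3 && // maximum 3 attempts total
    now.getTime() - d.firstAttemptAt.getTime() <= FIVE_MINUTES && // 5-minute window
    now.getTime() - d.lastAttemptAt.getTime() >= THIRTY_SECONDS // 30-second minimum delay
  );
}
```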
How to retry
- Navigate to the Request Log detail view
- Click the webhook delivery in the timeline
- If retry is available, click the Retry button
- A new delivery attempt is created and queued
When to retry manually: Use manual retry when:
- Automatic retries have been exhausted
- You've fixed an issue with your endpoint
- You want to test webhook delivery immediately
- You're within the retry window but automatic retries haven't triggered yet
Callback Status
For event-driven workflows, webhook deliveries also show callback status, indicating whether your backend called back to ModelRiver.
Callback status values:
- Progress – Callback is expected but not yet received
- Success – Callback was received successfully
- Error – Callback indicated an error or wasn't received within the 5-minute timeout
Why callback status matters: Event-driven workflows require your backend to call back to ModelRiver with processed data. The callback status shows:
- Whether your backend processed the AI response
- If the callback completed successfully
- Whether the callback timed out
Use Cases & Scenarios
Request Logs support a wide range of use cases. Here are common scenarios and how to use logs effectively.
Debugging Production Issues
Scenario: A user reports that an AI feature isn't working correctly.
Steps:
- Filter logs to Live mode to focus on production requests
- Search for requests around the time the issue occurred
- Click the problematic request to open the detail view
- Review the timeline to see:
- Did provider fallbacks trigger? (indicates provider issues)
- Were there webhook delivery failures? (indicates backend notification issues)
- Did callbacks complete? (for event-driven workflows)
- Inspect the request body to verify the prompt/content sent
- Inspect the response body to see what the AI returned
- Check error messages in failed model attempts
Why this works: The complete request lifecycle is visible, making it easy to identify where in the flow the issue occurred.
Testing Workflows
Scenario: You want to test a workflow change without affecting production logs.
Steps:
- Use the Playground to test your workflow
- Filter logs to Playground (Production) or Playground (Test mode) to review your tests
- Inspect request/response bodies to verify the workflow behaves correctly
- Once validated, deploy to production
Why this works: Playground requests are tagged with seed_batch starting with "pg:" or "pg_test_mode:", keeping them separate from production traffic.
Cost Analysis
Scenario: You want to understand your AI spending and optimize costs.
Steps:
- Filter to Live mode to focus on production costs
- Review the Price column (if available) to see cost per request
- Review Token usage to identify high-consumption requests
- Compare costs across providers/models to find cost-effective options
- Identify expensive requests and optimize prompts or model selection
Why this works: Token usage and pricing data are captured for every request, enabling detailed cost analysis.
Performance Monitoring
Scenario: You want to monitor request latency and identify slow requests.
Steps:
- Review the Duration column to see request latency
- Identify requests with unusually long durations
- Click slow requests to see:
- Did fallbacks cause delays? (multiple provider attempts)
- Which provider/model was slow?
- Was there a webhook delivery delay?
- Compare duration across providers to choose faster options
- Set up alerts for requests exceeding duration thresholds
Why this works: Duration is captured for every request, and the timeline shows the complete request flow, helping identify performance bottlenecks.
Troubleshooting Failures
Scenario: Requests are failing and you need to understand why.
Steps:
- Filter logs to show only Error status requests
- Click a failed request to open the detail view
- Review the timeline to see:
- Which providers failed and why (check failed model attempts)
- Whether webhook deliveries failed
- If callbacks timed out (for event-driven workflows)
- Inspect error messages in failed attempts
- Review request/response bodies to identify data quality issues
- Check if failures are provider-specific or widespread
Why this works: Complete error information is captured, including provider failure reasons, webhook errors, and callback timeouts.
Provider Reliability Analysis
Scenario: You want to understand which providers are most reliable.
Steps:
- Review logs over a time period (e.g., last 7 days)
- Count failed model attempts by provider
- Identify providers that frequently trigger fallbacks
- Compare success rates across providers
- Use this data to adjust workflow provider selection
Why this works: Failed model attempts are linked to primary requests via primary_req_id, enabling provider-specific failure analysis.
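In your own tooling, this analysis reduces to counting failed attempts per provider over the window you care about. A sketch using the same illustrative fields as the earlier examples:

```ts
// Count failed provider attempts per provider to spot unreliable vendors.
// Field names (provider, status) are assumptions for illustration.
function failureCountsByProvider(
  entries: { provider: string; status: string }[],
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    if (entry.status === "failed" || entry.status === "error") {
      counts[entry.provider] = (counts[entry.provider] ?? 0) + 1;
    }
  }
  return counts;
}
```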
Webhook Delivery Monitoring
Scenario: You want to ensure webhooks are being delivered reliably.
Steps:
- Filter to async requests (look for webhook deliveries in timeline)
- Review webhook delivery status badges
- Identify failed deliveries and inspect error messages
- Check delivery duration to identify slow endpoints
- Use retry functionality for transient failures
- Monitor callback status for event-driven workflows
Why this works: Every webhook delivery attempt is logged with status, timing, and error information, providing complete delivery visibility.
Best Practices
Follow these best practices to get the most value from Request Logs.
Regular log review
- Review logs daily for anomalies, errors, or performance issues
- Set up alerts for error rate spikes or unusual patterns
- Monitor webhook delivery success rates to catch endpoint issues early
Why: Proactive monitoring catches issues before they impact users.
Use filters effectively
- Separate production from testing – Use filters to focus on relevant request types
- Filter by time period – Focus on recent requests when debugging current issues
- Filter by status – Focus on errors when troubleshooting failures
Why: Filters reduce noise and help you focus on relevant requests.
Monitor failed model frequency
- Track fallback frequency – High fallback rates indicate provider instability
- Identify problematic providers – Use failed model data to adjust provider selection
- Set alerts for fallback rate thresholds
Why: Failed models indicate provider issues that may require workflow adjustments.
Track webhook delivery success rates
- Monitor delivery status – Ensure webhooks reach your endpoints
- Review delivery duration – Identify slow endpoints that may need optimization
- Retry failed deliveries when appropriate
Why: Webhook delivery failures mean your backend isn't receiving notifications, which can break async workflows.
Use seed_batch to separate environments
- Keep test and production separate – Use test mode for integration tests
- Use playground for validation – Test workflow changes in playground before deploying
- Filter appropriately – Use seed batch filters to focus on the right environment
Why: Environment separation prevents test noise from obscuring production issues.
Review token usage for cost optimization
- Identify high-consumption requests – Find requests using excessive tokens
- Compare provider costs – Use token usage and pricing to choose cost-effective options
- Optimize prompts – Reduce input tokens where possible
- Set usage alerts – Get notified when token consumption exceeds thresholds
Why: Token usage directly impacts costs; optimizing usage reduces spending.
Check duration for performance issues
- Monitor request latency – Track duration trends to spot performance degradation
- Identify slow providers – Compare duration across providers to choose faster options
- Set performance alerts – Get notified when requests exceed duration thresholds
Why: Performance directly impacts user experience; monitoring helps maintain acceptable response times.
Inspect request/response bodies for debugging
- Verify prompt content – Ensure prompts are formatted correctly
- Check response quality – Validate AI-generated content meets expectations
- Debug parsing issues – Inspect responses when structured output parsing fails
- Reproduce issues – Use request bodies to recreate issues in testing
Why: Request/response inspection provides the detail needed to debug issues effectively.
Use timeline for complete context
- View the complete request lifecycle – Understand the full story, not just the final result
- Track failover chains – See which providers failed and why
- Monitor webhook/callback flow – Understand async request completion
Why: The timeline provides complete context, essential for understanding complex request flows.
Advanced Features
Request Logs include advanced features for power users and complex workflows.
Primary Request Linking
What it is: Failed provider attempts are linked to successful requests via primary_req_id, creating a complete failover chain.
How it works:
- When a provider fails, a log entry is created with `status: "failed"`
- ModelRiver retries with a fallback provider
- The successful request is created with a unique ID
- Failed attempts are updated with `primary_req_id` pointing to the successful request
- The timeline shows all linked attempts in chronological order
Why it exists: Understanding the complete failover chain helps you:
- See which providers failed and why
- Understand request resilience (how many attempts were needed)
- Analyze provider reliability
- Calculate true request cost (including failed attempts)
How to use it: When viewing a failed model attempt, the timeline shows the complete chain. The primary request ID links all related attempts together.
Event-Driven Workflows
What it is: Workflows with an event_name enable a three-step flow: AI generation → backend processing → final response.
How it works:
- Request is made to an async endpoint with a workflow that has `event_name`
- AI generates the response
- Webhook is sent to your backend with `type: "task.ai_generated"` and `callback_url`
- Your backend processes the AI response and calls back to ModelRiver
- ModelRiver broadcasts the final response to WebSocket channels
- Request Logs show: main request, webhook delivery, and callback log
Why it exists: Event-driven workflows enable:
- Custom business logic between AI generation and response
- Tool/function calling patterns
- Multi-step workflows with approval gates
- Database enrichment before returning to users
How to use it: Filter logs by event_name to analyze specific event-driven workflows. The timeline shows the complete flow: request → webhook → callback.
Key fields:
- `event_name` – Identifies the event type
- `channel_id` – Links request, webhook, and callback together
- `callback_status` – Shows if callback was received
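The backend half of this flow receives the task.ai_generated webhook, runs its own processing, and POSTs the result to the callback_url within the timeout window. A hedged sketch (TypeScript with Express as an assumed stack; payload fields other than `type` and `callback_url`, and the callback body shape, are placeholders rather than the documented schema):

```ts
import express from "express";

const app = express();
app.use(express.json());

// Sketch of the backend half of an event-driven workflow.
// Payload fields other than `type` and `callback_url` are illustrative placeholders.
app.post("/webhooks/modelriver", async (req, res) => {
  res.status(200).json({ received: true }); // acknowledge the delivery first

  const { type, callback_url, ...rest } = req.body ?? {};
  if (type !== "task.ai_generated" || !callback_url) return;

  const processed = await enrichAiResult(rest); // your business logic

  // Call back to ModelRiver before the callback timeout; the body shape is a placeholder.
  await fetch(callback_url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(processed),
  });
});

async function enrichAiResult(payload: unknown): Promise<unknown> {
  // e.g., validate the AI output, enrich it from your database, apply approval rules
  return payload;
}

app.listen(3000, () => console.log("event-driven handler listening on :3000"));
```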
Sync vs Async Logging
What it is: Requests can be synchronous (immediate response) or asynchronous (response via WebSocket/webhook).
Differences:
- Sync requests (`is_async: false`):
  - Response returned immediately in HTTP response
  - Logged after provider response is received
  - No webhook deliveries (response is immediate)
  - Simpler logging pattern
- Async requests (`is_async: true`):
  - Response returned via WebSocket or webhook
  - Logged after provider response, but before final delivery
  - May include webhook deliveries and callbacks
  - More complex logging pattern with multiple components
Why it exists: Different request types have different logging needs. Sync requests are straightforward, while async requests require tracking webhooks and callbacks.
How to use it: Use is_async to filter or analyze sync vs. async requests separately. Async requests will have webhook deliveries in the timeline.
Webhook Retry Logic
What it is: Detailed rules governing when and how webhook deliveries can be retried.
Rules:
- Maximum 3 attempts total – Original delivery plus 2 retries
- 5-minute window – Retries only allowed within 5 minutes of first delivery
- 30-second minimum delay – Must wait 30 seconds after a delivery before retrying
- No duplicate retries – Once retried, original delivery cannot be retried again
- Automatic retries – ModelRiver automatically retries with exponential backoff
- Manual retries – Available from Request Logs when `can_retry` is `true`
Why these rules exist:
- 3 attempts maximum – Prevents infinite loops and resource exhaustion
- 5-minute window – Ensures retries happen while data is still relevant
- 30-second delay – Prevents retry storms and gives endpoints time to recover
- No duplicates – Prevents confusion and duplicate notifications
How to use it: Understand the rules to know when retries are available. Use manual retry when automatic retries are exhausted but you're still within the window.
Callback Tracking
What it is: For event-driven workflows, Request Logs track when your backend calls back to ModelRiver.
How it works:
- Webhook is sent with `callback_url`
- Your backend processes the AI response
- Your backend calls the `callback_url` with processed data
- ModelRiver logs the callback as a separate log entry with `seed_batch` starting with `"callback:"` or `"pg_callback:"`
- The timeline shows the callback after the webhook delivery
Why it exists: Callback tracking shows:
- Whether your backend processed the AI response
- If the callback completed successfully
- Whether the callback timed out (5-minute limit)
How to use it: Review callback logs to verify your backend processing. Check callback status in webhook deliveries to see if callbacks were received.
Callback timeout: If your backend doesn't call back within 5 minutes, ModelRiver:
- Sends a timeout error to WebSocket channels
- Logs the timeout event
- Marks the request as failed
Next Steps
- Explore the Dashboard to see Request Logs in action
- Learn about Workflows to understand how requests are configured
- Review Webhooks for async request patterns
- Check API Integration for request/response formats
- See Troubleshooting for common issues and solutions