Overview
The most effective teams don't wait for user reports to discover issues — they proactively review logs to catch anomalies early. A 5-10 minute daily review can prevent hours of firefighting.
Daily review checklist
Morning check (5 minutes)
Run through this quick check each morning (a scripted version is sketched after the checklist):
- Open Request Logs → Filter to Live mode
- Scan for red badges — Any error status requests since yesterday?
- Check failed models count — Are failover rates higher than usual?
- Review duration column — Any requests significantly slower than baseline?
- Check webhook deliveries — Any failed deliveries that need attention?
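If you can export recent request logs, much of this tally can be scripted. The sketch below is a minimal example in Python; the record fields (timestamp, status, failover_used, duration_ms, webhook_status) and the baseline value are illustrative assumptions, not the product's actual log schema.

```python
from datetime import datetime, timedelta, timezone

BASELINE_MS = 1_300   # assumed typical request duration; substitute your own baseline
SLOW_FACTOR = 2.0     # flag requests more than 2x slower than baseline

def morning_tally(logs: list[dict]) -> dict:
    """Summarize the last 24 hours of exported request-log records.

    Assumes each record has a timezone-aware datetime in "timestamp";
    all other field names are illustrative.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=1)
    recent = [r for r in logs if r["timestamp"] >= cutoff]
    return {
        "total": len(recent),
        "errors": sum(1 for r in recent if r["status"] == "error"),
        "failovers": sum(1 for r in recent if r.get("failover_used")),
        "slow": sum(1 for r in recent
                    if r["duration_ms"] > BASELINE_MS * SLOW_FACTOR),
        "webhook_failures": sum(1 for r in recent
                                if r.get("webhook_status") == "failed"),
    }

# Example: print(morning_tally(exported_logs))
```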
What to look for
┌──────────────────────────────────────────────────┐
│ Morning Review Dashboard (mental model)          │
│                                                  │
│ Error rate:     ● 2.1%  (normal: < 3%)     ✓ OK  │
│ Failover rate:  ● 4.2%  (normal: < 5%)     ✓ OK  │
│ P95 latency:    ● 3.2s  (normal: < 4s)     ✓ OK  │
│ Webhook errors: ● 0     (normal: < 2)      ✓ OK  │
│                                                  │
│ Status: All clear ✓                              │
└──────────────────────────────────────────────────┘
If anything is outside your normal range, investigate immediately.
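The same "normal ranges" can be expressed in code so the comparison is repeatable. This is only a sketch; the limits mirror the dashboard above, and the metric names are invented for illustration.

```python
# Normal ranges from the dashboard above; metric names are illustrative.
NORMAL_RANGES = {
    "error_rate_pct":    3.0,  # normal: < 3%
    "failover_rate_pct": 5.0,  # normal: < 5%
    "p95_latency_s":     4.0,  # normal: < 4s
    "webhook_errors":    2,    # normal: < 2
}

def morning_status(metrics: dict) -> bool:
    """Print each metric against its normal range; return True if all clear."""
    all_clear = True
    for name, limit in NORMAL_RANGES.items():
        value = metrics[name]
        ok = value < limit
        all_clear = all_clear and ok
        print(f"{name:<20} {value:>6}  {'OK' if ok else 'INVESTIGATE'}")
    print("Status:", "All clear" if all_clear else "Needs attention")
    return all_clear

# Using the example numbers from the dashboard above:
morning_status({"error_rate_pct": 2.1, "failover_rate_pct": 4.2,
                "p95_latency_s": 3.2, "webhook_errors": 0})
```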
Weekly deep dive (30 minutes)
Analyze trends
Once a week, look at broader patterns:
- Compare error rates week-over-week — Is reliability improving or degrading?
- Review token usage trends — Are average tokens per request growing? (This increases costs)
- Check provider performance — Has any provider's latency or failure rate changed?
- Review failover chains — Are specific providers consistently failing?
Example weekly report
Week of Feb 3-9, 2026:
  Total requests: 8,420
  Success rate: 97.8% (last week: 98.1%)
  Avg latency: 1,340ms (last week: 1,280ms)
  Failover rate: 3.2% (last week: 2.8%)
  Top failing model: gpt-4o-mini (12 failures, up from 6)
  Webhook success: 99.4% (last week: 99.6%)
  Estimated cost: $42.30 (last week: $38.50)
⚠ Action items:
- Investigate gpt-4o-mini failure increase
- Review cost increase (+10%)
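A summary in this shape can be generated from two weeks of aggregated metrics. The sketch below assumes you already have those aggregates as dictionaries; the keys and labels are illustrative, and it covers only the directly comparable rows.

```python
def weekly_report(this_week: dict, last_week: dict) -> str:
    """Format a week-over-week summary from aggregated metrics (assumed keys)."""
    compared = [
        ("Success rate",   "{:.1f}%",   "success_rate"),
        ("Avg latency",    "{:,.0f}ms", "avg_latency_ms"),
        ("Failover rate",  "{:.1f}%",   "failover_rate"),
        ("Estimated cost", "${:.2f}",   "cost_usd"),
    ]
    lines = [f"Week of {this_week['label']}:",
             f"  Total requests: {this_week['total_requests']:,}"]
    for label, fmt, key in compared:
        lines.append(f"  {label}: {fmt.format(this_week[key])}"
                     f" (last week: {fmt.format(last_week[key])})")
    return "\n".join(lines)

# Example, using the numbers from the report above:
# print(weekly_report(
#     {"label": "Feb 3-9, 2026", "total_requests": 8420, "success_rate": 97.8,
#      "avg_latency_ms": 1340, "failover_rate": 3.2, "cost_usd": 42.30},
#     {"success_rate": 98.1, "avg_latency_ms": 1280,
#      "failover_rate": 2.8, "cost_usd": 38.50},
# ))
```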
Setting up alerts
Complement manual reviews with automated alerts; a sketch encoding the recommended thresholds appears after the table:
Recommended alert thresholds
| Metric | Warning threshold | Critical threshold |
|---|---|---|
| Error rate | > 3% over 1 hour | > 10% over 15 min |
| Failover rate | > 5% over 1 hour | > 15% over 15 min |
| P95 latency | > 2x baseline | > 3x baseline |
| Webhook failures | > 2 in 1 hour | > 5 in 15 min |
| Callback timeouts | > 1 in 1 hour | > 3 in 15 min |
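If you script your own alerting, the table above translates directly into a threshold map. A minimal sketch with assumed metric names; latency is expressed as a multiple of your baseline to match the 2x/3x rows, and window handling depends on your alerting tool.

```python
# Alert thresholds from the table above; metric names are illustrative.
THRESHOLDS = {
    # metric: (warning over 1 hour, critical over 15 minutes)
    "error_rate_pct":     (3.0, 10.0),
    "failover_rate_pct":  (5.0, 15.0),
    "p95_latency_x_base": (2.0, 3.0),   # P95 latency as a multiple of baseline
    "webhook_failures":   (2, 5),
    "callback_timeouts":  (1, 3),
}

def classify(metric: str, value_1h: float, value_15m: float) -> str:
    """Return 'critical', 'warning', or 'ok' for one metric's two windows."""
    warning, critical = THRESHOLDS[metric]
    if value_15m > critical:
        return "critical"   # investigate immediately
    if value_1h > warning:
        return "warning"    # log it, review in the daily check
    return "ok"

# Example: classify("error_rate_pct", value_1h=3.4, value_15m=1.2) -> "warning"
```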
Alert escalation
Level 1 (Warning): Log it, review in daily check
Level 2 (Critical): Investigate immediately
Level 3 (Outage): All hands, coordinate in incident channel
Building the habit
- Schedule it — Add a 5-minute recurring calendar block for morning reviews
- Create a checklist — Use the checklist above or customize for your needs
- Rotate responsibility — If you have a team, rotate the daily review role
- Document findings — Keep a simple log of what you notice each day
- Act on trends — Don't just observe — take action when you see patterns
Next steps
- Using Filters Effectively — Master the filter system
- Monitoring Failed Models — Track provider stability
- Back to Best Practices — Return to the overview