Overview
For async and event-driven workflows, webhooks are the lifeline between ModelRiver and your backend. If webhook delivery fails, your application is flying blind — it doesn't know the AI request completed. Regular monitoring of webhook delivery rates catches endpoint issues before they cascade.
Key metrics to track
Delivery success rate
Webhook success rate = Successful deliveries ÷ Total deliveries × 100

- Target: > 99%
- Warning: 95-99%
- Critical: < 95%

Delivery duration
Monitor how long webhook deliveries take:
| Duration | Meaning | Action |
|---|---|---|
| < 200ms | Excellent | No action |
| 200-1000ms | Acceptable | Monitor |
| 1000-5000ms | Slow | Optimize endpoint |
| > 5000ms | Risk of timeout | Immediate optimization |
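The thresholds in the table can be encoded as a small helper when post-processing delivery durations exported from Request Logs (a sketch; the function name and boundary handling are our own choices, not part of ModelRiver):

```javascript
// Map a measured delivery duration (in ms) to the action column above
function durationAction(ms) {
  if (ms < 200) return 'No action';            // Excellent
  if (ms <= 1000) return 'Monitor';            // Acceptable
  if (ms <= 5000) return 'Optimize endpoint';  // Slow
  return 'Immediate optimization';             // Risk of timeout
}
```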
Retry rate
Retry rate = Delivery retries ÷ Total deliveries × 100

If retry rate > 5%, your endpoint has reliability issues.

Monitoring workflow
Daily check
- Filter to Live mode in Request Logs
- Look for requests with async/webhook activity
- Check webhook delivery status in timelines:
  - All green? → Healthy
  - Any red? → Investigate immediately
- Note delivery durations — are they increasing?
Weekly analysis
- Count total webhook deliveries
- Count failed deliveries
- Calculate success rate
- Identify repeat failures (same endpoint, same error)
- Review delivery duration trends
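The weekly calculation can be scripted against an exported list of delivery records. A minimal sketch, assuming each record has `success` and `retried` boolean fields (the record shape is an assumption about your export format):

```javascript
// Compute weekly webhook metrics from exported delivery records
function weeklyMetrics(deliveries) {
  const total = deliveries.length;
  const failed = deliveries.filter((d) => !d.success).length;
  const retried = deliveries.filter((d) => d.retried).length;
  return {
    total,
    failed,
    successRate: total ? ((total - failed) / total) * 100 : 0,
    retryRate: total ? (retried / total) * 100 : 0,
  };
}
```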
Example weekly metrics:
Total webhook deliveries: 420
Successful: 416 (99.0%)
Failed: 4 (1.0%)
Retried: 3
Avg delivery duration: 120ms

Failed delivery breakdown:
  Timeout (30s): 2
  Connection refused: 1
  HTTP 500: 1

Common issues and solutions
Endpoint responding too slowly
Symptom: Delivery durations > 1 second, occasional timeouts.
Solution: Acknowledge the webhook immediately, then process asynchronously:
```javascript
// Respond in < 200ms, process in background
app.post('/webhooks/modelriver', (req, res) => {
  res.status(200).json({ received: true });

  // Queue for background processing
  queue.add('process-webhook', req.body);
});
```

Endpoint intermittently unavailable
Symptom: Scattered "Connection refused" errors.
Solution:
- Add health monitoring to your webhook endpoint
- Set up auto-restart on failure
- Consider a load balancer for redundancy
Endpoint returning errors
Symptom: HTTP 500 responses from your endpoint.
Solution: Check your server logs for application errors. Common causes include:
- Missing request handler for the webhook route
- JSON parsing failures
- Database connection issues during processing
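For the JSON-parsing case in particular, a defensive parse keeps a malformed payload from surfacing as an unhandled HTTP 500 with no useful log line. A sketch (the function name is illustrative):

```javascript
// Parse a webhook body defensively: log malformed JSON instead of throwing
function parseWebhookBody(raw) {
  try {
    return { ok: true, body: JSON.parse(raw) };
  } catch (err) {
    console.error('Webhook JSON parse failure:', err.message);
    return { ok: false, body: null };
  }
}
```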
Retry strategy
Automatic retries
ModelRiver automatically retries failed deliveries with backoff. Monitor retry outcomes:
Delivery attempt 1: Failed (timeout) → Auto-retry scheduled
Delivery attempt 2: Failed (timeout) → Auto-retry scheduled
Delivery attempt 3: Success → Delivered after 2 retries

Manual retries
Use manual retry in Request Logs when:
- Automatic retries exhausted but you've fixed the endpoint
- You're within the 5-minute retry window
- You want to test your endpoint fix immediately
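The exact backoff schedule ModelRiver uses for automatic retries isn't specified here. As an illustration only, exponential backoff between attempts is commonly computed like this (the base delay and cap are assumptions, not ModelRiver's actual values):

```javascript
// Illustrative exponential backoff: 1s, 2s, 4s, ... capped at 60s
function retryDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}
```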
Next steps
- Separating Environments — Keep data clean
- Webhook Delivery Monitoring — Detailed scenario guide
- Back to Best Practices — Return to the overview