Overview
For async and event-driven workflows, webhooks are the lifeline between ModelRiver and your backend. If webhook delivery fails, your application is flying blind — it doesn't know the AI request completed. Regular monitoring of webhook delivery rates catches endpoint issues before they cascade.
Key metrics to track
Delivery success rate
Webhook success rate = Successful deliveries ÷ Total deliveries × 100

- Target: > 99%
- Warning: 95-99%
- Critical: < 95%

Delivery duration
Monitor how long webhook deliveries take:
| Duration | Meaning | Action |
|---|---|---|
| < 200ms | Excellent | No action |
| 200-1000ms | Acceptable | Monitor |
| 1000-5000ms | Slow | Optimize endpoint |
| > 5000ms | Risk of timeout | Immediate optimization |
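The thresholds in the table can be encoded as a small helper when post-processing delivery durations exported from Request Logs (a sketch; the function name and boundary handling are our own choices, not part of ModelRiver):

```javascript
// Map a measured delivery duration (in ms) to the action column above
function durationAction(ms) {
  if (ms < 200) return 'No action';            // Excellent
  if (ms <= 1000) return 'Monitor';            // Acceptable
  if (ms <= 5000) return 'Optimize endpoint';  // Slow
  return 'Immediate optimization';             // Risk of timeout
}
```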
Retry rate
Retry rate = Delivery retries ÷ Total deliveries × 100

If retry rate > 5%, your endpoint has reliability issues.

Monitoring workflow
Daily check
- Filter to Live mode in Request Logs
- Look for requests with async/webhook activity
- Check webhook delivery status in timelines:
  - All green? → Healthy
  - Any red? → Investigate immediately
- Note delivery durations — are they increasing?
Weekly analysis
- Count total webhook deliveries
- Count failed deliveries
- Calculate success rate
- Identify repeat failures (same endpoint, same error)
- Review delivery duration trends
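The weekly calculation can be scripted against an exported list of delivery records. A minimal sketch, assuming each record has `success` and `retried` boolean fields (the record shape is an assumption about your export format):

```javascript
// Compute weekly webhook metrics from exported delivery records
function weeklyMetrics(deliveries) {
  const total = deliveries.length;
  const failed = deliveries.filter((d) => !d.success).length;
  const retried = deliveries.filter((d) => d.retried).length;
  return {
    total,
    failed,
    successRate: total ? ((total - failed) / total) * 100 : 0,
    retryRate: total ? (retried / total) * 100 : 0,
  };
}
```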
Example weekly metrics:
Total webhook deliveries: 420
Successful: 416 (99.0%)
Failed: 4 (1.0%)
Retried: 3
Avg delivery duration: 120ms

Failed delivery breakdown:
  Timeout (30s): 2
  Connection refused: 1
  HTTP 500: 1

Common issues and solutions
Endpoint responding too slowly
Symptom: Delivery durations > 1 second, occasional timeouts.
Solution: Acknowledge the webhook immediately, then process asynchronously:
```javascript
// Respond in < 200ms, process in background
app.post('/webhooks/modelriver', (req, res) => {
  res.status(200).json({ received: true });

  // Queue for background processing
  queue.add('process-webhook', req.body);
});
```

Endpoint intermittently unavailable
Symptom: Scattered "Connection refused" errors.
Solution:
- Add health monitoring to your webhook endpoint
- Set up auto-restart on failure
- Consider a load balancer for redundancy
Endpoint returning errors
Symptom: HTTP 500 responses from your endpoint.
Solution: Check your server logs for application errors. Common causes include:
- Missing request handler for the webhook route
- JSON parsing failures
- Database connection issues during processing
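For the JSON-parsing case in particular, a defensive parse keeps a malformed payload from surfacing as an unhandled HTTP 500 with no useful log line. A sketch (the function name is illustrative):

```javascript
// Parse a webhook body defensively: log malformed JSON instead of throwing
function parseWebhookBody(raw) {
  try {
    return { ok: true, body: JSON.parse(raw) };
  } catch (err) {
    console.error('Webhook JSON parse failure:', err.message);
    return { ok: false, body: null };
  }
}
```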
Retry strategy
Automatic retries
ModelRiver automatically retries failed deliveries with backoff. Monitor retry outcomes:
Delivery attempt 1: Failed (timeout) → Auto-retry scheduled
Delivery attempt 2: Failed (timeout) → Auto-retry scheduled
Delivery attempt 3: Success → Delivered after 2 retries

Manual retries
Use manual retry in Request Logs when:
- Automatic retries exhausted but you've fixed the endpoint
- You're within the 5-minute retry window
- You want to test your endpoint fix immediately
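The exact backoff schedule ModelRiver uses for automatic retries isn't specified here. As an illustration only, exponential backoff between attempts is commonly computed like this (the base delay and cap are assumptions, not ModelRiver's actual values):

```javascript
// Illustrative exponential backoff: 1s, 2s, 4s, ... capped at 60s
function retryDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}
```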
Next steps
- Separating Environments — Keep data clean
- Webhook Delivery Monitoring — Detailed scenario guide
- Back to Best Practices — Return to the overview