Track webhook delivery reliability

Failed webhook deliveries mean your backend misses critical AI responses. Monitor delivery rates, catch endpoint issues early, and use retry strategies effectively.

Overview

For async and event-driven workflows, webhooks are the lifeline between ModelRiver and your backend. If webhook delivery fails, your application is flying blind — it doesn't know the AI request completed. Regular monitoring of webhook delivery rates catches endpoint issues before they cascade.


Key metrics to track

Delivery success rate

Webhook success rate = Successful deliveries ÷ Total deliveries × 100
 
Target: > 99%
Warning: 95-99%
Critical: < 95%
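As a sketch, the thresholds above can be turned into a small alerting helper. The function name and return shape are illustrative, not part of any ModelRiver SDK:

```javascript
// Classify a webhook success rate against the targets above.
// Illustrative helper only -- not part of the ModelRiver SDK.
function classifySuccessRate(successful, total) {
  if (total === 0) return { rate: null, status: 'no-data' };
  const rate = (successful / total) * 100;
  if (rate > 99) return { rate, status: 'healthy' };
  if (rate >= 95) return { rate, status: 'warning' };
  return { rate, status: 'critical' };
}

console.log(classifySuccessRate(416, 420).status); // → healthy
```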

Delivery duration

Monitor how long webhook deliveries take:

Duration      Meaning          Action
< 200ms       Excellent        No action
200-1000ms    Acceptable       Monitor
1000-5000ms   Slow             Optimize endpoint
> 5000ms      Risk of timeout  Immediate optimization
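One way to see where your endpoint falls in this range is to time responses yourself. A minimal Express-style middleware sketch; the 1000ms threshold mirrors the table above and the wiring comment is illustrative:

```javascript
// Log how long each webhook response takes, flagging slow handlers.
// Express-style middleware sketch; the threshold is from the table above.
function timeWebhookResponses(req, res, next) {
  const start = process.hrtime.bigint();
  res.on('finish', () => {
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    if (ms > 1000) {
      console.warn(`Slow webhook response: ${ms.toFixed(0)}ms on ${req.path}`);
    }
  });
  next();
}

// Wire it in front of your webhook route (illustrative):
// app.use('/webhooks', timeWebhookResponses);
```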

Retry rate

Retry rate = Delivery retries ÷ Total deliveries × 100
 
If retry rate > 5%, your endpoint has reliability issues.
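The same kind of helper works for the retry-rate threshold (again illustrative, not part of any SDK):

```javascript
// Flag endpoints whose retry rate exceeds the 5% threshold above.
// Illustrative helper -- not part of the ModelRiver SDK.
function retryRateIsHealthy(retries, totalDeliveries) {
  if (totalDeliveries === 0) return true;
  return (retries / totalDeliveries) * 100 <= 5;
}

console.log(retryRateIsHealthy(3, 420)); // → true
```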

Monitoring workflow

Daily check

  1. Filter to Live mode in Request Logs
  2. Look for requests with async/webhook activity
  3. Check webhook delivery status in timelines:
    • All green? → Healthy
    • Any red? → Investigate immediately
  4. Note delivery durations — are they increasing?

Weekly analysis

  1. Count total webhook deliveries
  2. Count failed deliveries
  3. Calculate success rate
  4. Identify repeat failures (same endpoint, same error)
  5. Review delivery duration trends
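If you export delivery records from Request Logs, the weekly tally above can be scripted. The record shape used here ({ success, retried, durationMs, error }) is an assumption for illustration, not a documented export format:

```javascript
// Summarize a week of webhook delivery records.
// The { success, retried, durationMs, error } shape is assumed for illustration.
function weeklyWebhookReport(deliveries) {
  const total = deliveries.length;
  const failed = deliveries.filter((d) => !d.success);
  const retried = deliveries.filter((d) => d.retried).length;
  const avgDuration =
    total === 0 ? 0 : deliveries.reduce((sum, d) => sum + d.durationMs, 0) / total;

  // Group failures by error message so repeat offenders stand out.
  const failureBreakdown = {};
  for (const d of failed) {
    failureBreakdown[d.error] = (failureBreakdown[d.error] || 0) + 1;
  }

  return {
    total,
    successful: total - failed.length,
    failed: failed.length,
    retried,
    avgDurationMs: Math.round(avgDuration),
    failureBreakdown,
  };
}
```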

Example weekly metrics:

Total webhook deliveries: 420
Successful: 416 (99.0%)
Failed: 4 (1.0%)
Retried: 3
Avg delivery duration: 120ms
 
Failed delivery breakdown:
Timeout (30s): 2
Connection refused: 1
HTTP 500: 1

Common issues and solutions

Endpoint responding too slowly

Symptom: Delivery durations > 1 second, occasional timeouts.

Solution: Acknowledge the webhook immediately, then process asynchronously:

JAVASCRIPT
// Respond in < 200ms, process in background
app.post('/webhooks/modelriver', (req, res) => {
  res.status(200).json({ received: true });

  // Queue for background processing
  queue.add('process-webhook', req.body);
});

Endpoint intermittently unavailable

Symptom: Scattered "Connection refused" errors.

Solution:

  • Add health monitoring to your webhook endpoint
  • Set up auto-restart on failure
  • Consider a load balancer for redundancy
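A lightweight health endpoint on the same service lets an uptime monitor catch downtime before retries are exhausted. A minimal Express-style sketch; the /health path and response shape are conventions, not requirements:

```javascript
// Minimal health endpoint for uptime monitoring.
// Express-style sketch; the /health path is a common convention, not required.
function healthHandler(req, res) {
  // Extend with real dependency checks (database, queue) as needed.
  res.status(200).json({ status: 'ok', uptimeSeconds: process.uptime() });
}

// Wire-up (illustrative):
// app.get('/health', healthHandler);
```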

Endpoint returning errors

Symptom: HTTP 500 responses from your endpoint.

Solution: Check your server logs for application errors. Common causes include:

  • Missing request handler for the webhook route
  • JSON parsing failures
  • Database connection issues during processing
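Wrapping the handler body in a try/catch keeps an application error from surfacing as an HTTP 500 on the delivery itself. A sketch under Express-style assumptions; makeWebhookHandler and the queue interface are illustrative names, not ModelRiver APIs:

```javascript
// Acknowledge deliveries even when enqueueing fails, so application
// errors don't surface as failed webhook deliveries.
// The queue interface here is assumed; substitute your job queue.
function makeWebhookHandler(queue) {
  return (req, res) => {
    try {
      queue.add('process-webhook', req.body);
    } catch (err) {
      // Keep the error in your server logs, but still acknowledge receipt.
      console.error('Failed to enqueue webhook:', err);
    }
    res.status(200).json({ received: true });
  };
}

// Wire-up (illustrative):
// app.post('/webhooks/modelriver', makeWebhookHandler(queue));
```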

Retry strategy

Automatic retries

ModelRiver automatically retries failed deliveries with backoff. Monitor retry outcomes:

Delivery attempt 1: Failed (timeout)
Auto-retry scheduled
Delivery attempt 2: Failed (timeout)
Auto-retry scheduled
Delivery attempt 3: Success
Delivered after 2 retries
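ModelRiver's exact retry intervals are not specified here, but backoff schedules like this are typically exponential. A purely illustrative sketch of how such delays grow, with example numbers:

```javascript
// Illustrative exponential backoff schedule. ModelRiver's actual retry
// intervals are not documented here; baseMs and factor are example values.
function backoffDelays(attempts, baseMs = 1000, factor = 2) {
  return Array.from({ length: attempts }, (_, i) => baseMs * factor ** i);
}

console.log(backoffDelays(3)); // → [ 1000, 2000, 4000 ]
```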

Manual retries

Use manual retry in Request Logs when:

  • Automatic retries exhausted but you've fixed the endpoint
  • You're within the 5-minute retry window
  • You want to test your endpoint fix immediately

Next steps