Protection

Rate limits that guard every request

Set limits per user, IP, or project. Stop runaway costs and abuse before they hit your providers.

User + IP keys · Burst + sustained · Project-level controls · Logged throttles

Visual

How limits shape traffic

Requests enter buckets, bursts are smoothed, and healthy traffic passes through.

01

Incoming request

Identified by user, IP, or project

02

Identify rate limit scopes

Check user + IP + project limits

03

Sliding window check

Count requests in time window

04

Burst guard

Smooths traffic spikes

05

Decision point

Allowed → forward to providers

Over limit → 429 + Retry-After

06

Analytics + request logs

Throttle events recorded
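
The sliding window check and decision point in the flow above can be sketched roughly as follows. This is a minimal illustration, not ModelRiver's implementation: the `SlidingWindowLimiter` class, its `check` method, and the key format are hypothetical names chosen for the example.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sketch: allow at most `limit` requests per `window` seconds per key."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.hits = {}  # key -> deque of request timestamps

    def check(self, key):
        """Return (allowed, retry_after_seconds) for one request."""
        now = self.clock()
        q = self.hits.setdefault(key, deque())
        # Drop timestamps that have left the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True, 0.0
        # Over limit: the oldest request leaves the window at q[0] + window.
        return False, q[0] + self.window - now
```

A caller would forward the request when `allowed` is true and otherwise answer with 429, using the second value as the Retry-After hint.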

Policy
limits:
  user: 120 req/min
  ip: 300 req/min
  project: 1_000 req/min
strategy: "token-bucket"
action: "429 + retry-after"
log_throttles: true
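
To make the policy above concrete, here is a rough sketch of a token-bucket check stacked across the three scopes. It is an assumption-laden illustration rather than the product's code: `TokenBucket`, `ScopedLimiter`, and the choice of a burst capacity equal to the per-minute limit are all hypothetical.

```python
import math
import time

# Per-minute ceilings copied from the policy above.
LIMITS = {"user": 120, "ip": 300, "project": 1000}


class TokenBucket:
    """Refills at `rate_per_min` tokens per minute, up to `capacity` (the burst size)."""

    def __init__(self, rate_per_min, capacity, clock=time.monotonic):
        self.rate_per_min = rate_per_min
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def wait_time(self):
        """Refill, then return seconds until one token is available (0.0 if now)."""
        now = self.clock()
        refill = (now - self.last) * self.rate_per_min / 60.0
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) * 60.0 / self.rate_per_min

    def take(self):
        self.tokens -= 1.0


class ScopedLimiter:
    """Checks user, IP, and project buckets; a request must pass all three."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.buckets = {}  # (scope, identifier) -> TokenBucket

    def check(self, user, ip, project):
        """Return (status, retry_after_seconds)."""
        buckets = []
        for scope, ident in (("user", user), ("ip", ip), ("project", project)):
            key = (scope, ident)
            if key not in self.buckets:
                rpm = self.limits[scope]
                # Assumption: burst capacity equals the per-minute limit.
                self.buckets[key] = TokenBucket(rpm, rpm, self.clock)
            buckets.append(self.buckets[key])
        retry_after = max(b.wait_time() for b in buckets)
        if retry_after > 0:
            return 429, math.ceil(retry_after)
        # Only consume tokens when every scope allows the request.
        for b in buckets:
            b.take()
        return 200, 0
```

Consuming tokens only after every scope passes means a request rejected by one scope does not eat into the other scopes' budgets.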
              
1

Enforced before provider calls

Over-limit traffic is stopped up front, so it never wastes tokens or triggers 429s from your providers.

2

Granular scopes

Combine user, IP, and project ceilings to smooth both burst and sustained traffic.

3

Audit every throttle

Throttled events hit analytics and request logs with timing and identifiers.

Multi-scope

User · IP · Project

Stack limits for layered protection.

Throttle feedback

429 + Retry-After

Clients get clear backoff guidance.

Visibility

Tracked

Analytics show who hit limits and when.

Scroll the guardrails

01 · Detect

Identify caller by user, IP, and project before provider calls.

02 · Enforce

Apply burst + sustained buckets; respond with 429 and Retry-After.

03 · Observe

Log throttles to analytics with identifiers and timing.

04 · Tune

Adjust ceilings per plan, project, or user cohort.

Use cases

  • Public API keys with per-IP protections.
  • Multi-tenant apps needing project-level quotas.
  • Freemium plans with tight burst limits.

What’s unique

  • Enforced before provider calls to save tokens.
  • Analytics + logs show who was throttled and why.
  • Works alongside failover, streaming, and webhooks.

Programmatic access

Rate limits are enforced automatically per project

POST https://api.modelriver.com/v1/ai
Authorization: Bearer mr_live_your_key

{
  "workflow": "user-query",
  "messages": [
    { "role": "user", "content": "..." }
  ]
}

// When rate limited, you receive:
{
  "error": {
    "message": "Rate limit exceeded",
    "retry_after": 60
  }
}

Configure limits per project in the console. Limits are enforced before provider calls to save tokens. Throttle events appear in analytics.
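
On the client side, the rate-limited response shown above can drive a simple backoff loop. The sketch below is one possible approach under stated assumptions: `call_with_backoff` is a hypothetical helper, and `send` stands in for whatever function performs the POST and returns the status code plus the parsed JSON body.

```python
import time


def call_with_backoff(send, max_attempts=3, sleep=time.sleep):
    """Call `send()` until it is not rate limited or attempts run out.

    `send` is a hypothetical callable returning (status_code, body_dict).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts:
            break
        # Wait as long as the error body asks; assume 1 second if it is missing.
        delay = body.get("error", {}).get("retry_after", 1)
        sleep(delay)
    # Still rate limited after the final attempt: hand the 429 back to the caller.
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test, and capping attempts avoids retrying forever when a limit stays exhausted.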

Ship safe by default

Combine limits with failover, structured outputs, and webhooks for resilient, observable traffic.