Integration testing without surprises
Test with the same settings as production, and use the free playground so you can ship with confidence.
Testing flow at a glance
From test request to production-configured playground output, without guesswork.
Source · Test request (staging or CLI)
Testing mode · Production config (same routing, limits)
Playground · Free + not logged (provider usage applies)
Structured output · Schemas + samples (validate before prod)
Promote · Ship with confidence (same config, real users)
workflow: "testing_review"
mode: "testing"
playground_cost: "free_on_modelriver"
logging: "not_logged"
provider_usage: "billed_by_provider"
schema: "book_review_schema"
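The settings above can be read as a small record with a few fixed values. The following is a minimal sketch in Python, assuming the six keys are plain strings; `PlaygroundSettings`, `EXPECTED`, and `check_settings` are hypothetical names used for illustration and are not part of any ModelRiver SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaygroundSettings:
    # Hypothetical container; field names mirror the keys shown above.
    workflow: str
    mode: str
    playground_cost: str
    logging: str
    provider_usage: str
    schema: str


# Values the page states for a testing-mode playground run.
EXPECTED = {
    "mode": "testing",
    "playground_cost": "free_on_modelriver",
    "logging": "not_logged",
    "provider_usage": "billed_by_provider",
}


def check_settings(settings: PlaygroundSettings) -> list[str]:
    """Return a list of mismatches; an empty list means every expected value matches."""
    problems = []
    for key, want in EXPECTED.items():
        got = getattr(settings, key)
        if got != want:
            problems.append(f"{key} is {got!r}, expected {want!r}")
    return problems


settings = PlaygroundSettings(
    workflow="testing_review",
    mode="testing",
    playground_cost="free_on_modelriver",
    logging="not_logged",
    provider_usage="billed_by_provider",
    schema="book_review_schema",
)
print(check_settings(settings))  # prints []
```

A check like this could run at the start of a test suite so that a run accidentally pointed at production mode fails early.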
Match production behavior
Same routing, limits, and structured outputs as live traffic.
Test for free in playground
ModelRiver covers playground requests; provider billing still applies.
Validate structured outputs
Use schemas and samples to verify shape, tool calls, and safety gates.
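Validating the shape of a structured output can be sketched with the standard library alone. The page names a `book_review_schema` but does not show its contents, so the fields below (`title`, `rating`, `summary`) are assumptions made up for this example, and `validate_shape` is a hypothetical helper rather than a ModelRiver function. A real setup would more likely use a full schema language such as JSON Schema.

```python
import json

# Assumed fields for illustration only: name -> required type after JSON decoding.
BOOK_REVIEW_FIELDS = {
    "title": str,
    "rating": int,
    "summary": str,
}


def validate_shape(raw_output: str, fields: dict) -> list[str]:
    """Check that raw_output is a JSON object with exactly the expected fields and types."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    errors = []
    for name, expected in fields.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif type(data[name]) is not expected:
            # Strict type match, so True is not accepted where an int is required.
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in sorted(data.keys() - fields.keys()):
        errors.append(f"unexpected field: {name}")
    return errors


sample = '{"title": "Dune", "rating": 5, "summary": "A desert epic."}'
print(validate_shape(sample, BOOK_REVIEW_FIELDS))  # prints []
```

Running the same check on saved sample outputs before rollout gives a quick signal when a prompt or model change alters the output shape.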
Playground cost
Free on ModelRiver
We cover playground requests; provider usage remains billable.
Config parity
Matches prod
Testing mode runs the exact production configuration and limits.
Output confidence
Schema-first
Validate against provided schemas before releasing to customers.
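Config parity can be spot-checked by diffing the two configurations. The sketch below assumes both configs are available as plain dictionaries; the key names (`routing`, `rate_limit_per_min`, `retries`) and the `config_diff` helper are hypothetical, since the page does not describe how configs are stored or exported.

```python
def config_diff(production: dict, testing: dict, ignore=frozenset({"mode"})) -> dict:
    """Map each differing key to its (production, testing) values.

    Keys listed in `ignore` are skipped; a key missing on one side shows as None.
    """
    keys = (production.keys() | testing.keys()) - ignore
    return {
        key: (production.get(key), testing.get(key))
        for key in sorted(keys)
        if production.get(key) != testing.get(key)
    }


# Illustrative values only.
prod = {
    "mode": "production",
    "routing": ["provider_a", "provider_b"],
    "rate_limit_per_min": 60,
    "retries": 2,
}
test = dict(prod, mode="testing")

print(config_diff(prod, test))                   # prints {}
print(config_diff(prod, dict(test, retries=0)))  # prints {'retries': (2, 0)}
```

An empty result means the only difference is the mode itself, which is the parity the cards above describe.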
01 · Prepare
Mirror production configs in testing mode with the same providers.
02 · Exercise
Run test requests through the production-configured playground for free.
03 · Validate
Check structured outputs, schemas, and tool calls before rollout.
04 · Promote
Flip to production users with the same config you tested.
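The four steps above can be sketched as an ordered gate in which promotion only happens when every earlier step passes. This is an illustration under stated assumptions: `run_gate` is a hypothetical helper, and the lambdas are stubs standing in for real checks such as a config comparison or a schema validation.

```python
from typing import Callable

Check = Callable[[], bool]


def run_gate(steps: list[tuple[str, Check]]) -> tuple[bool, list[str]]:
    """Run named checks in order and stop at the first failure."""
    log = []
    for name, check in steps:
        ok = check()
        log.append(f"{name}: {'ok' if ok else 'failed'}")
        if not ok:
            return False, log
    return True, log


# Stubbed checks; replace each lambda with a real test in practice.
steps = [
    ("prepare", lambda: True),    # testing config mirrors production
    ("exercise", lambda: True),   # playground requests completed
    ("validate", lambda: False),  # structured output failed its schema check
]

ready, log = run_gate(steps)
print(ready)  # prints False
print(log)    # prints ['prepare: ok', 'exercise: ok', 'validate: failed']
```

Stopping at the first failure keeps the log short and makes it clear which step blocked the promotion.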
When to use
- Before promoting new workflows or tools to production traffic.
- When validating structured outputs and tool calls against schemas.
- When your team needs parity with production limits without consuming user quota.
What you get
- Testing mode that mirrors production routing, limits, and retries.
- Free, not-logged playground runs; provider usage still applies.
- Schema-first validation to ensure outputs land exactly as expected.
Ship tested workflows with confidence
Use testing mode plus the production-configured playground to validate outputs, then roll forward knowing real users get the same configuration you tested.