Test workflows safely before production

Use playground and test mode to validate AI workflow changes without affecting production traffic or consuming unnecessary provider credits.

Overview

Before deploying workflow changes to production, you need confidence that everything works correctly. ModelRiver provides three testing environments — Playground (Production), Playground (Test mode), and Test mode (API) — each with its own purpose. Request Logs let you review every test run in detail.

Testing environments explained

Playground (Production)

  • What it does: Runs your workflow with real provider API calls
  • When to use: Final validation before deploying, testing with real AI responses
  • Cost: Consumes provider credits (real API calls)
  • Log filter: Playground (Production)
  • Seed batch prefix: pg:

Playground (Test mode)

  • What it does: Runs your workflow with sample data from structured outputs
  • When to use: Testing workflow configuration and data flow without consuming credits
  • Cost: Free — no provider API calls
  • Log filter: Playground (Test mode)
  • Seed batch prefix: pg_test_mode:

Test mode (API)

  • What it does: Accepts API calls with the test mode flag, using sample data instead of real providers
  • When to use: Automated testing, CI/CD pipelines, integration test suites
  • Cost: Free — no provider API calls
  • Log filter: Test mode
  • Seed batch prefix: test:

Step-by-step testing workflow

1. Validate in test mode first

Start in Playground (Test mode) to verify the basic workflow structure:

  1. Open your workflow in the console
  2. Click Test (the beaker icon) to run in test mode
  3. Navigate to Request Logs → filter to Playground (Test mode)
  4. Inspect the request and response to confirm correct data flow

Example test mode response:

JSON
{
  "choices": [
    {
      "message": {
        "content": "{\"product_name\": \"Sample Product\", \"description\": \"This is a test response using structured output sample data.\"}",
        "role": "assistant"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Notice: Token usage is 0 in test mode because no real provider call was made. The response content comes from your structured output's sample data.
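
Because the content field here is an escaped JSON string, it can be handy to unwrap it when inspecting a test run locally. A small sketch, assuming the response body above is saved to a file named response.json and jq is installed:

Bash
# response.json: the test mode response body saved locally (an assumed filename)
# Extract the message content (an escaped JSON string) and pretty-print it as JSON
jq -r '.choices[0].message.content' response.json | jq .

This prints the structured output sample data as plain JSON, which is easier to read than the escaped string.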

2. Test with real providers

Once you're confident in the workflow structure, test with real providers:

  1. Click Run in the Playground (this uses Production mode)
  2. Navigate to Request Logs → filter to Playground (Production)
  3. Inspect the full timeline — verify provider selection, token usage, and response quality
  4. Review the response body for accuracy and correctness

3. Validate webhook and callback flow

For async and event-driven workflows:

  1. Run the workflow in playground
  2. Check the timeline for webhook delivery status
  3. Verify the webhook payload matches your backend's expectations
  4. For event-driven workflows, confirm the callback was received

Example playground test timeline:

  OpenAI gpt-4o-mini: success (650ms)
  Webhook delivery: success (32ms)
  Backend callback: simulated

Note: In playground mode, backend callbacks are simulated — ModelRiver doesn't wait for your actual backend to respond. This lets you test the AI generation and webhook delivery independently.
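
To check the webhook payload itself against your backend's expectations, you can temporarily point the workflow's webhook URL at a local listener and inspect what arrives. A rough sketch, assuming ncat (from the nmap package) is installed and the webhook URL points at a host and port that reach this machine:

Bash
# Print every incoming webhook request (headers and JSON body) and keep a copy
# -l listens, -k keeps the listener open across requests; 8080 is an arbitrary local port
ncat -lk 8080 | tee webhook-capture.log

A raw listener like this never sends an HTTP response, so the delivery may show as failed or be retried on the ModelRiver side; it is only meant for inspecting the payload shape while you iterate.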

4. Compare with production behavior

After deploying changes:

  1. Filter to Live mode to see real production requests
  2. Compare response quality, latency, and token usage with your playground tests
  3. Monitor for any unexpected failures or behavior changes
  4. If issues arise, compare the failing production request with your successful playground test

CI/CD integration example

Use the test mode API in your CI/CD pipeline:

Bash
# Run a test mode request
curl -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-workflow-id",
    "messages": [{"role": "user", "content": "Test message"}],
    "test_mode": true
  }'

# Check the response status
# Response will use sample data, not real providers
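
In a pipeline you usually want the job to fail when the response doesn't look right. A minimal follow-up check, assuming the curl command above is run with -s and its output redirected to response.json, and that jq is available; the field names mirror the test mode response shown earlier:

Bash
# response.json: output of the curl command above, saved to a file (assumed name)

# Fail the job if no message content came back
jq -e '.choices[0].message.content | length > 0' response.json

# Fail the job if real provider tokens were consumed (test mode should report 0)
jq -e '.usage.total_tokens == 0' response.json

With set -e (or an explicit check of the exit code), either failing assertion stops the pipeline.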

Then verify in Request Logs:

  1. Filter to Test mode
  2. Confirm the request was logged correctly
  3. Validate the response structure matches expectations

Tips for effective testing

  • Always test both happy path and edge cases — Test with various input types, empty messages, and long prompts (see the sketch after this list)
  • Verify structured outputs — If your workflow uses structured outputs, confirm the schema is enforced correctly
  • Test failover behavior — If possible, configure a test scenario where the primary provider fails to verify fallback works
  • Keep test data realistic — Use production-like prompts in your test data for accurate validation
  • Review token usage — Compare token usage between test and production to catch unexpected differences
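
For the first tip, one lightweight approach is to loop a test mode request over a handful of fixture prompts. A sketch, assuming hypothetical fixture files under test-prompts/ that each contain a single prompt, and reusing the endpoint and test_mode flag from the CI/CD example above:

Bash
# test-prompts/ is a placeholder directory of one-prompt-per-file fixtures
# Run a test mode request for each fixture (empty, very long, unusual input, ...)
for f in test-prompts/*.txt; do
  echo "== $f =="
  curl -s -X POST https://api.modelriver.com/v1/chat/completions \
    -H "Authorization: Bearer $MODELRIVER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg content "$(cat "$f")" \
          '{model: "your-workflow-id", messages: [{role: "user", content: $content}], test_mode: true}')" \
    | jq '.choices[0].finish_reason, .usage.total_tokens'
done

Building the request body with jq -n keeps long or oddly quoted prompts properly escaped.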

Next steps