Keep your environments cleanly separated

Test noise obscures real issues. Use ModelRiver's environment separation features to ensure you're always looking at the right data.

Overview

When production logs are mixed with test and playground data, it becomes nearly impossible to assess the true health of your AI application. ModelRiver automatically tags every request with its source environment, letting you filter cleanly and keep your analysis accurate.


How environment tagging works

Every request is tagged with a seed_batch prefix that identifies its source:

Prefix          Source                       Description
live:           Production API               Real user requests via your API
test:           Test mode API                API calls with the test_mode flag
pg:             Console Playground           Production mode tests from the console
pg_test_mode:   Console Playground (Test)    Test mode runs from the console
callback:       Backend Callback             Callbacks from event-driven workflows
pg_callback:    Playground Callback          Simulated callbacks from console tests
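
These prefixes make it straightforward to slice any exported log by environment. As a rough sketch, assuming you can export request logs as JSON Lines with a seed_batch field on each record (the field name and export format here are assumptions, not a documented schema), you could filter and count by prefix like this:

Bash
# Sketch only: assumes a JSON Lines export where each record carries its
# seed_batch prefix in a "seed_batch" field. Adjust to your actual export.

# Keep only production traffic
jq -c 'select(.seed_batch | startswith("live:"))' requests.jsonl > live_requests.jsonl

# Count requests per environment prefix
jq -r '.seed_batch | split(":")[0] + ":"' requests.jsonl | sort | uniq -c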

Best practices for environment separation

1. Always filter before analyzing

Debugging production issues? Filter to "Live mode"
Reviewing test results? Filter to "Test mode"
Validating a workflow change? Filter to "Playground (Production)"
Testing workflow structure? Filter to "Playground (Test mode)"
Looking at everything? Filter to "All requests" (rare)

2. Use test mode for integration tests

When running automated tests or CI/CD pipelines:

Bash
# Set test_mode to true in your API call
curl -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-workflow-id",
    "messages": [{"role": "user", "content": "test"}],
    "test_mode": true
  }'

This ensures test traffic:

  • Uses sample data instead of real providers (free)
  • Is tagged with test: prefix
  • Is filterable separately from production
  • Doesn't pollute your production analytics
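
In a CI pipeline, you can wrap that call in a small smoke test so the build fails when the test-mode request does. The script below is a minimal sketch that reuses the endpoint and test_mode flag shown above; the workflow ID and the HTTP-200 success check are assumptions to adapt to your own setup.

Bash
#!/usr/bin/env bash
# Minimal CI smoke test (sketch): send one test-mode request and fail the
# pipeline if the HTTP status is not 200. The workflow ID and success check
# are placeholders; adjust them for your workflow.
set -euo pipefail

status=$(curl -s -o /tmp/response.json -w "%{http_code}" \
  -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-workflow-id",
        "messages": [{"role": "user", "content": "CI smoke test"}],
        "test_mode": true
      }')

if [ "$status" -ne 200 ]; then
  echo "Test-mode request failed with HTTP $status" >&2
  cat /tmp/response.json >&2
  exit 1
fi
echo "Test-mode smoke test passed (HTTP $status)"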

3. Use playground for manual validation

Before deploying changes:

  1. Test mode first — Verify workflow structure (free, sample data)
  2. Production mode — Validate with real AI responses (uses credits)
  3. Deploy — Push to production
  4. Monitor Live mode — Watch for any issues

Each step produces logs under a different filter, keeping them organized.

4. Never analyze "All requests" for health metrics

When calculating error rates, latency, or costs:

Error rate across "All requests" = 5.2%
(Includes failed tests, which inflates the number)
 
Error rate across "Live mode" = 1.8%
(Actual production health)

Using "All requests" for metrics gives misleading results because test failures and playground experiments are counted alongside real production traffic.


Environment separation checklist

  • All CI/CD test calls use test_mode: true
  • Manual testing uses the Playground, not the production API
  • Health monitoring dashboards filter to "Live mode" only
  • Cost analysis filters to "Live mode" for accurate spending data
  • Debugging starts with the correct filter for the issue context

Next steps