Keep your environments cleanly separated

Test noise obscures real issues. Use ModelRiver's environment separation features to ensure you're always looking at the right data.

Overview

When production logs are mixed with test and playground data, it becomes nearly impossible to assess the true health of your AI application. ModelRiver automatically tags every request with its source environment, letting you filter cleanly and keep your analysis accurate.


How environment tagging works

Every request is tagged with a seed_batch prefix that identifies its source:

Prefix          Source                       Description
live:           Production API               Real user requests via your API
test:           Test mode API                API calls with the test_mode flag
pg:             Console Playground           Production mode tests from the console
pg_test_mode:   Console Playground (Test)    Test mode runs from the console
callback:       Backend Callback             Callbacks from event-driven workflows
pg_callback:    Playground Callback          Simulated callbacks from console tests
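
These prefixes make it straightforward to slice any exported log by environment. As a rough sketch, assuming you can export request logs as JSON Lines with a seed_batch field on each record (the field name and export format here are assumptions, not a documented schema), you could filter and count by prefix like this:

Bash
# Sketch only: assumes a JSON Lines export where each record carries its
# seed_batch prefix in a "seed_batch" field. Adjust to your actual export.

# Keep only production traffic
jq -c 'select(.seed_batch | startswith("live:"))' requests.jsonl > live_requests.jsonl

# Count requests per environment prefix
jq -r '.seed_batch | split(":")[0] + ":"' requests.jsonl | sort | uniq -c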

Best practices for environment separation

1. Always filter before analyzing

Debugging production issues? Filter to "Live mode"
Reviewing test results? Filter to "Test mode"
Validating a workflow change? Filter to "Playground (Production)"
Testing workflow structure? Filter to "Playground (Test mode)"
Looking at everything? Filter to "All requests" (rare)

2. Use test mode for integration tests

When running automated tests or CI/CD pipelines:

Bash
# Set test_mode to true in your API call
curl -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-workflow-id",
    "messages": [{"role": "user", "content": "test"}],
    "test_mode": true
  }'

This ensures test traffic:

  • Uses sample data instead of real providers (free)
  • Is tagged with test: prefix
  • Is filterable separately from production
  • Doesn't pollute your production analytics
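
In a CI pipeline, you can wrap that call in a small smoke test so the build fails when the test-mode request does. The script below is a minimal sketch that reuses the endpoint and test_mode flag shown above; the workflow ID and the HTTP-200 success check are assumptions to adapt to your own setup.

Bash
#!/usr/bin/env bash
# Minimal CI smoke test (sketch): send one test-mode request and fail the
# pipeline if the HTTP status is not 200. The workflow ID and success check
# are placeholders; adjust them for your workflow.
set -euo pipefail

status=$(curl -s -o /tmp/response.json -w "%{http_code}" \
  -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-workflow-id",
        "messages": [{"role": "user", "content": "CI smoke test"}],
        "test_mode": true
      }')

if [ "$status" -ne 200 ]; then
  echo "Test-mode request failed with HTTP $status" >&2
  cat /tmp/response.json >&2
  exit 1
fi
echo "Test-mode smoke test passed (HTTP $status)"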

3. Use playground for manual validation

Before deploying changes:

  1. Test mode first — Verify workflow structure (free, sample data)
  2. Production mode — Validate with real AI responses (uses credits)
  3. Deploy — Push to production
  4. Monitor Live mode — Watch for any issues

Each step produces logs under a different filter, keeping them organized.

4. Never analyze "All requests" for health metrics

When calculating error rates, latency, or costs:

Error rate across "All requests" = 5.2%
(Includes failed tests, which inflates the number)
 
Error rate across "Live mode" = 1.8%
(Actual production health)

Using "All requests" for metrics gives misleading results because test failures and playground experiments are counted alongside real production traffic.


Environment separation checklist

  • All CI/CD test calls use test_mode: true
  • Manual testing uses the Playground, not the production API
  • Health monitoring dashboards filter to "Live mode" only
  • Cost analysis filters to "Live mode" for accurate spending data
  • Debugging starts with the correct filter for the issue context

Next steps