Overview
When production logs are mixed with test and playground data, it becomes nearly impossible to assess the true health of your AI application. ModelRiver automatically tags every request with its source environment, letting you filter cleanly and keep your analysis accurate.
How environment tagging works
Every request is tagged with a `seed_batch` prefix that identifies its source:
| Prefix | Source | Description |
|---|---|---|
| `live:` | Production API | Real user requests via your API |
| `test:` | Test mode API | API calls with the `test_mode` flag |
| `pg:` | Console Playground | Production mode tests from the console |
| `pg_test_mode:` | Console Playground (Test) | Test mode runs from the console |
| `callback:` | Backend Callback | Callbacks from event-driven workflows |
| `pg_callback:` | Playground Callback | Simulated callbacks from console tests |
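If you export request logs for offline analysis (for example as JSON Lines), these prefixes make local filtering straightforward. A minimal sketch using jq, assuming a hypothetical `logs.jsonl` export where each record carries its tag in a `seed_batch` field:

```bash
# Keep only production traffic (hypothetical logs.jsonl export; seed_batch field assumed)
jq -c 'select(.seed_batch | startswith("live:"))' logs.jsonl > live_only.jsonl

# Count requests per environment prefix
jq -r '.seed_batch | split(":")[0]' logs.jsonl | sort | uniq -c
```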
Best practices for environment separation
1. Always filter before analyzing
- Debugging production issues? → Filter to "Live mode"
- Reviewing test results? → Filter to "Test mode"
- Validating a workflow change? → Filter to "Playground (Production)"
- Testing workflow structure? → Filter to "Playground (Test mode)"
- Looking at everything? → Filter to "All requests" (rare)

2. Use test mode for integration tests
When running automated tests or CI/CD pipelines:
```bash
# Set test_mode to true in your API call
curl -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -d '{
    "model": "your-workflow-id",
    "messages": [{"role": "user", "content": "test"}],
    "test_mode": true
  }'
```

This ensures test traffic:
- Uses sample data instead of real providers (free)
- Is tagged with the `test:` prefix
- Is filterable separately from production
- Doesn't pollute your production analytics
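In a CI pipeline, you might wrap the test-mode call in a step that fails the build when the request errors. A minimal sketch, assuming the endpoint returns a standard HTTP status code on success (the assertion logic is illustrative, not part of the ModelRiver API):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Send a test-mode request and capture the HTTP status (no real provider cost)
status=$(curl -s -o /tmp/response.json -w "%{http_code}" \
  -X POST https://api.modelriver.com/v1/chat/completions \
  -H "Authorization: Bearer $MODELRIVER_API_KEY" \
  -d '{
    "model": "your-workflow-id",
    "messages": [{"role": "user", "content": "integration test"}],
    "test_mode": true
  }')

# Fail the CI job if the workflow did not respond successfully (200 assumed)
if [ "$status" -ne 200 ]; then
  echo "Test-mode request failed with HTTP $status" >&2
  cat /tmp/response.json >&2
  exit 1
fi
```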
3. Use playground for manual validation
Before deploying changes:
- Test mode first — Verify workflow structure (free, sample data)
- Production mode — Validate with real AI responses (uses credits)
- Deploy — Push to production
- Monitor Live mode — Watch for any issues
Each step produces logs in a different filter, keeping them organized.
4. Never analyze "All requests" for health metrics
When calculating error rates, latency, or costs:
✗ Error rate across "All requests" = 5.2% (includes failed tests, which inflates the number)
✓ Error rate across "Live mode" = 1.8% (actual production health)

Using "All requests" for metrics gives misleading results because test failures and playground experiments are counted alongside real production traffic.
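To see why the filter matters, here is a sketch of the same calculation done locally over the hypothetical `logs.jsonl` export used above (the `seed_batch` and `status` fields are assumptions about the record shape, not a documented schema):

```bash
# Error rate over production traffic only (assumed fields: seed_batch, status)
jq -s '[.[] | select(.seed_batch | startswith("live:"))]
       | if length == 0 then "no live requests"
         else (map(select(.status == "error")) | length) * 100 / length
         end' logs.jsonl
```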
Environment separation checklist
- All CI/CD test calls use `test_mode: true`
- Manual testing uses the Playground, not the production API
- Health monitoring dashboards filter to "Live mode" only
- Cost analysis filters to "Live mode" for accurate spending data
- Debugging starts with the correct filter for the issue context
Next steps
- Token Optimization — Optimize costs
- Using Filters Effectively — Master the filter system
- Back to Best Practices — Return to the overview