
Sentry + ModelRiver

Capture AI errors, track slow requests, and monitor provider failover events, all inside your existing Sentry project.

Overview

Sentry is the leading error tracking and performance monitoring platform. By instrumenting your ModelRiver calls with Sentry spans and breadcrumbs, you can catch AI errors, monitor latency, and track provider failover alongside your existing application telemetry.
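
The examples below cover errors, spans, and breadcrumbs. Failover itself isn't shown explicitly; one lightweight way to surface it, assuming each workflow has a known primary model (the mapping and names below are hypothetical placeholders), is to compare the model reported in the response against the one you expected and tag the Sentry scope on a mismatch:

PYTHON
import sentry_sdk

# Hypothetical mapping of ModelRiver workflow -> expected primary model.
# Replace with however your workflows are actually configured.
PRIMARY_MODELS = {"support-chat": "gpt-4o"}

def record_failover(workflow: str, response) -> None:
    """Tag the Sentry scope when the serving model differs from the expected primary."""
    expected = PRIMARY_MODELS.get(workflow)
    served = getattr(response, "model", None)
    if expected and served and served != expected:
        sentry_sdk.set_tag("modelriver.failover", "true")
        sentry_sdk.add_breadcrumb(
            category="modelriver",
            message=f"Failover: {workflow} served by {served}, expected {expected}",
            level="warning",
        )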


Quick start

Install dependencies

BASH
pip install sentry-sdk openai

Python setup

PYTHON
import sentry_sdk
from openai import OpenAI

sentry_sdk.init(
    dsn="https://YOUR_SENTRY_DSN",
    traces_sample_rate=1.0,
)

client = OpenAI(
    base_url="https://api.modelriver.com/v1",
    api_key="mr_live_YOUR_API_KEY",
)
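
If you want to confirm the Sentry wiring before making any AI calls, sending a one-off test event is enough; remove it once the event shows up in your project:

PYTHON
# Optional: verify the DSN is working; this should appear in Sentry within seconds.
sentry_sdk.capture_message("ModelRiver + Sentry integration test")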

Error capture

PYTHON
import sentry_sdk
from openai import APIError, RateLimitError

def chat(workflow: str, messages: list) -> str:
    try:
        response = client.chat.completions.create(
            model=workflow,
            messages=messages,
        )
        return response.choices[0].message.content

    except RateLimitError as e:
        # Set the context before capturing so it is attached to this event.
        sentry_sdk.set_context("modelriver", {
            "workflow": workflow,
            "error_type": "rate_limit",
        })
        sentry_sdk.capture_exception(e)
        raise

    except APIError as e:
        sentry_sdk.set_context("modelriver", {
            "workflow": workflow,
            # status_code only exists on HTTP-status errors, so read it defensively.
            "status_code": getattr(e, "status_code", None),
            "error_type": "api_error",
        })
        sentry_sdk.capture_exception(e)
        raise
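
Note that set_context applies to the current scope, so the modelriver context can linger on unrelated events captured later in the same request. If that matters, sentry-sdk 2.x lets you keep it local with a forked scope (push_scope behaves similarly on 1.x); a minimal sketch:

PYTHON
import sentry_sdk

def capture_modelriver_error(e: Exception, workflow: str) -> None:
    # new_scope() keeps the extra context local to this one event (sentry-sdk 2.x).
    with sentry_sdk.new_scope() as scope:
        scope.set_context("modelriver", {"workflow": workflow})
        sentry_sdk.capture_exception(e)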

Performance spans

PYTHON
import sentry_sdk

def chat_with_tracing(workflow: str, messages: list) -> str:
    with sentry_sdk.start_span(op="ai.chat", description=f"ModelRiver: {workflow}") as span:
        span.set_data("workflow", workflow)
        span.set_data("message_count", len(messages))

        response = client.chat.completions.create(
            model=workflow,
            messages=messages,
        )

        span.set_data("tokens", response.usage.total_tokens)
        span.set_data("model", response.model)
        span.set_data("finish_reason", response.choices[0].finish_reason)

        return response.choices[0].message.content
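
Spans started with start_span are generally only recorded when there is an active transaction; Sentry's framework integrations (Django, Flask, FastAPI, and so on) start one per request automatically. If you call this from a background job or a plain script, wrap it in a transaction yourself, for example (the task and workflow names here are placeholders):

PYTHON
import sentry_sdk

# Outside an instrumented request, start a transaction so the ai.chat span
# has a trace to attach to.
with sentry_sdk.start_transaction(op="task", name="nightly-summaries"):
    chat_with_tracing("summarize-articles", [{"role": "user", "content": "Summarize today's news"}])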

Breadcrumbs

PYTHON
import sentry_sdk

def chat_with_breadcrumbs(workflow: str, messages: list) -> str:
    sentry_sdk.add_breadcrumb(
        category="modelriver",
        message=f"AI request to {workflow}",
        level="info",
        data={"workflow": workflow, "messages": len(messages)},
    )

    response = client.chat.completions.create(
        model=workflow,
        messages=messages,
    )

    sentry_sdk.add_breadcrumb(
        category="modelriver",
        message=f"AI response from {workflow}",
        level="info",
        data={
            "tokens": response.usage.total_tokens,
            "model": response.model,
        },
    )

    return response.choices[0].message.content

Node.js

JAVASCRIPT
import * as Sentry from "@sentry/node";
import OpenAI from "openai";

Sentry.init({ dsn: "https://YOUR_SENTRY_DSN", tracesSampleRate: 1.0 });

const client = new OpenAI({
  baseURL: "https://api.modelriver.com/v1",
  apiKey: "mr_live_YOUR_API_KEY",
});

async function chat(workflow, messages) {
  return Sentry.startSpan({ name: `ModelRiver: ${workflow}`, op: "ai.chat" }, async (span) => {
    try {
      const response = await client.chat.completions.create({
        model: workflow,
        messages,
      });

      span.setData("tokens", response.usage.total_tokens);
      return response.choices[0].message.content;
    } catch (error) {
      Sentry.captureException(error, {
        contexts: { modelriver: { workflow } },
      });
      throw error;
    }
  });
}

Best practices

  1. Capture all API errors: Catch APIError, RateLimitError, and AuthenticationError separately (a sketch for AuthenticationError follows this list)
  2. Add spans for performance: Track AI latency inside your existing traces
  3. Use breadcrumbs: Leave an audit trail leading up to errors
  4. Set context: Include workflow name, token count, and model in Sentry context
  5. Alert on error spikes: Set up Sentry alerts for ModelRiver error rate increases
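
For the first point, a minimal sketch of handling AuthenticationError in the same style as the error-capture example above (it assumes the same client, workflow, and messages as the earlier snippets):

PYTHON
from openai import AuthenticationError

try:
    response = client.chat.completions.create(model=workflow, messages=messages)
except AuthenticationError as e:
    # Usually indicates a bad or revoked ModelRiver API key; worth alerting on immediately.
    sentry_sdk.set_context("modelriver", {"workflow": workflow, "error_type": "authentication"})
    sentry_sdk.capture_exception(e)
    raise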

Next steps