
LangChain + ModelRiver

Point ChatOpenAI at ModelRiver with a one-line change. Every chain, agent, and RAG pipeline gets automatic failover, cost tracking, and structured outputs.

Overview

LangChain is Python's most popular LLM orchestration framework. Because LangChain's ChatOpenAI class accepts a custom base_url, you can point it at ModelRiver with zero additional dependencies.

What you get:

  • Every LangChain chain, agent, and tool call routes through ModelRiver
  • Automatic failover if your primary provider is down
  • Per-request cost and token tracking in Request Logs
  • Structured output enforcement at the workflow level

Quick start

Install dependencies

Bash
pip install langchain langchain-openai openai

Connect LangChain to ModelRiver

PYTHON
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    openai_api_base="https://api.modelriver.com/v1",
    openai_api_key="mr_live_YOUR_API_KEY",
    model="my-chat-workflow",  # ← your ModelRiver workflow name
    temperature=0.7,
    max_tokens=1000,
)

response = llm.invoke("What is ModelRiver?")
print(response.content)

That's it: two parameters changed from a standard ChatOpenAI setup.
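
Recent langchain-openai releases also accept base_url and api_key as shorter aliases for the same parameters; if you are on one of those versions, this is equivalent:

PYTHON
llm = ChatOpenAI(
    base_url="https://api.modelriver.com/v1",
    api_key="mr_live_YOUR_API_KEY",
    model="my-chat-workflow",
)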


Chains

Simple chain

PYTHON
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful marketing copywriter."),
    ("user", "{input}")
])

chain = prompt | llm

response = chain.invoke({"input": "Write a tagline for an AI routing platform"})
print(response.content)

Sequential chain

PYTHON
from langchain_core.output_parsers import StrOutputParser

# Chain 1: Generate ideas
idea_prompt = ChatPromptTemplate.from_messages([
    ("system", "Generate 3 creative product name ideas."),
    ("user", "Product: {product}")
])

# Chain 2: Pick the best
pick_prompt = ChatPromptTemplate.from_messages([
    ("system", "Pick the best name and explain why in one sentence."),
    ("user", "Options:\n{ideas}")
])

idea_chain = idea_prompt | llm | StrOutputParser()
pick_chain = pick_prompt | llm | StrOutputParser()

# Compose
full_chain = (
    {"ideas": idea_chain, "product": lambda x: x["product"]}
    | pick_chain
)

result = full_chain.invoke({"product": "AI-powered API gateway"})
print(result)

Agents with tools

PYTHON
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    # Your weather API call here
    return f"It's 22°C and sunny in {location}."

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # Your search API call here
    return f"Top result for '{query}': ModelRiver is an AI gateway..."

tools = [get_weather, search_web]

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools."),
    MessagesPlaceholder("chat_history", optional=True),
    ("user", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

result = executor.invoke({"input": "What's the weather in Paris?"})
print(result["output"])
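
Because the prompt declares an optional chat_history placeholder, you can pass prior turns straight into the executor. A minimal sketch (the history messages here are invented):

PYTHON
from langchain_core.messages import HumanMessage, AIMessage

chat_history = [
    HumanMessage(content="I'm planning a trip to Paris next week."),
    AIMessage(content="Sounds fun! Let me know how I can help."),
]

result = executor.invoke({
    "input": "What's the weather like there right now?",
    "chat_history": chat_history,
})
print(result["output"])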

RAG pipeline

PYTHON
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Embeddings can also go through ModelRiver if you have an embedding workflow
embeddings = OpenAIEmbeddings(
    openai_api_base="https://api.modelriver.com/v1",
    openai_api_key="mr_live_YOUR_API_KEY",
    model="my-embedding-workflow",
)

# Create a vector store from documents
texts = [
    "ModelRiver routes AI requests to multiple providers.",
    "Workflows configure provider, model, and fallbacks.",
    "Structured outputs guarantee JSON schema compliance.",
]
vectorstore = FAISS.from_texts(texts, embeddings)
retriever = vectorstore.as_retriever()

# RAG chain
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based on context:\n\n{context}"),
    ("user", "{question}")
])

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("How does ModelRiver handle failover?")
print(answer)
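
If you also want to see which documents grounded the answer, the same chain can be extended with RunnablePassthrough.assign so the retrieved context is returned alongside the generated text. A sketch building on the objects defined above:

PYTHON
from langchain_core.runnables import RunnablePassthrough

rag_with_sources = (
    {"context": retriever, "question": RunnablePassthrough()}
    | RunnablePassthrough.assign(answer=rag_prompt | llm | StrOutputParser())
)

result = rag_with_sources.invoke("How does ModelRiver handle failover?")
print(result["answer"])
print([doc.page_content for doc in result["context"]])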

Streaming

PYTHON
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    openai_api_base="https://api.modelriver.com/v1",
    openai_api_key="mr_live_YOUR_API_KEY",
    model="my-chat-workflow",
    streaming=True,
)

for chunk in llm.stream("Tell me a short story"):
    print(chunk.content, end="", flush=True)
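
The same client also exposes async streaming via astream, which is usually what you want behind an async web framework. A minimal sketch:

PYTHON
import asyncio

async def main():
    async for chunk in llm.astream("Tell me a short story"):
        print(chunk.content, end="", flush=True)

asyncio.run(main())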

Structured outputs

Use LangChain's with_structured_output with a ModelRiver workflow that has a structured output schema:

PYTHON
from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(description="Rating out of 10")
    summary: str = Field(description="One-sentence summary")

structured_llm = llm.with_structured_output(MovieReview)

review = structured_llm.invoke("Review the movie Inception")
print(f"{review.title}: {review.rating}/10: {review.summary}")
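
Because structured_llm is still a Runnable, it composes with prompt templates like any other chain step. A sketch with a hypothetical review_prompt:

PYTHON
from langchain_core.prompts import ChatPromptTemplate

review_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise film critic."),
    ("user", "Review the movie {movie}"),
])

review_chain = review_prompt | structured_llm
review = review_chain.invoke({"movie": "Inception"})
print(review.summary)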

Using different workflows per chain

One of ModelRiver's key advantages is routing different parts of your pipeline through different workflows (and therefore different models/providers):

PYTHON
# Fast model for classification
classifier = ChatOpenAI(
    openai_api_base="https://api.modelriver.com/v1",
    openai_api_key="mr_live_YOUR_API_KEY",
    model="fast-classifier",  # e.g., GPT-4o-mini
)

# Powerful model for generation
writer = ChatOpenAI(
    openai_api_base="https://api.modelriver.com/v1",
    openai_api_key="mr_live_YOUR_API_KEY",
    model="deep-writer",  # e.g., Claude 3.5 Sonnet
)

# Each stage of your pipeline uses the optimal model
# (classify_prompt and write_prompt are your own ChatPromptTemplates)
classify_chain = classify_prompt | classifier | StrOutputParser()
write_chain = write_prompt | writer | StrOutputParser()
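
Wiring the two stages together follows the same dict-plus-lambda pattern as the sequential chain above. A sketch that assumes classify_prompt produces an audience label and write_prompt expects {audience} and {input}:

PYTHON
# The classifier's output feeds the writer's {audience} variable
pipeline = (
    {"audience": classify_chain, "input": lambda x: x["input"]}
    | write_chain
)

print(pipeline.invoke({"input": "Explain how ModelRiver failover works"}))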

Best practices

  1. One workflow per use case: Use different ModelRiver workflows for classification vs generation vs summarisation
  2. Enable structured outputs: Define schemas in ModelRiver to guarantee JSON shapes across chains
  3. Monitor in Request Logs: Every LangChain call appears in Observability with full metadata
  4. Use fallbacks: Configure backup providers in ModelRiver rather than LangChain's built-in fallback (centralised management)
  5. Stream for user-facing chains: Set streaming=True for interactive applications

Next steps