Haystack + ModelRiver

Production-ready NLP pipelines backed by ModelRiver. Automatic failover, cost tracking, and structured outputs for every Haystack component.

Overview

Haystack is deepset's open-source framework for building production NLP systems: search engines, question-answering systems, and conversational AI. Its OpenAIChatGenerator component accepts a custom API base URL, which makes ModelRiver integration seamless.

What you get:

  • Every Haystack generator call routes through ModelRiver
  • Automatic failover during document processing pipelines
  • Token and cost tracking per pipeline run
  • Provider switching without redeploying your pipeline

Quick start

Install dependencies

Bash
pip install haystack-ai

Connect Haystack to ModelRiver

Python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-chat-workflow",
)

messages = [ChatMessage.from_user("What is ModelRiver?")]
response = generator.run(messages=messages)
print(response["replies"][0].text)
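If you'd rather not hard-code the key, Haystack's Secret helper can resolve it from the environment. A minimal sketch, assuming you export the key under a variable named MODELRIVER_API_KEY (the variable name is our choice, not something ModelRiver requires):

Python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.utils import Secret

# The key is resolved at call time from the MODELRIVER_API_KEY environment variable
generator = OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_env_var("MODELRIVER_API_KEY"),
    model="my-chat-workflow",
)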

RAG pipeline

Python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-chat-workflow",
)

# Declare the template variables up front so the pipeline accepts them as inputs
# when the template itself is supplied at run time.
prompt_builder = ChatPromptBuilder(variables=["context", "question"])

rag_template = [
    ChatMessage.from_system("Answer the question based on the following context:\n{{ context }}"),
    ChatMessage.from_user("{{ question }}"),
]

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", generator)
pipe.connect("prompt_builder.prompt", "llm.messages")

result = pipe.run({
    "prompt_builder": {
        "template": rag_template,
        "context": "ModelRiver routes AI requests across multiple providers with automatic failover.",
        "question": "How does ModelRiver handle failover?",
    }
})

print(result["llm"]["replies"][0].text)
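The pipeline above injects the context as a plain string. In a fuller RAG setup, a retriever supplies that context from a document store. A minimal sketch using Haystack's in-memory store and BM25 retriever (the documents and workflow name are placeholders):

Python
from haystack import Document, Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.utils import Secret

# Index a few documents in memory
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="ModelRiver routes AI requests across multiple providers."),
    Document(content="ModelRiver retries failed requests against fallback providers."),
])

# Template variables ({{ documents }}, {{ question }}) are inferred from the
# template because it is passed at construction time.
template = [
    ChatMessage.from_system(
        "Answer the question based on the following context:\n"
        "{% for doc in documents %}{{ doc.content }}\n{% endfor %}"
    ),
    ChatMessage.from_user("{{ question }}"),
]

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipe.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipe.add_component("llm", OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-chat-workflow",
))
pipe.connect("retriever.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.messages")

question = "How does ModelRiver handle failover?"
result = pipe.run({
    "retriever": {"query": question},
    "prompt_builder": {"question": question},
})
print(result["llm"]["replies"][0].text)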

Streaming

Python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-chat-workflow",
    streaming_callback=lambda chunk: print(chunk.content, end="", flush=True),
)

messages = [ChatMessage.from_user("Tell me a short story")]
generator.run(messages=messages)
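The lambda above prints straight to stdout; for a UI you would typically collect chunks and forward them to the client. A small sketch with a named callback (the accumulation logic is illustrative, not part of Haystack):

Python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage, StreamingChunk
from haystack.utils import Secret

chunks = []

def on_chunk(chunk: StreamingChunk) -> None:
    # Forward chunk.content to a websocket, SSE stream, etc.; here we accumulate it.
    chunks.append(chunk.content)

generator = OpenAIChatGenerator(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-chat-workflow",
    streaming_callback=on_chunk,
)

generator.run(messages=[ChatMessage.from_user("Tell me a short story")])
full_reply = "".join(chunks)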

Embeddings

Python
from haystack.components.embedders import OpenAITextEmbedder, OpenAIDocumentEmbedder
from haystack.utils import Secret

# For individual texts
text_embedder = OpenAITextEmbedder(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-embedding-workflow",
)

result = text_embedder.run(text="ModelRiver is an AI gateway")
print(f"Embedding dimension: {len(result['embedding'])}")

# For documents
doc_embedder = OpenAIDocumentEmbedder(
    api_base_url="https://api.modelriver.com/v1",
    api_key=Secret.from_token("mr_live_YOUR_API_KEY"),
    model="my-embedding-workflow",
)
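
The document embedder attaches an embedding to each Document it processes. Continuing the snippet above, a quick usage sketch (the document contents are placeholders):

Python
from haystack import Document

docs = [
    Document(content="ModelRiver is an AI gateway"),
    Document(content="Haystack builds production NLP pipelines"),
]

# Returns the same documents with their .embedding field populated
result = doc_embedder.run(documents=docs)
print(f"Embedding dimension: {len(result['documents'][0].embedding)}")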

Best practices

  1. Use separate workflows for embedding vs. generation components (see the sketch after this list)
  2. Monitor pipeline costs in Request Logs: document processing can be token-heavy
  3. Configure fallbacks for pipelines that must not fail
  4. Stream for interactive Q&A and chatbot pipelines
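
For practice 1, a minimal sketch of the two-workflow split (workflow names are placeholders for whatever you configure in ModelRiver):

Python
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.utils import Secret

MODELRIVER_URL = "https://api.modelriver.com/v1"
API_KEY = Secret.from_token("mr_live_YOUR_API_KEY")

# Separate workflows keep embedding and generation traffic independently
# routed, tracked, and failed over in ModelRiver.
embedder = OpenAITextEmbedder(
    api_base_url=MODELRIVER_URL, api_key=API_KEY, model="my-embedding-workflow"
)
generator = OpenAIChatGenerator(
    api_base_url=MODELRIVER_URL, api_key=API_KEY, model="my-chat-workflow"
)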

Next steps