Weaviate + ModelRiver

Open-source vector search powered by ModelRiver embeddings. Automatic failover for both embedding generation and RAG queries.

Overview

Weaviate is an open-source vector database that lets you bring your own vectors. By routing both embedding generation and generative calls through ModelRiver, you get provider failover across your entire search pipeline.


Quick start

Install dependencies

Bash
pip install weaviate-client openai

Setup

PYTHON
import weaviate
from openai import OpenAI

# ModelRiver client
client = OpenAI(
    base_url="https://api.modelriver.com/v1",
    api_key="mr_live_YOUR_API_KEY",
)

# Weaviate client
wv_client = weaviate.connect_to_local()  # or connect_to_weaviate_cloud()
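
The v4 Weaviate client keeps an open connection, so close it when the script finishes; the client can also be used as a context manager. A minimal sketch:

PYTHON
# Close the connection when you are done...
wv_client.close()

# ...or scope it with a context manager
with weaviate.connect_to_local() as wv_client:
    pass  # create collections, ingest, and query inside this block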

Create collection with ModelRiver embeddings

PYTHON
def embed(texts: list[str]) -> list[list[float]]:
    """Generate embeddings via ModelRiver."""
    response = client.embeddings.create(
        model="my-embedding-workflow",
        input=texts,
    )
    return [d.embedding for d in response.data]

# Create collection (no built-in vectorizer; we supply our own vectors)
collection = wv_client.collections.create(
    name="Documents",
    vectorizer_config=weaviate.classes.config.Configure.Vectorizer.none(),
    properties=[
        weaviate.classes.config.Property(name="text", data_type=weaviate.classes.config.DataType.TEXT),
        weaviate.classes.config.Property(name="source", data_type=weaviate.classes.config.DataType.TEXT),
    ],
)
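
collections.create raises an error if a collection named Documents already exists, so on repeated runs it is usually cleaner to fetch the existing collection. A small sketch using the v4 client's exists/get helpers:

PYTHON
# On later runs, reuse the existing collection instead of recreating it
if wv_client.collections.exists("Documents"):
    collection = wv_client.collections.get("Documents")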

Ingest documents

PYTHON
documents = [
    {"text": "ModelRiver routes AI requests across providers.", "source": "docs"},
    {"text": "Workflows configure provider and fallback settings.", "source": "docs"},
    {"text": "Structured outputs guarantee JSON schema compliance.", "source": "docs"},
]

# Generate embeddings and insert
texts = [doc["text"] for doc in documents]
vectors = embed(texts)

with collection.batch.dynamic() as batch:
    for doc, vector in zip(documents, vectors):
        batch.add_object(
            properties=doc,
            vector=vector,
        )
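
Batch inserts do not raise on individual object failures, so checking the failed-object list after the batch context exits is a quick way to confirm everything landed; the aggregate count below is an extra sanity check.

PYTHON
# Report any objects that failed to import
for failed in collection.batch.failed_objects:
    print(f"Import failed: {failed.message}")

# Sanity check: confirm the object count matches what was inserted
total = collection.aggregate.over_all(total_count=True).total_count
print(f"Objects in collection: {total}")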

Semantic search + RAG

PYTHON
def search_and_answer(question: str, top_k: int = 3) -> str:
    """Search Weaviate, then generate an answer via ModelRiver."""

    # Embed the query
    query_vector = embed([question])[0]

    # Search Weaviate
    results = collection.query.near_vector(
        near_vector=query_vector,
        limit=top_k,
        return_properties=["text", "source"],
    )

    # Build context
    context = "\n\n".join([obj.properties["text"] for obj in results.objects])

    # Generate answer
    response = client.chat.completions.create(
        model="my-chat-workflow",
        messages=[
            {"role": "system", "content": f"Answer based on context:\n\n{context}"},
            {"role": "user", "content": question},
        ],
    )

    return response.choices[0].message.content


answer = search_and_answer("How does ModelRiver handle failover?")
print(answer)
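
Since each object stores a source property, it is straightforward to return citations alongside the answer. The sketch below (search_with_sources is an illustrative helper, not part of the guide above) also requests the vector distance via the v4 MetadataQuery so low-relevance matches can be filtered or surfaced to the user:

PYTHON
from weaviate.classes.query import MetadataQuery

def search_with_sources(question: str, top_k: int = 3) -> list[dict]:
    """Return matching chunks with their source and vector distance."""
    query_vector = embed([question])[0]
    results = collection.query.near_vector(
        near_vector=query_vector,
        limit=top_k,
        return_properties=["text", "source"],
        return_metadata=MetadataQuery(distance=True),
    )
    return [
        {
            "text": obj.properties["text"],
            "source": obj.properties["source"],
            "distance": obj.metadata.distance,
        }
        for obj in results.objects
    ]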

Best practices

  1. Separate embedding and chat workflows: Scale and tune independently
  2. Use batch imports for large datasets: Weaviate's batch API is much faster than inserting objects one at a time (see the chunked-ingest sketch after this list)
  3. Store source metadata: Enables citation and provenance tracking
  4. Monitor embedding costs: Track in Request Logs
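
For larger corpora, points 2 and 4 mostly come down to chunking: keep each embeddings request to a bounded number of texts so a single oversized call cannot fail or dominate costs, and let Weaviate's dynamic batcher handle the inserts. A rough sketch; the batch size of 100 and the ingest helper are assumptions to tune against your embedding model's input limits:

PYTHON
EMBED_BATCH_SIZE = 100  # assumed value; tune to your embedding model's limits

def ingest(docs: list[dict]) -> None:
    """Embed and insert documents in bounded chunks."""
    with collection.batch.dynamic() as batch:
        for start in range(0, len(docs), EMBED_BATCH_SIZE):
            chunk = docs[start:start + EMBED_BATCH_SIZE]
            vectors = embed([d["text"] for d in chunk])
            for doc, vector in zip(chunk, vectors):
                batch.add_object(properties=doc, vector=vector)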

Next steps