Multi-query RAG

Pipeline that aggregates data from multiple database providers.

Dynamiq Team

January 23, 2025

Chat

Finance

Less than 1 min read

What is multi-query RAG?

The Multi-query RAG template rephrases a user query into several variations and retrieves data from multiple sources for each one, which helps generate more accurate and contextually relevant responses.

Key features of the multi-query RAG workflow

Generates multiple query variations to capture a broader range of data.
Leverages multiple data sources to retrieve information.
Enables personalized, domain-specific content generation.

Who can benefit from the multi-query RAG template?

Businesses can improve decision-making by retrieving insights from diverse data sources.
Customer Support Teams can provide more accurate and contextually relevant responses by integrating data from various systems.
Internal Teams can enrich company chats by accessing knowledge stored across multiple internal sources.

How does the multi-query RAG template operate?

Two versions of a query are generated.
Each query version is converted into a vector representation (embedding).
The generated embeddings are used to search across multiple knowledge bases (Weaviate and Pinecone).
A response is generated by combining the information retrieved from the two data sources.
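
The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the template's actual implementation: `embed` is a toy bag-of-words embedding standing in for a real embedding model, `InMemoryStore` is a hypothetical stand-in for a vector database such as Weaviate or Pinecone, and `rephrase` is a placeholder for the LLM call that would normally produce the second query version.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Hypothetical stand-in for a vector database (not a real client)."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = [(doc, embed(doc)) for doc in docs]

    def search(self, query_vec, top_k=2):
        scored = [(cosine(query_vec, vec), doc) for doc, vec in self.docs]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(score, doc) for score, doc in scored[:top_k] if score > 0]


def rephrase(query):
    """Placeholder for the LLM rephrasing step: returns two query versions."""
    return [query, query.replace("revenue", "sales income")]


def multi_query_retrieve(query, stores, top_k=2):
    """Search every store with every query version; keep each doc's best score."""
    best = {}
    for version in rephrase(query):
        vec = embed(version)
        for store in stores:
            for score, doc in store.search(vec, top_k):
                if score > best.get(doc, (0.0, None))[0]:
                    best[doc] = (score, store.name)
    ranked = sorted(best.items(), key=lambda item: item[1][0], reverse=True)
    return [(doc, source, score) for doc, (score, source) in ranked]


def build_prompt(query, results):
    """Combine the retrieved documents into one context for the generator."""
    context = "\n".join(f"[{source}] {doc}" for doc, source, _ in results)
    return f"Answer using the context below.\n{context}\nQuestion: {query}"


# Assumed sample data for illustration only.
store_a = InMemoryStore("store_a", [
    "Quarterly revenue grew 12 percent",
    "Office opening hours are nine to five",
])
store_b = InMemoryStore("store_b", [
    "Sales income rose in the third quarter",
    "The cafeteria menu changes weekly",
])

results = multi_query_retrieve("revenue growth last quarter", [store_a, store_b])
print(build_prompt("revenue growth last quarter", results))
```

In this sketch, the rephrased version is what gives the document in `store_b` a strong match, since the original wording shares only one term with it. Keeping the best score per document also removes duplicates when both query versions retrieve the same text.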

Customizing the multi-query RAG workflow for your needs

Integrate additional data sources for more comprehensive insights.
Set up streaming for a better user experience.
Switch to alternative models for embedding or response generation.
Use different storage solutions (such as PgVector, Qdrant, Chroma, or Milvus).
Update prompts to ensure the tone and style remain accurate and aligned with user needs.

Performance metrics and monitoring

Monitor the latency and accuracy of retrieved documents, as well as the tone and correctness of the final responses, using both custom and prebuilt metrics.
Monitor the number of tokens and the cost of inference.
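
As a rough illustration of custom latency, token, and cost tracking, the sketch below records per-step timings and estimates inference cost. Everything here is an assumption made for the example: `RunMetrics` is a hypothetical helper rather than part of any platform API, the whitespace-based token count is only an approximation of a real tokenizer, and the per-1K-token prices are placeholder values.

```python
import time
from dataclasses import dataclass, field


def approx_tokens(text):
    """Rough token estimate by whitespace splitting (real tokenizers differ)."""
    return len(text.split())


@dataclass
class RunMetrics:
    """Hypothetical per-run collector for latency and token usage."""
    latencies: dict = field(default_factory=dict)
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def timed(self, step, fn, *args, **kwargs):
        """Run fn, store its wall-clock duration in seconds under `step`."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.latencies[step] = time.perf_counter() - start
        return result

    def add_usage(self, prompt, completion):
        """Accumulate approximate token counts for one model call."""
        self.prompt_tokens += approx_tokens(prompt)
        self.completion_tokens += approx_tokens(completion)

    def cost(self, prompt_price_per_1k, completion_price_per_1k):
        """Estimated cost, given assumed prices per 1,000 tokens."""
        return (self.prompt_tokens / 1000 * prompt_price_per_1k
                + self.completion_tokens / 1000 * completion_price_per_1k)


metrics = RunMetrics()
# `sorted` stands in for a retrieval step; any callable can be timed this way.
ranked = metrics.timed("retrieval", sorted, [3, 1, 2])
metrics.add_usage("What was revenue growth", "It grew")
# Prices are placeholders, not real provider rates.
print(metrics.latencies, metrics.cost(1.0, 2.0))
```

Accuracy, tone, and correctness need evaluation metrics beyond simple counters, so they are outside the scope of this sketch.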