Multi-agent RAG

🎯 About This Repo

multi-agent-rag is a modular, extensible framework for building advanced Retrieval-Augmented Generation (RAG) systems powered by multiple collaborative AI agents. By combining modern LLM orchestration with efficient vector search and local model serving, it enables scalable, intelligent workflows that seamlessly retrieve, reason, and generate high-quality responses.

🛠️ Built With

The repository leverages a powerful, modern AI stack:

LlamaIndex – For structured data ingestion and advanced retrieval pipelines.
ChromaDB – As a lightweight, high-performance vector database.
Ollama – For running local LLMs dedicated to planning, reasoning, and critiquing.
CrossEncoder – To score and re-rank retrieved text chunks against the query.
pdfplumber – For text and table extraction from PDFs.
Typer – For building intuitive, production-ready CLI interfaces.
Rich – For beautiful, enhanced terminal output and easier debugging.

🚀 Key Features

Multi-Agent Collaboration: Employs specialized, collaborating agents to handle complex retrieval and reasoning tasks.
Modular & Extensible: Built from the ground up to let you easily swap models, vector databases, and orchestration tools.
Local & Scalable: Integrated with local model serving and optimized vector search for secure, high-performance, and cost-effective deployment.

🧩 Repository Structure

multi-agent-rag/
│
├── main.py
├── ingest.py
├── query.py
├── settings.py
├── embedding.py
├── llm.py
├── skill_registry.py
├── server.js
├── package.json
├── package-lock.json
├── public/
│   ├── index.html
│   └── tifo.txt
├── agents/
│   ├── router.py
│   ├── retriever.py
│   ├── reranker.py
│   ├── reasoner.py
│   └── planner.py
├── rag/
│   ├── index.py
│   └── loader.py
├── vector_database/
├── files/
├── skills/
│   ├── rag_context_critic.md
│   └── rag_context_qa.md
├── LICENSE
├── .gitignore
├── requirements.txt
└── README.md

🔮 Multi-agent RAG Pipeline

    Query Input
            ↓
    Retriever.retrieve()    ----    (Vector, Keyword, or Hybrid)
            ↓
    Top Vector Chunks
            ↓
    Deduplication           ----    (Removes redundant context early)
            ↓
    Cross-Encoder Reranker  ----    (Computes high-quality relevance)
            ↓
    Top Reranked Chunks
            ↓
    Prompt Construction     ----    (Context formatting & system prompts)
            ↓
    LLM Grounded Generation
            ↓
    Optional Critic
            ↓
    Answer & Citations
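
The middle stages of the pipeline above (deduplication, reranking, prompt construction) can be sketched in plain Python. This is a minimal illustration, not the repository's code: `Chunk`, `deduplicate`, `rerank`, `build_prompt`, and the toy `overlap_score` scorer are hypothetical names, and the actual reranker uses a cross-encoder model rather than word overlap.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    """A retrieved text chunk plus its source file (hypothetical shape)."""
    text: str
    source: str


def deduplicate(chunks: Sequence[Chunk]) -> List[Chunk]:
    """Drop chunks whose case/whitespace-normalized text was already seen."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = " ".join(chunk.text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


def rerank(
    query: str,
    chunks: Sequence[Chunk],
    score_pairs: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: int = 3,
) -> List[Chunk]:
    """Score every (query, chunk text) pair and keep the top_k best chunks."""
    if not chunks:
        return []
    scores = score_pairs([(query, c.text) for c in chunks])
    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]


def build_prompt(query: str, chunks: Sequence[Chunk]) -> str:
    """Number each chunk so the answer can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the context below and cite sources like [1].\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def overlap_score(pairs: List[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer: counts query words that also occur in the chunk."""
    return [
        float(len(set(q.lower().split()) & set(t.lower().split())))
        for q, t in pairs
    ]
```

Any callable that maps a list of `(query, text)` pairs to scores can be passed as `score_pairs`; a cross-encoder's pair-scoring method would be the natural replacement for `overlap_score`.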

🔭 Agent Model Configuration

+----------+----------------+---------+-------------------------------------+-----------------------------------+
|  AGENT   |  MODEL         |  SIZE   |  STRENGTHS                          |  WEAKNESSES                       |
+----------+----------------+---------+-------------------------------------+-----------------------------------+
| Planner  | Llama 3 (8B)   | 4.7 GB  | - Excellent instruction following   | - Can be verbose                  |
|          |                |         | - Great at JSON/structured output   | - Adds unnecessary conversational |
|          |                |         | - Strong decomposition skills       |   pre-amble                       |
+----------+----------------+---------+-------------------------------------+-----------------------------------+
| Reasoner | Mistral (7B)   | 4.1 GB  | - Dense and efficient               | - Can be overly succinct          |
|          |                |         | - High logical "snappiness"         | - Might lack nuance in very       |
|          |                |         | - Excellent context management      |   complex logic                   |
+----------+----------------+---------+-------------------------------------+-----------------------------------+
| Critic   | Qwen 3 (8B)    | 5.2 GB  | - Exceptional fact-checking         | - Formatting defaults can vary    |
|          |                |         | - High logic performance            | - Different prompt sensitivity    |
|          |                |         | - Diverse training data perspective |   than Llama/Mistral              |
+----------+----------------+---------+-------------------------------------+-----------------------------------+
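
One way to keep this agent-to-model assignment in code is a small lookup table. The sketch below is an assumption about how such a mapping could look, not the contents of `settings.py`: `AgentModel`, `AGENT_MODELS`, `model_for`, and `total_size_gb` are hypothetical names, the Ollama tags are the ones used by the `ollama pull` commands in the installation steps, and the sizes are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentModel:
    agent: str
    ollama_tag: str  # tag as passed to `ollama pull`
    size_gb: float   # SIZE column of the table


# Hypothetical registry mirroring the table above.
AGENT_MODELS = {
    "planner": AgentModel("planner", "llama3", 4.7),
    "reasoner": AgentModel("reasoner", "mistral", 4.1),
    "critic": AgentModel("critic", "qwen3:8b", 5.2),
}


def model_for(agent: str) -> str:
    """Return the Ollama tag for an agent role; fail loudly on unknown roles."""
    try:
        return AGENT_MODELS[agent.lower()].ollama_tag
    except KeyError:
        raise ValueError(
            f"unknown agent {agent!r}; expected one of {sorted(AGENT_MODELS)}"
        ) from None


def total_size_gb() -> float:
    """Combined size of all three models, rounded to one decimal place."""
    return round(sum(m.size_gb for m in AGENT_MODELS.values()), 1)
```

Keeping the mapping in one place means swapping a model only requires changing a single tag.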

💻 Installation and Usage

Clone the repository:

```
git clone https://github.com/FlyingMatrix/multi-agent-rag.git
cd ./multi-agent-rag
```

Create and activate a virtual environment (optional):

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install the required dependencies:
```
pip install -r requirements.txt
```

Pull the Llama 3 (8B), Mistral (7B), and Qwen 3 (8B) models locally:

```
ollama pull llama3
ollama pull mistral
ollama pull qwen3:8b
ollama list
```

Copy the target documents to be ingested into the designated folders below:
```
./files/pdf
./files/markdown
```
Ingest documents from the files folder and its subfolders to generate the vector database:
```
python main.py ingest ./files/pdfs
```
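
Since ingestion reads from the files folder and its subfolders, the file discovery step can be sketched as a recursive walk. This is a minimal illustration under assumptions: `discover_documents` and `SUPPORTED_SUFFIXES` are hypothetical names, and the set of supported file types is inferred from the `pdf` and `markdown` folder names rather than from the repository code.

```python
from pathlib import Path
from typing import Dict, List

# Assumed mapping from file suffix to document kind.
SUPPORTED_SUFFIXES = {".pdf": "pdf", ".md": "markdown"}


def discover_documents(root: str) -> Dict[str, List[Path]]:
    """Recursively collect supported files under root, grouped by kind."""
    found: Dict[str, List[Path]] = {
        kind: [] for kind in SUPPORTED_SUFFIXES.values()
    }
    for path in sorted(Path(root).rglob("*")):
        # Suffix matching is case-insensitive; unsupported files are skipped.
        kind = SUPPORTED_SUFFIXES.get(path.suffix.lower())
        if kind and path.is_file():
            found[kind].append(path)
    return found
```

Grouping by kind lets each file type be routed to its own loader, for example a PDF extractor for the `pdf` group.
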
Query the multi-agent RAG system from the command line:
```
python main.py query "<your query>"
```
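
The `ingest` and `query` subcommands used above follow a common two-subcommand CLI layout. The repository builds its CLI with Typer; the sketch below deliberately uses the standard library's `argparse` instead so that it runs without extra dependencies. The function names and the returned strings are placeholders, not the behavior of `main.py`.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `ingest <path>` and `query <question>` subcommands."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    ingest = sub.add_parser("ingest", help="build the vector database from a folder")
    ingest.add_argument("path", help="folder containing the documents")

    query = sub.add_parser("query", help="ask a question against the ingested data")
    query.add_argument("question", help="natural-language query")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Dispatch to a subcommand; returns a placeholder string for illustration."""
    args = build_parser().parse_args(argv)
    if args.command == "ingest":
        # Placeholder: the real command would load, embed, and index documents.
        return f"ingest:{args.path}"
    # Placeholder: the real command would run the retrieval and generation pipeline.
    return f"query:{args.question}"
```

Calling `main()` with no argument parses `sys.argv`, which is how a script entry point would use it.
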
Alternatively, access the multi-agent RAG system via a web-based UI:
```
node server.js
```
