This project implements a Retrieval-Augmented Generation (RAG) system using multiple AI agents orchestrated by the PydanticAI library. It leverages Google's Gemini models for advanced language understanding and generation, Google's embedding API for semantic search capabilities, and Qdrant as a high-performance vector database for storing and retrieving knowledge.
The system employs a multi-agent approach where different agents handle specific tasks. A Router Agent first determines the best strategy (vector search vs. web search) to answer a user query. Based on this, either a specialized RAG Agent or a Web Search Agent takes over.
- Multi-Agent Design: Utilizes separate agents for routing, knowledge retrieval (RAG), and web searching, allowing for specialized logic and prompts.
- Intelligent Routing: A Router Agent uses a powerful Gemini model (e.g., `gemini-1.5-pro`) to decide whether to query the internal knowledge base or search the web based on the user's question.
- Retrieval-Augmented Generation (RAG): A RAG Agent retrieves relevant text chunks from a Qdrant vector database using semantic similarity search (powered by Google Embeddings) and generates answers strictly based on the retrieved context.
- Web Search Capability: A Web Search Agent uses DuckDuckGo to find current information or general knowledge answers when the internal knowledge base is not suitable.
- PydanticAI Orchestration: Leverages PydanticAI for defining agents, managing tools, handling structured output (for routing), and simplifying interaction with language models.
- Google Cloud Integration: Natively uses Google Gemini large language models and Google's embedding API.
- Qdrant Vector Database: Employs Qdrant for efficient storage and fast retrieval of text embeddings.
- Configurable: Easily configure API keys, model names, Qdrant settings, and agent parameters via `.env` and `config.py`.
- Asynchronous: Built using Python's `asyncio` for potentially concurrent operations.
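To make the routing feature concrete: the Router Agent returns a structured decision object that downstream code dispatches on. A minimal stdlib sketch of that idea (the real project uses PydanticAI's structured output; `RouteDecision` and the keyword heuristic below are illustrative assumptions, not the project's actual code):

```python
from dataclasses import dataclass
from typing import Literal, Set

Route = Literal["vector_search", "web_search", "direct_answer"]

@dataclass
class RouteDecision:
    """Illustrative stand-in for the Router Agent's structured output."""
    route: str
    reasoning: str

def heuristic_route(query: str, kb_topics: Set[str]) -> RouteDecision:
    """Toy fallback router: pick the knowledge base when the query
    mentions an ingested topic, otherwise fall back to web search."""
    q = query.lower()
    if any(topic in q for topic in kb_topics):
        return RouteDecision("vector_search", "query mentions an ingested topic")
    return RouteDecision("web_search", "no ingested topic matched")

decision = heuristic_route("What embedding models does the Gemini API offer?",
                           kb_topics={"gemini", "embedding"})
print(decision.route)  # vector_search
```

In the actual system an LLM makes this decision, but the dispatch-on-a-typed-decision pattern is the same.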
- Prerequisites:
  - Python 3.8 or newer installed.
  - Git installed (for cloning; optional if downloading).
  - Docker installed and running (if using Qdrant locally via Docker).
- Get the Code: Clone the repository or download the project files into a directory (e.g., `Multi-Agent-RAG`).

  ```bash
  git clone https://github.com/raoofaltaher/Multi-Agent-RAG  # Or download/extract the zip
  cd Multi-Agent-RAG
  ```
- Create Virtual Environment: Isolate project dependencies using a virtual environment.

  ```bash
  python -m venv venv
  ```
- Activate Virtual Environment:
  - Windows: `.\venv\Scripts\activate`
  - macOS/Linux: `source venv/bin/activate`

  (Your terminal prompt should now show `(venv)`.)
- Install Dependencies: Install all required Python libraries.

  ```bash
  pip install -r requirements.txt
  ```
- Configure `.env` File:
  - Create a file named `.env` in the project's root directory (`Multi-Agent-RAG/`).
  - Open `.env` and add your Google API Key. This key is essential for both generating embeddings and running the Gemini models used by the agents.

    ```bash
    # --- .env ---
    GOOGLE_API_KEY="YOUR_ACTUAL_GOOGLE_API_KEY_HERE"

    # Optional: Uncomment and set if Qdrant is not at localhost:6333
    # QDRANT_URL="http://your_qdrant_host:6333"

    # Optional: Uncomment and set if using Qdrant Cloud or a local setup with authentication
    # QDRANT_API_KEY="your_qdrant_api_key_if_needed"
    ```

  - CRITICAL: Replace `"YOUR_ACTUAL_GOOGLE_API_KEY_HERE"` with your real key obtained from Google AI Studio.
  - Verify Permissions: Ensure the API key is active and the associated Google Cloud project has the "Generative Language API" enabled. Check under "APIs & Services" in your Google Cloud Console.
  - Security: The `.gitignore` file prevents this file from being committed to Git. Keep your keys secure.
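For orientation, `.env` files are simple `KEY="value"` lines; in practice a library such as python-dotenv loads them. Purely to illustrate the format, here is a minimal stdlib parser (a hypothetical helper, not part of the project):

```python
def load_env(text):
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values

sample = '''
# --- .env ---
GOOGLE_API_KEY="YOUR_ACTUAL_GOOGLE_API_KEY_HERE"
# QDRANT_URL="http://your_qdrant_host:6333"
'''
env = load_env(sample)
print(env["GOOGLE_API_KEY"])  # YOUR_ACTUAL_GOOGLE_API_KEY_HERE
```

Note that commented-out keys (like the optional `QDRANT_URL` line) are ignored until you uncomment them.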
- Run Qdrant: Start a Qdrant vector database instance. For local development, Docker is recommended:

  ```bash
  docker run -p 6333:6333 qdrant/qdrant
  ```

  Ensure Qdrant is accessible at the URL specified in `.env` (defaults to `http://localhost:6333`). Keep this terminal running in the background.
- Ingest Data (Load Knowledge Base):
  - Run the `ingest_data.py` script to fetch data from a URL, generate embeddings, and load it into Qdrant. This requires a valid Google API key configured in `.env`.
  - Make sure your virtual environment is active (`(venv)` prefix in prompt).
  - From the `Multi-Agent-RAG` directory:

    ```bash
    # Ingest the default URL (currently the Google Gemini docs)
    python ingest_data.py

    # --- OR ---
    # Ingest a different URL (e.g., the Meta Llama 3 blog post)
    python ingest_data.py "https://ai.meta.com/blog/meta-llama-3/"

    # --- OR ---
    # Ingest another relevant page
    python ingest_data.py "url_of_your_choice"
    ```

  - Run this script once for each URL you want in the RAG agent's knowledge base; data from multiple runs is added to the same Qdrant collection.
  - If you encounter API key errors here, re-verify your `.env` file and your Google API key status/permissions.
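Under the hood, ingestion splits each fetched page into overlapping chunks before embedding them (governed by `CHUNK_TOKEN_SIZE` and `CHUNK_OVERLAP` in `config.py`). A simplified word-based sketch of the idea; the word-level splitting here is a stand-in for the project's token-based splitter:

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into word chunks of `chunk_size`, each sharing
    `overlap` trailing words with the previous chunk so that context
    at chunk boundaries is not lost."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks

chunks = chunk_words("one two three four five six seven eight nine ten",
                     chunk_size=4, overlap=1)
print(chunks)
```

Each chunk overlaps the previous one by `overlap` words, which is why a fact straddling a chunk boundary can still be retrieved whole.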
- Run the Main Application (Chat Interface):
  - Starts the interactive chat loop where you can ask questions.
  - Make sure your virtual environment is active.
  - From the `Multi-Agent-RAG` directory:

    ```bash
    python main.py
    ```

  - Wait for the system to initialize (Qdrant connection, agent creation).
  - You will see the `>` prompt. Type your question and press Enter.
  - Example queries:
    - `What embedding models does the Gemini API offer?` (tests RAG on ingested data)
    - `What is Qdrant?` (tests Web Search)
    - `Tell me a short story.` (tests Direct Answer)
  - Observe the terminal logs to see the routing decision and which agent is executing.
  - Type `quit` or press `Ctrl+C` (or `Ctrl+D`) to exit the application.
- Input: The user types a query into the `main.py` prompt.
- Routing: `main.py` sends the query to the Router Agent. This agent (using `gemini-1.5-pro`) analyzes the query and returns a decision: use Vector Search or Web Search (or neither).
- Execution Path:
  - Vector Search: `main.py` calls the RAG Agent.
    - The RAG Agent (using `pydantic-ai`'s auto-execution) triggers its `VectorSearchKnowledgeBase` tool (`tools.py`).
    - The tool (`vector_search_tool`) gets a query embedding (`data_pipeline.py`), searches Qdrant (`vector_store.py`) for relevant chunks, and returns the formatted text context.
    - The RAG Agent receives the context and generates a response based only on that context using `gemini-1.5-flash`.
  - Web Search: `main.py` calls the Web Search Agent.
    - The Web Search Agent triggers its `WebSearchCurrentEvents` tool (`tools.py`).
    - The tool (`web_search_tool`) uses `duckduckgo-search` to get results and returns formatted snippets.
    - The Web Search Agent receives the snippets and generates a response based only on those snippets using `gemini-1.5-flash`.
  - Direct Answer: `main.py` temporarily reconfigures the Web Search Agent (disabling tools, changing the prompt) and runs it directly to get a general-knowledge answer from `gemini-1.5-flash`.
- Output: `main.py` displays the final generated answer to the user.
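The flow above boils down to an async dispatch on the router's decision. A stdlib sketch with stub coroutines standing in for the real agents (names like `run_rag_agent` are illustrative assumptions, not the project's actual functions):

```python
import asyncio

# Stub handlers standing in for the real PydanticAI agents.
async def run_rag_agent(query):
    return f"[RAG answer from Qdrant context] {query}"

async def run_web_agent(query):
    return f"[Web answer from DuckDuckGo snippets] {query}"

async def run_direct(query):
    return f"[Direct Gemini answer] {query}"

HANDLERS = {
    "vector_search": run_rag_agent,
    "web_search": run_web_agent,
    "direct_answer": run_direct,
}

async def answer(query, route):
    """Dispatch the query to the agent chosen by the Router Agent,
    falling back to a direct answer for unknown routes."""
    handler = HANDLERS.get(route, run_direct)
    return await handler(query)

result = asyncio.run(answer("What is Qdrant?", "web_search"))
print(result)
```

Keeping the route-to-handler mapping in one dictionary makes it easy to add new execution paths later (e.g., a hybrid-search agent).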
- `.env` File: Primarily for secrets and environment-specific URLs.
  - `GOOGLE_API_KEY`: Required.
  - `QDRANT_URL`: Optional; defaults to `http://localhost:6333`.
  - `QDRANT_API_KEY`: Optional; needed for authenticated Qdrant instances.
- `config.py`: For application constants and settings derived from `.env` or defaults.
  - `*_MODEL_NAME`: Specifies which Google Gemini models to use for the different agents. Ensure these models are available to your API key.
  - `EMBEDDING_MODEL_NAME`: Google embedding model to use.
  - `QDRANT_COLLECTION_NAME`, `VECTOR_SIZE`, `METRIC`: Qdrant configuration. Ensure `VECTOR_SIZE` matches the `EMBEDDING_MODEL_NAME`.
  - `CHUNK_TOKEN_SIZE`, `CHUNK_OVERLAP`: Text-splitting parameters.
  - `RAG_TOP_K`, `WEB_SEARCH_MAX_RESULTS`: Retrieval limits.
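For orientation, a `config.py` along these lines could look like the following. The concrete values (model names, vector size 768, collection name) are assumptions for illustration only; check them against the actual file and your embedding model's output dimension:

```python
import os

# Model selection (assumed names; verify availability for your API key).
ROUTER_MODEL_NAME = "gemini-1.5-pro-latest"
AGENT_MODEL_NAME = "gemini-1.5-flash-latest"
EMBEDDING_MODEL_NAME = "models/text-embedding-004"

# Qdrant settings; VECTOR_SIZE must match the embedding model's output size.
QDRANT_URL = os.getenv("QDRANT_URL", "http://localhost:6333")
QDRANT_API_KEY = os.getenv("QDRANT_API_KEY")  # None for unauthenticated local use
QDRANT_COLLECTION_NAME = "knowledge_base"
VECTOR_SIZE = 768
METRIC = "Cosine"

# Text splitting and retrieval limits.
CHUNK_TOKEN_SIZE = 512
CHUNK_OVERLAP = 50
RAG_TOP_K = 3
WEB_SEARCH_MAX_RESULTS = 5
```

Reading `QDRANT_URL` and `QDRANT_API_KEY` via `os.getenv` with defaults is what lets `.env` override them without code changes.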
- `400 API key not valid` / embedding errors: Double-check the `GOOGLE_API_KEY` in `.env`. Verify it's active and has the "Generative Language API" enabled in Google Cloud Console. Check the project's billing status.
- `ImportError` / `ModuleNotFoundError`: Ensure your virtual environment is activated and you've run `pip install -r requirements.txt`. Check for typos in import statements.
- Qdrant connection error (`ConnectionRefusedError`): Make sure your Qdrant Docker container (or service) is running. Verify the `QDRANT_URL` in `.env` or `config.py` matches Qdrant's accessible address. Check firewalls.
- `pydantic_ai` errors (unknown model, etc.): Usually a compatibility issue between the `pydantic-ai` version and how models/features are used. Ensure `requirements.txt` installs a compatible set of libraries. The current code passes plain model-name strings (`gemini-1.5-pro-latest`) to the `Agent`, which works with `pydantic-ai` v0.1.x.
- `AsyncDDGS` `ImportError`: The installed `duckduckgo-search` version might not have this class. The current `tools.py` uses the synchronous `DDGS` in an executor, which is more compatible.
- More Robust Error Handling: Add more specific `try...except` blocks around API calls and tool executions.
- Support Other Data Sources: Extend `data_pipeline.py` and `ingest_data.py` to handle PDFs, TXT files, databases, etc.
- Advanced RAG: Implement query transformations, hybrid search (keyword + vector), and re-ranking of results.
- Source Filtering: Modify RAG/vector search to allow filtering by the original `url_source` stored in the payload.
- Multiple Qdrant Collections: Use separate collections for distinct knowledge bases.
- Streaming Responses: Modify agents to yield tokens for a more interactive chat experience.
- User Interface: Build a web UI (e.g., using Flask, Streamlit, or Gradio) instead of the console interface.
- Asynchronous Tools: Use truly async versions of tools (like web search) where libraries support it, instead of `run_in_executor`.
- State Management: Implement more sophisticated conversation-history management.
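The streaming-responses idea above can be prototyped with an async generator that yields tokens as they arrive. In this sketch the "stream" is simulated from a fixed string; a real implementation would consume the model client's streaming API instead:

```python
import asyncio

async def stream_tokens(text, delay=0.0):
    """Simulated token stream: yield one word at a time, the way a
    streaming LLM client yields partial completions."""
    for word in text.split():
        await asyncio.sleep(delay)  # stand-in for network latency
        yield word

async def main():
    collected = []
    async for token in stream_tokens("Qdrant is a vector database"):
        collected.append(token)  # a UI would print each token immediately
    return " ".join(collected)

print(asyncio.run(main()))
```

Because the consumer sees each token as soon as it is yielded, the chat UI can render partial answers instead of waiting for the full response.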
