# Rag-GEN-AI

A generalized Retrieval-Augmented Generation (RAG) framework that integrates multiple PDF knowledge sources with local LLMs via Weaviate vector search and DSPy chain-of-thought reasoning.

## Table of Contents

- Overview
- Architecture
- Features
- Prerequisites
- Installation
- Usage
- Project Structure
- Configuration
- Testing
- API Reference
- Contributing
- Sources
- License
## Overview

Rag-GEN-AI is an abstract, modular class designed to integrate multiple information sources (primarily large PDF datasets) into the knowledge base of a local Large Language Model (LLM). It implements a full RAG pipeline leveraging:

- Weaviate for semantic vector search and document storage
- DSPy for structured chain-of-thought prompt engineering
- Ollama for running local LLM inference (e.g., `dolphin-llama3`)
- PyMuPDF for PDF text extraction
- TextBlob for response quality evaluation via sentiment analysis
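Before ingestion, extracted PDF text is split into fixed-size chunks for indexing. The sketch below is a hypothetical mirror of that step (the real `_chunk_text` lives in `generalizedRagAgg.py` and may differ, e.g. in overlap handling):

```python
# Hypothetical sketch of fixed-size character chunking prior to indexing
# into Weaviate; the real implementation is _chunk_text in generalizedRagAgg.py.

def chunk_text(text: str, chunk_size: int = 1000) -> list[str]:
    """Split extracted PDF text into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = chunk_text("A" * 2500, chunk_size=1000)
print(len(chunks))       # 3 chunks: 1000 + 1000 + 500 characters
print(len(chunks[-1]))   # 500
```

Fixed-size character chunks keep each Weaviate object small enough for the `text2vec-contextionary` vectorizer while preserving enough context for retrieval.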
## Architecture

```
┌───────────────────────────────────────────────────────────┐
│                        User Query                         │
└─────────────────────────────┬─────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│                      GeneralizedRAG                       │
│  ┌─────────────────────────────────────────────────────┐  │
│  │ WeaviateRM (Retriever)                              │  │
│  │   - Semantic near-text search                       │  │
│  │   - Top-K document chunk retrieval                  │  │
│  └──────────────────────────┬──────────────────────────┘  │
│                             │                             │
│                             ▼                             │
│  ┌─────────────────────────────────────────────────────┐  │
│  │ CustomRAG (DSPy Module)                             │  │
│  │   - ChainOfThought reasoning                        │  │
│  │   - Technical detail enrichment                     │  │
│  │   - Self-reflection & quality scoring               │  │
│  │   - Keyword matching + sentiment analysis           │  │
│  └──────────────────────────┬──────────────────────────┘  │
│                             │                             │
│                             ▼                             │
│  ┌─────────────────────────────────────────────────────┐  │
│  │ Response (with relevance-adjusted output)           │  │
│  └─────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────┘

┌───────────────────────────────────────────────────────────┐
│                   Infrastructure Layer                    │
│  ┌─────────────┐  ┌───────────────┐  ┌────────────────┐   │
│  │  Weaviate   │  │ Contextionary │  │  Ollama (LLM)  │   │
│  │  (Docker)   │  │   (Docker)    │  │    (Local)     │   │
│  └─────────────┘  └───────────────┘  └────────────────┘   │
└───────────────────────────────────────────────────────────┘
```
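The control flow in the diagram can be approximated with plain-Python stand-ins. Everything below is a toy sketch: `retrieve`, `generate`, and `relevance` are hypothetical stubs for the real Weaviate near-text search, DSPy `ChainOfThought` call, and keyword/sentiment scorer.

```python
# Toy stand-ins for the pipeline stages shown above; the real classes use
# Weaviate near-text search, DSPy ChainOfThought, and TextBlob sentiment.

def retrieve(question: str, top_k: int = 5) -> list[str]:
    # Stub: real code performs semantic near-text search against Weaviate.
    corpus = ["Business expenses are deductible.", "LLCs may elect S-Corp status."]
    return corpus[:top_k]

def generate(question: str, context: list[str]) -> str:
    # Stub: real code invokes the local LLM via DSPy chain-of-thought.
    return f"Based on {len(context)} passages: ..."

def relevance(question: str, answer: str) -> float:
    # Stub: real code combines keyword matching with sentiment analysis.
    return 0.9

def ask_question(question: str) -> str:
    context = retrieve(question)
    answer = generate(question, context)
    if relevance(question, answer) < 0.75:  # self-reflection threshold
        answer = generate(question + " (be more specific)", context)
    return answer

print(ask_question("Are business expenses deductible?"))
```

The point of the sketch is the shape of the loop: retrieve, generate, score, and regenerate only when the score falls below the threshold.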
## Features

- **Multi-PDF Ingestion** – Process and index multiple PDF documents in parallel using `concurrent.futures`
- **Semantic Vector Search** – Leverage Weaviate's `text2vec-contextionary` for contextually relevant document retrieval
- **Chain-of-Thought Reasoning** – DSPy-powered structured prompting for detailed, explainable answers
- **Self-Reflective Quality Control** – Automatic response evaluation using keyword matching and sentiment analysis with configurable thresholds
- **Fully Local Pipeline** – Run everything on your own hardware with Ollama; no API keys or cloud dependencies
- **Modular Architecture** – Easily swap models, retrieval backends, or evaluation strategies
- **Docker-Composed Infrastructure** – One-command setup for Weaviate + Contextionary
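The self-reflective scoring can be sketched roughly as follows. The 0.6/0.4 weighting mirrors the split used later in the Configuration section, but the `sentiment_score` here is a hypothetical stub standing in for TextBlob's polarity, and `keyword_score` is an illustrative overlap measure, not the project's exact formula.

```python
# Illustrative keyword + sentiment relevance score. Weights mirror the
# 0.6/0.4 split used by the pipeline; the sentiment function is a stub
# standing in for TextBlob(answer).sentiment.polarity.

def keyword_score(question: str, answer: str) -> float:
    """Fraction of the question's significant words that appear in the answer."""
    keywords = {w.lower() for w in question.split() if len(w) > 3}
    if not keywords:
        return 0.0
    hits = sum(1 for w in keywords if w in answer.lower())
    return hits / len(keywords)

def sentiment_score(answer: str) -> float:
    # Stand-in: map TextBlob polarity from [-1, 1] to [0, 1]; neutral here.
    return 0.5

def overall_score(question: str, answer: str) -> float:
    return 0.6 * keyword_score(question, answer) + 0.4 * sentiment_score(answer)

score = overall_score("deductible business expenses",
                      "Business expenses are deductible.")
```

When `overall_score` falls below the configured threshold, the pipeline asks the model to improve its own answer before returning it.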
## Prerequisites

| Tool | Version | Purpose |
|---|---|---|
| Python | >= 3.10 | Runtime |
| Docker | >= 20.10 | Container runtime |
| Docker Compose | >= 1.29 | Service orchestration |
| Ollama | latest | Local LLM inference |
| Git | >= 2.0 | Version control |
## Installation

### 1. Clone the repository

```bash
git clone https://github.com/your-username/Rag-GEN-AI.git
cd Rag-GEN-AI
```

### 2. Start the infrastructure

```bash
docker-compose up -d
```

This launches:

- Weaviate on `http://localhost:8080` (REST) and `localhost:50051` (gRPC)
- Contextionary on `localhost:9999` for text vectorization

### 3. Install Python dependencies

```bash
pip install weaviate-client pymupdf dspy-ai textblob ollama
```

Or with a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install weaviate-client pymupdf dspy-ai textblob ollama
```

### 4. Pull the local LLM

```bash
ollama pull dolphin-llama3
```

### 5. Verify Weaviate is ready

```bash
curl http://localhost:8080/v1/.well-known/ready
```

## Usage

### Single-source example

```python
from generalizedRagAgg import GeneralizedRAG

pdf_paths = ["path/to/tax_code.pdf"]

rag = GeneralizedRAG(
    model_name="TaxModel",
    model_input="dolphin-llama3",
    pdf_source_files=pdf_paths
)

answer = rag.ask_question("What are the tax implications of business expenses?")
print(answer)
```

### Multi-source example

```python
from generalizedRagAgg import GeneralizedRAG

pdf_paths = [
    "path/to/texas_business_law.pdf",
    "path/to/blacks_law_dictionary.pdf",
    "path/to/real_estate_law.pdf",
    "path/to/irs_code.pdf",
    "path/to/tax_liens_investing.pdf"
]

rag = GeneralizedRAG(
    model_name="AggregateModel",
    model_input="dolphin-llama3",
    pdf_source_files=pdf_paths
)

answer = rag.ask_question(
    "What is the tax strategy for a single-member LLC taxed as an S-Corp "
    "with gross income of $145k, expenses of $35k, and salary of $30k?"
)
print(answer)
```

For a step-by-step walkthrough, open the Jupyter notebook:

```bash
jupyter notebook RagModelsetupTax.ipynb
```

## Project Structure

```
Rag-GEN-AI/
├── generalizedRagAgg.py      # Core RAG pipeline (GeneralizedRAG + WeaviateRM)
├── generalizedRagtester.py   # Chunk presence verification tool
├── RagModelsetupTax.ipynb    # Interactive notebook walkthrough
├── docker-compose.yaml       # Weaviate + Contextionary services
├── usc26@118-64.pdf          # Sample dataset (U.S. Tax Code Title 26)
└── README.md                 # This file
```
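For reference, the `docker-compose.yaml` in this layout typically follows Weaviate's standard Contextionary recipe. The sketch below is an assumption, not the repo's actual file: image tags, ports, and environment variables here are illustrative and may differ from what this project pins.

```yaml
# Illustrative sketch only; see the repo's docker-compose.yaml for the
# authoritative service definitions and versions.
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.24.1   # illustrative tag
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      DEFAULT_VECTORIZER_MODULE: text2vec-contextionary
      ENABLE_MODULES: text2vec-contextionary
      CONTEXTIONARY_URL: contextionary:9999
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
  contextionary:
    image: semitechnologies/contextionary:en0.16.0-v1.2.1   # illustrative tag
    ports:
      - "9999:9999"
```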
## Configuration

### Weaviate connection

```python
# Default: localhost
connection_params = {
    "url": "http://localhost:8080",
}
```

### Chunk size

```python
# Adjust chunk size (default: 1000 characters)
chunks = self._chunk_text(text, chunk_size=1000)
```

### Retrieval depth

```python
# Adjust top-K results (default: 10 stored, 5 used per query)
retriever_model = WeaviateRM(model_name, weaviate_client, k=10)
context_results = self.retriever_model.retrieve(question, top_k=5)
```

### Self-reflection threshold

```python
# Adjust the relevance score threshold for self-reflection (default: 0.75)
if relevance_score < 0.75:
    answer = self.improve_response(answer, context, question)
```

### Scoring weights

```python
# Adjust keyword vs. sentiment scoring weights
overall_score = 0.6 * keyword_score + 0.4 * sentiment_score
```

## Testing

Use the included `RAGModelTester` to verify that document chunks were properly indexed:
```python
from generalizedRagtester import RAGModelTester
import weaviate

client = weaviate.Client(url="http://localhost:8080", timeout_config=(30, 30))

tester = RAGModelTester(client, "AggregateModel")
results = tester.test_chunk_presence([
    "Sample text from document 1",
    "Sample text from document 2",
])

for chunk, present in results.items():
    print(f"Chunk: '{chunk}' → Indexed: {present}")
```

## API Reference

### GeneralizedRAG

| Method | Description |
|---|---|
| `__init__(model_name, model_input, pdf_source_files)` | Initialize the RAG pipeline with Weaviate schema, PDF ingestion, and DSPy configuration |
| `ask_question(question: str)` | Query the RAG system and receive a chain-of-thought answer with self-reflection |

### WeaviateRM

| Method | Description |
|---|---|
| `__init__(class_name, weaviate_client, k)` | Initialize the retrieval model targeting a Weaviate class |
| `retrieve(query, top_k)` | Perform semantic near-text search and return top-K document chunks |

### RAGModelTester

| Method | Description |
|---|---|
| `__init__(weaviate_client, model_name)` | Initialize the tester for a specific Weaviate class |
| `test_chunk_presence(test_chunks)` | Verify whether given text chunks exist in the Weaviate index |
## Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## Sources

- Weaviate DSPy + Llama3 Integration Recipe
- Weaviate Documentation
- DSPy Documentation
- Ollama Documentation
## License

This project is licensed under the MIT License. See the LICENSE file for details.