
Rag-GEN-AI


A generalized Retrieval-Augmented Generation (RAG) framework that integrates multiple PDF knowledge sources with local LLMs via Weaviate vector search and DSPy chain-of-thought reasoning.


Overview

Rag-GEN-AI is an abstract, modular class designed to integrate multiple information sources, primarily large PDF datasets, into the knowledge base of a local Large Language Model (LLM). It implements a full RAG pipeline built on:

  • Weaviate for semantic vector search and document storage
  • DSPy for structured chain-of-thought prompt engineering
  • Ollama for running local LLM inference (e.g., dolphin-llama3)
  • PyMuPDF for PDF text extraction
  • TextBlob for response quality evaluation via sentiment analysis

Architecture

┌────────────────────────────────────────────────────────┐
│                      User Query                        │
└────────────────────────┬───────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────┐
│                  GeneralizedRAG                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │  WeaviateRM (Retriever)                          │  │
│  │  - Semantic near-text search                     │  │
│  │  - Top-K document chunk retrieval                │  │
│  └─────────────────────┬────────────────────────────┘  │
│                        │                               │
│                        ▼                               │
│  ┌──────────────────────────────────────────────────┐  │
│  │  CustomRAG (DSPy Module)                         │  │
│  │  - ChainOfThought reasoning                      │  │
│  │  - Technical detail enrichment                   │  │
│  │  - Self-reflection & quality scoring             │  │
│  │  - Keyword matching + sentiment analysis         │  │
│  └─────────────────────┬────────────────────────────┘  │
│                        │                               │
│                        ▼                               │
│  ┌──────────────────────────────────────────────────┐  │
│  │  Response (with relevance-adjusted output)       │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────┐
│               Infrastructure Layer                     │
│  ┌─────────────┐  ┌───────────────┐  ┌──────────────┐  │
│  │  Weaviate   │  │ Contextionary │  │ Ollama (LLM) │  │
│  │  (Docker)   │  │   (Docker)    │  │   (Local)    │  │
│  └─────────────┘  └───────────────┘  └──────────────┘  │
└────────────────────────────────────────────────────────┘

Features

  • Multi-PDF Ingestion β€” Process and index multiple PDF documents in parallel using concurrent.futures
  • Semantic Vector Search β€” Leverage Weaviate's text2vec-contextionary for contextually relevant document retrieval
  • Chain-of-Thought Reasoning β€” DSPy-powered structured prompting for detailed, explainable answers
  • Self-Reflective Quality Control β€” Automatic response evaluation using keyword matching and sentiment analysis with configurable thresholds
  • Fully Local Pipeline β€” Run everything on your own hardware with Ollama β€” no API keys or cloud dependencies
  • Modular Architecture β€” Easily swap models, retrieval backends, or evaluation strategies
  • Docker-Composed Infrastructure β€” One-command setup for Weaviate + Contextionary

Prerequisites

Tool             Version     Purpose
Python           >= 3.10     Runtime
Docker           >= 20.10    Container runtime
Docker Compose   >= 1.29     Service orchestration
Ollama           latest      Local LLM inference
Git              >= 2.0      Version control

Installation

1. Clone the Repository

git clone https://github.com/your-username/Rag-GEN-AI.git
cd Rag-GEN-AI

2. Start Weaviate Services

docker-compose up -d

This launches:

  • Weaviate on http://localhost:8080 (REST) and localhost:50051 (gRPC)
  • Contextionary on localhost:9999 for text vectorization

3. Install Python Dependencies

pip install weaviate-client pymupdf dspy-ai textblob ollama

Or with a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install weaviate-client pymupdf dspy-ai textblob ollama

4. Pull the LLM Model via Ollama

ollama pull dolphin-llama3

5. Verify Weaviate is Running

curl http://localhost:8080/v1/.well-known/ready

Usage

Single PDF Source

from generalizedRagAgg import GeneralizedRAG

pdf_paths = ["path/to/tax_code.pdf"]

rag = GeneralizedRAG(
    model_name="TaxModel",
    model_input="dolphin-llama3",
    pdf_source_files=pdf_paths
)

answer = rag.ask_question("What are the tax implications of business expenses?")
print(answer)

Multiple PDF Sources (Aggregated Knowledge Base)

from generalizedRagAgg import GeneralizedRAG

pdf_paths = [
    "path/to/texas_business_law.pdf",
    "path/to/blacks_law_dictionary.pdf",
    "path/to/real_estate_law.pdf",
    "path/to/irs_code.pdf",
    "path/to/tax_liens_investing.pdf"
]

rag = GeneralizedRAG(
    model_name="AggregateModel",
    model_input="dolphin-llama3",
    pdf_source_files=pdf_paths
)

answer = rag.ask_question(
    "What is the tax strategy for a single-member LLC taxed as an S-Corp "
    "with gross income of $145k, expenses of $35k, and salary of $30k?"
)
print(answer)

Interactive Notebook

For a step-by-step walkthrough, open the Jupyter notebook:

jupyter notebook RagModelsetupTax.ipynb

Project Structure

Rag-GEN-AI/
β”œβ”€β”€ generalizedRagAgg.py       # Core RAG pipeline (GeneralizedRAG + WeaviateRM)
β”œβ”€β”€ generalizedRagtester.py    # Chunk presence verification tool
β”œβ”€β”€ RagModelsetupTax.ipynb     # Interactive notebook walkthrough
β”œβ”€β”€ docker-compose.yaml        # Weaviate + Contextionary services
β”œβ”€β”€ usc26@118-64.pdf           # Sample dataset (U.S. Tax Code Title 26)
└── README.md                  # This file

Configuration

Weaviate Connection

# Default: localhost
connection_params = {
    "url": "http://localhost:8080",
}

Chunking Strategy

# Adjust chunk size (default: 1000 characters)
chunks = self._chunk_text(text, chunk_size=1000)
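A minimal fixed-size character chunker consistent with the snippet above might look like this. This is a sketch of the idea only; the repository's `_chunk_text` may differ, e.g. by respecting sentence or paragraph boundaries.

```python
def chunk_text(text, chunk_size=1000):
    # Split text into consecutive chunks of at most chunk_size characters.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

Smaller chunks give more precise retrieval hits; larger chunks give the LLM more surrounding context per hit.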

Retrieval Parameters

# Adjust top-K results (default: 10 stored, 5 used per query)
retriever_model = WeaviateRM(model_name, weaviate_client, k=10)
context_results = self.retriever_model.retrieve(question, top_k=5)
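To make the two parameters above concrete, here is a rough sketch of what a `WeaviateRM.retrieve` could look like with the v3 `weaviate-client` query builder. The `content` property name and the exact class layout are assumptions, not taken from the repository.

```python
class WeaviateRM:
    def __init__(self, class_name, weaviate_client, k=10):
        self.class_name = class_name
        self.client = weaviate_client
        self.k = k  # default number of chunks to fetch

    def retrieve(self, query, top_k=None):
        # Semantic near-text search via the v3 query builder;
        # "content" is the assumed text property on the class.
        limit = top_k or self.k
        result = (
            self.client.query
            .get(self.class_name, ["content"])
            .with_near_text({"concepts": [query]})
            .with_limit(limit)
            .do()
        )
        objects = result["data"]["Get"].get(self.class_name, [])
        return [obj["content"] for obj in objects]
```

Storing `k=10` but querying with `top_k=5` lets the same retriever serve both broad indexing checks and tighter per-question context windows.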

Quality Threshold

# Adjust the relevance score threshold for self-reflection (default: 0.75)
if relevance_score < 0.75:
    answer = self.improve_response(answer, context, question)

Evaluation Weights

# Adjust keyword vs. sentiment scoring weights
overall_score = 0.6 * keyword_score + 0.4 * sentiment_score
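The weighted score above could be computed along these lines. This is a simplified sketch: the keyword heuristic here is illustrative, and `sentiment_score` is passed in as a plain number, whereas the real pipeline derives it from TextBlob's sentiment analysis.

```python
def keyword_score(answer, question):
    # Fraction of substantive question words (length > 3) found in the answer.
    keywords = {w.lower() for w in question.split() if len(w) > 3}
    if not keywords:
        return 0.0
    answer_words = {w.lower() for w in answer.split()}
    return len(keywords & answer_words) / len(keywords)

def relevance_score(answer, question, sentiment_score,
                    kw_weight=0.6, sent_weight=0.4):
    # Weighted blend of keyword overlap and sentiment quality; in the real
    # pipeline sentiment_score would come from TextBlob's polarity analysis.
    return kw_weight * keyword_score(answer, question) + sent_weight * sentiment_score
```

Answers scoring below the 0.75 threshold would then be routed to `improve_response` for another pass.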

Testing

Use the included RAGModelTester to verify document chunks were properly indexed:

from generalizedRagtester import RAGModelTester
import weaviate

client = weaviate.Client(url="http://localhost:8080", timeout_config=(30, 30))

tester = RAGModelTester(client, "AggregateModel")
results = tester.test_chunk_presence([
    "Sample text from document 1",
    "Sample text from document 2",
])

for chunk, present in results.items():
    print(f"Chunk: '{chunk}' - Indexed: {present}")

API Reference

GeneralizedRAG

Method                                                Description
__init__(model_name, model_input, pdf_source_files)   Initialize the RAG pipeline with Weaviate schema, PDF ingestion, and DSPy configuration
ask_question(question: str)                           Query the RAG system and receive a chain-of-thought answer with self-reflection

WeaviateRM

Method                                     Description
__init__(class_name, weaviate_client, k)   Initialize the retrieval model targeting a Weaviate class
retrieve(query, top_k)                     Perform semantic near-text search and return the top-K document chunks

RAGModelTester

Method                                  Description
__init__(weaviate_client, model_name)   Initialize the tester for a specific Weaviate class
test_chunk_presence(test_chunks)        Verify whether given text chunks exist in the Weaviate index

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.
