Enhanced Self-RAG: Multi-hop Reasoning with Adaptive Hybrid Retrieval for Knowledge-Intensive Question Answering
Traditional RAG has a known limitation: with a small top-k, the passage you need may not be retrieved at all. Increasing k broadens the retrieval scope, but the expected answer can still fail to surface near the top.
For example, in a medical corpus with k = 6, “migraine” did not appear at all. Raising k to 10 brought “migraine” in, but only at rank 7 rather than at the top. The relevance problem in traditional RAG is not just a matter of how many vectors are embedded or the fixed retrieval size; it also stems from limited contextual understanding and an overreliance on pure semantic similarity.
These problems were addressed by Self-RAG (Self-Reflective Retrieval-Augmented Generation) [https://github.com/AkariAsai/self-rag] by Asai et al., a framework where the model dynamically decides whether to retrieve, generates responses, and then self-reflects using special “reflection tokens” like ISREL (is relevant) or ISSUP (is supported), guiding whether the answer is acceptable or needs revision.
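Concretely, the reflection step can be thought of as parsing critique tokens out of the model's output. A minimal sketch follows; the bracketed `[ISREL=...]`/`[ISSUP=...]` spelling is an assumption for illustration, not the actual special-token vocabulary of the Self-RAG model.

```python
import re

def parse_reflection(output: str):
    """Parse Self-RAG style reflection tokens from a generation.
    NOTE: the [ISREL=...] / [ISSUP=...] format here is a hypothetical
    stand-in for the special tokens the real Self-RAG model emits."""
    tokens = dict(re.findall(r"\[(ISREL|ISSUP)=([^\]]+)\]", output))
    is_relevant = tokens.get("ISREL", "").strip().lower() == "relevant"
    is_supported = "supported" in tokens.get("ISSUP", "").lower()
    return is_relevant, is_supported

parse_reflection("[ISREL=Relevant] Migraines ... [ISSUP=Fully supported]")
# → (True, True)
```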
While Self-RAG is a significant improvement over vanilla RAG, it still has limitations, particularly in handling multi-step, multi-fact queries, and could benefit from further enhancement in this area.
This repository addresses that by implementing a Self-Retrieval-Augmented Generation (Self-RAG) pipeline with multi-hop reasoning capabilities, combining dense, sparse, and hybrid retrieval methods. The system dynamically adapts its retrieval strategy to query complexity, verifies evidence relevance, and synthesizes answers across multiple reasoning hops.
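At a high level, the multi-hop loop can be sketched as below. The `decompose`, `retrieve`, and `synthesize` callables are placeholder interfaces for illustration, not functions from this repository.

```python
def multihop_answer(question, decompose, retrieve, synthesize, max_hops=3):
    """Sketch of a multi-hop RAG loop (hypothetical interface):
    split the question into sub-questions, retrieve evidence for each
    hop, then synthesize one answer over the pooled evidence."""
    evidence = []
    for sub_q in decompose(question)[:max_hops]:
        evidence.extend(retrieve(sub_q))
    return synthesize(question, evidence)

# Toy usage with stub components:
answer = multihop_answer(
    "Who composed the opera that premiered in 1905?",
    decompose=lambda q: [q],                     # no real decomposition
    retrieve=lambda q: [f"passage about: {q}"],  # canned evidence
    synthesize=lambda q, ev: f"{len(ev)} passages considered",
)
# → "1 passages considered"
```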
- Dense Retrieval using OpenAI `text-embedding-3-small` and FAISS
- Sparse Retrieval using TF-IDF
- Hybrid Retrieval with Reciprocal Rank Fusion (RRF)
- Adaptive Retrieval Control: Dynamically adjusts number of documents retrieved per hop
- Self-RAG Query Expansion to improve recall
- LLM-based Support Verification: Ensures retrieved passages actually support candidate answers
- Multihop Reasoning Engine for complex queries requiring multiple evidence sources
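Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the input rankings (commonly with k = 60), so a document that ranks well in several lists beats one that tops only a single list. A minimal, self-contained sketch; the function name and document ids are illustrative, not from the repository:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of doc ids.
    score(d) = sum over lists of 1 / (k + rank_of_d), ranks start at 1;
    a document absent from a list contributes nothing for that list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["d3", "d1", "d7"]   # e.g. from FAISS
sparse_hits = ["d1", "d5", "d3"]  # e.g. from TF-IDF
rrf_fuse([dense_hits, sparse_hits])
# → ['d1', 'd3', 'd5', 'd7']
```

Note that `d1` wins even though it tops only the sparse list: appearing in both rankings outweighs a single first place.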
```
├── enhanced_self_rag.py   # Main execution script
├── README.md              # Project documentation
```
- Clone the repository:

  ```bash
  git clone https://github.com/rochiey/ehanced-self-rag.git
  cd <repo-name>
  ```
- Install dependencies:

  ```bash
  pip install openai faiss-cpu scikit-learn numpy python-dotenv
  ```
- Set up environment variables: create a `.env` file in the project root:

  ```
  OPENAI_API_KEY=your_openai_api_key_here
  ```

  Alternatively, you can hard-code your API key directly in the script, though keeping it in `.env` is safer.
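For reference, the core behavior of `python-dotenv` amounts to roughly the following stdlib-only stand-in (a sketch of what `load_dotenv()` does, not the library's actual implementation):

```python
import os

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    lines from a .env file into os.environ, skipping blanks and comments.
    Existing environment variables are not overwritten."""
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```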
- Run the main script:

  ```bash
  python enhanced_self_rag.py
  ```

The script will:
- Load a predefined knowledge base (e.g., music, medical, history)
- Build FAISS and TF-IDF indexes
- Retrieve relevant passages using dense, sparse, or hybrid retrieval
- Adapt retrieval size dynamically based on question complexity
- Perform multihop reasoning to gather evidence
- Verify evidence relevance
- Generate a final, citation-backed answer
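The adaptive retrieval-size step can be illustrated with a simple heuristic: widen k when the question contains multi-hop cues. The function below and its cue list are hypothetical, not the repository's actual logic.

```python
def adaptive_k(question, base_k=4, step=2, max_k=12):
    """Hypothetical heuristic for adaptive retrieval size: start from
    base_k and widen by `step` per multi-hop cue (conjunctions,
    comparisons, causal words), capped at max_k."""
    cues = {"and", "both", "compare", "difference", "before", "after",
            "because", "versus"}
    hits = sum(1 for w in question.lower().split() if w.strip("?,.") in cues)
    return min(max_k, base_k + step * hits)

adaptive_k("What is a migraine?")                     # → 4
adaptive_k("Compare migraine and tension headaches")  # → 8
```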
Dependencies: `openai`, `faiss-cpu`, `scikit-learn`, `numpy`, `python-dotenv`
- Datasets: Replace the `docs` list in `enhanced_self_rag.py` with your own corpus.
- Embedding Model: Change the embedding model in `DenseIndex` to another OpenAI embedding model.
- Retrieval Method: Switch between `dense`, `sparse`, or `hybrid` via the `retrieval_method` parameter.
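For example, the sparse path can be reproduced standalone with scikit-learn (already a dependency); the toy corpus and function below are illustrative, not the repository's code:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Migraine attacks often respond to triptans.",
    "Tension headaches are the most common headache type.",
    "Jazz emerged in New Orleans in the early 20th century.",
]
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(docs)  # one TF-IDF row per doc

def sparse_search(query, top_k=2):
    """Rank docs by cosine similarity between TF-IDF vectors."""
    sims = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return sims.argsort()[::-1][:top_k].tolist()  # indices, best first

sparse_search("migraine treatment")  # doc 0 ranks first
```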
This project is licensed under the MIT License.