Enhanced Self-RAG: Multi-hop Reasoning with Adaptive Hybrid Retrieval for Knowledge-Intensive Question Answering
Traditional RAG has a known limitation: with a small top-k, the passage you need may not be retrieved at all. Increasing k broadens the retrieval scope, but the expected answer can still fail to surface near the top.
For example, in a medical corpus with k = 6, “migraine” did not appear at all. Raising k to 10 brought “migraine” in, but only at rank 7 rather than at the top. The relevance problem in traditional RAG is not just a matter of how many vectors are embedded or the fixed retrieval size; it also stems from limited contextual understanding and an overreliance on pure semantic similarity.
These problems were addressed by Self-RAG (Self-Reflective Retrieval-Augmented Generation) [https://github.com/AkariAsai/self-rag] by Asai et al., a framework where the model dynamically decides whether to retrieve, generates responses, and then self-reflects using special “reflection tokens” like ISREL (is relevant) or ISSUP (is supported), guiding whether the answer is acceptable or needs revision.
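Concretely, the reflection step can be thought of as parsing critique tokens out of the model's output. A minimal sketch follows; the bracketed `[ISREL=...]`/`[ISSUP=...]` spelling is an assumption for illustration, not the actual special-token vocabulary of the Self-RAG model.

```python
import re

def parse_reflection(output: str):
    """Parse Self-RAG style reflection tokens from a generation.
    NOTE: the [ISREL=...] / [ISSUP=...] format here is a hypothetical
    stand-in for the special tokens the real Self-RAG model emits."""
    tokens = dict(re.findall(r"\[(ISREL|ISSUP)=([^\]]+)\]", output))
    is_relevant = tokens.get("ISREL", "").strip().lower() == "relevant"
    is_supported = "supported" in tokens.get("ISSUP", "").lower()
    return is_relevant, is_supported

parse_reflection("[ISREL=Relevant] Migraines ... [ISSUP=Fully supported]")
# → (True, True)
```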
While Self-RAG is a significant improvement over vanilla RAG, it still has limitations, particularly in handling multi-step, multi-fact queries, and could benefit from further enhancement in this area.
This repository addresses that by implementing a Self-Retrieval-Augmented Generation (Self-RAG) pipeline with multi-hop reasoning capabilities, combining dense, sparse, and hybrid retrieval methods. The system dynamically adapts its retrieval strategy to query complexity, verifies evidence relevance, and synthesizes answers across multiple reasoning hops.
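At a high level, the multi-hop loop can be sketched as below. The `decompose`, `retrieve`, and `synthesize` callables are placeholder interfaces for illustration, not functions from this repository.

```python
def multihop_answer(question, decompose, retrieve, synthesize, max_hops=3):
    """Sketch of a multi-hop RAG loop (hypothetical interface):
    split the question into sub-questions, retrieve evidence for each
    hop, then synthesize one answer over the pooled evidence."""
    evidence = []
    for sub_q in decompose(question)[:max_hops]:
        evidence.extend(retrieve(sub_q))
    return synthesize(question, evidence)

# Toy usage with stub components:
answer = multihop_answer(
    "Who composed the opera that premiered in 1905?",
    decompose=lambda q: [q],                     # no real decomposition
    retrieve=lambda q: [f"passage about: {q}"],  # canned evidence
    synthesize=lambda q, ev: f"{len(ev)} passages considered",
)
# → "1 passages considered"
```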
- Dense Retrieval using OpenAI `text-embedding-3-small` and FAISS
- Sparse Retrieval using TF-IDF
- Hybrid Retrieval with Reciprocal Rank Fusion (RRF)
- Adaptive Retrieval Control: Dynamically adjusts number of documents retrieved per hop
- Self-RAG Query Expansion to improve recall
- LLM-based Support Verification: Ensures retrieved passages actually support candidate answers
- Multihop Reasoning Engine for complex queries requiring multiple evidence sources
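Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the input rankings (commonly with k = 60), so a document that ranks well in several lists beats one that tops only a single list. A minimal, self-contained sketch; the function name and document ids are illustrative, not from the repository:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of doc ids.
    score(d) = sum over lists of 1 / (k + rank_of_d), ranks start at 1;
    a document absent from a list contributes nothing for that list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["d3", "d1", "d7"]   # e.g. from FAISS
sparse_hits = ["d1", "d5", "d3"]  # e.g. from TF-IDF
rrf_fuse([dense_hits, sparse_hits])
# → ['d1', 'd3', 'd5', 'd7']
```

Note that `d1` wins even though it tops only the sparse list: appearing in both rankings outweighs a single first place.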
```
├── enhanced_self_rag.py   # Main execution script
├── README.md              # Project documentation
```
- Clone the repository:

  ```bash
  git clone https://github.com/rochiey/ehanced-self-rag.git
  cd <repo-name>
  ```
- Install dependencies:

  ```bash
  pip install openai faiss-cpu scikit-learn numpy python-dotenv
  ```
- Set up environment variables: create a `.env` file in the project root:

  ```
  OPENAI_API_KEY=your_openai_api_key_here
  ```

  Alternatively, you can hard-code your API key directly in the script, though keeping it in `.env` is safer.
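For reference, the core behavior of `python-dotenv` amounts to roughly the following stdlib-only stand-in (a sketch of what `load_dotenv()` does, not the library's actual implementation):

```python
import os

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv(): read KEY=VALUE
    lines from a .env file into os.environ, skipping blanks and comments.
    Existing environment variables are not overwritten."""
    if not os.path.exists(path):
        return
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```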
- Run the main script:

  ```bash
  python enhanced_self_rag.py
  ```

The script will:
- Load a predefined knowledge base (e.g., music, medical, history)
- Build FAISS and TF-IDF indexes
- Retrieve relevant passages using dense, sparse, or hybrid retrieval
- Adapt retrieval size dynamically based on question complexity
- Perform multihop reasoning to gather evidence
- Verify evidence relevance
- Generate a final, citation-backed answer
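The adaptive retrieval-size step can be illustrated with a simple heuristic: widen k when the question contains multi-hop cues. The function below and its cue list are hypothetical, not the repository's actual logic.

```python
def adaptive_k(question, base_k=4, step=2, max_k=12):
    """Hypothetical heuristic for adaptive retrieval size: start from
    base_k and widen by `step` per multi-hop cue (conjunctions,
    comparisons, causal words), capped at max_k."""
    cues = {"and", "both", "compare", "difference", "before", "after",
            "because", "versus"}
    hits = sum(1 for w in question.lower().split() if w.strip("?,.") in cues)
    return min(max_k, base_k + step * hits)

adaptive_k("What is a migraine?")                     # → 4
adaptive_k("Compare migraine and tension headaches")  # → 8
```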
Dependencies: `openai`, `faiss-cpu`, `scikit-learn`, `numpy`, `python-dotenv`
- Datasets: Replace the `docs` list in `enhanced_self_rag.py` with your own corpus.
- Embedding Model: Change the embedding model in `DenseIndex` to another OpenAI embedding model.
- Retrieval Method: Switch between `dense`, `sparse`, or `hybrid` via the `retrieval_method` parameter.
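For example, the sparse path can be reproduced standalone with scikit-learn (already a dependency); the toy corpus and function below are illustrative, not the repository's code:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Migraine attacks often respond to triptans.",
    "Tension headaches are the most common headache type.",
    "Jazz emerged in New Orleans in the early 20th century.",
]
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(docs)  # one TF-IDF row per doc

def sparse_search(query, top_k=2):
    """Rank docs by cosine similarity between TF-IDF vectors."""
    sims = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return sims.argsort()[::-1][:top_k].tolist()  # indices, best first

sparse_search("migraine treatment")  # doc 0 ranks first
```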
This project is licensed under the MIT License.