feat(retrieval): implement Hybrid Search (Vector + BM25) and ColBERT Reranking#56

Open
swamy18 wants to merge 11 commits into feld-m:main from swamy18:main

swamy18 commented Dec 19, 2025

This pull request adds Hybrid Search (Vector + BM25) and ColBERT reranking to the retrieval pipeline, along with production-ready configurations for both components and a new Chainlit chat interface.

Key Changes:

  • Hybrid Retriever: Implemented a QueryFusionRetriever combining Vector search (ChromaDB) and BM25 for improved relevance.
  • ColBERT Reranking: Added ColBERT reranker to the retrieval pipeline for more precise document ranking.
  • Factory Pattern: Extended the existing registry/factory pattern for seamless integration of new retrieval components.
  • Main App: Created src/augmentation/app.py using Chainlit for a chat-based user interface.
  • Production Config: Added Pydantic-based configurations for Hybrid and ColBERT components.
  • CI/CD: Enhanced GitHub Actions workflow with mypy type checking.
  • Documentation: Updated README.md with detailed instructions on how to run the new hybrid search and app.
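
The description does not say which fusion mode the QueryFusionRetriever uses. As a rough illustration only, assuming reciprocal rank fusion, combining a vector result list with a BM25 result list could look like the sketch below. The function name, the document IDs, and the constant `k=60` are hypothetical and are not taken from this PR.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents returned by both retrievers accumulate score from each list, so they tend to rise above documents found by only one. A reranker such as ColBERT would then reorder the top of this fused list.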

### How to use:
```bash
python -m src.augmentation.app --env hybrid
```

This factory class implements the Factory design pattern to create a hybrid retriever component that uses Query Fusion to combine multiple retrieval results.
This class defines the configuration parameters needed for initializing and operating the hybrid retriever, extending the base RetrieverConfiguration.
Add hybrid retrieval configuration with various settings.
…l and reranking featureses and usage instructions

Removed duplicate features and added new usage instructions for Hybrid Retrieval.
…ration for HybridRetrieverridRetrieverFactory

Refactor HybridRetrieverFactory to include BM25Retriever and update configuration handling.
…wkflow

Added a step for type checking using mypy.
feat(augmentation): add main app entry point for Chainlit UI
Updated README to enhance formatting and clarify usage instructions.
docs: update README with correct CLI usage and hybrid search features