Commit cf11be8

feat(retrieval): page-level BM25 + SPLADE retrieval over OCR'd reports
Self-contained PEP 723 script that indexes OCR'd .mmd reports page by page and ranks pages for a query, to measure whether a method retrieves the page containing a given KPI. BM25 + SPLADE on one pyserini/Lucene stack sharing a single docid space; ColBERTv2 deferred to a later script.
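
The evaluation idea described above can be sketched in a few lines. This is only an illustration, not the committed script: it swaps in a pure-Python BM25 for the pyserini/Lucene stack and leaves SPLADE out entirely. The page delimiter (`PAGE_BREAK`), the `<report>#<page>` docid format, the tokenizer, and the sample report text are all assumptions made for the example. The `k1=0.9, b=0.4` values mirror Anserini's BM25 defaults. The header comment shows the PEP 723 inline-metadata form, with an empty dependency list because the sketch needs only the standard library.

```python
# /// script
# requires-python = ">=3.9"
# dependencies = []
# ///
"""Illustrative sketch only: pure-Python BM25 over pages, not the committed script."""
import math
import re
from collections import Counter

PAGE_BREAK = "\f"  # hypothetical page delimiter; the real .mmd marker may differ


def tokenize(text):
    """Lowercase and keep alphanumeric runs (an assumed, simplistic analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_pages(report_id, text):
    """Return {docid: tokens}, with docids of the assumed form '<report>#<page>'."""
    pages = text.split(PAGE_BREAK)
    return {f"{report_id}#{i}": tokenize(p) for i, p in enumerate(pages, start=1)}


def bm25_rank(docs, query, k1=0.9, b=0.4):
    """Rank docids by BM25 score (Lucene-style idf), best first."""
    n_docs = len(docs)
    avg_len = sum(len(t) for t in docs.values()) / n_docs
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per page
    scores = {}
    for docid, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[docid] = score
    # Break score ties by docid so the ordering is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hit_at_k(ranking, gold_docid, k):
    """True if the page known to contain the KPI is among the top k results."""
    return gold_docid in ranking[:k]


# Made-up three-page report; page 2 is the one holding the KPI.
report = PAGE_BREAK.join([
    "Letter from the chair. A year of steady progress.",
    "Key figures. Revenue grew to 412 million. Operating margin was 18 percent.",
    "Notes on governance and board composition.",
])
docs = split_pages("acme_2023", report)
ranking = bm25_rank(docs, "operating margin")
top_docid = ranking[0]
found = hit_at_k(ranking, "acme_2023#2", k=1)
```

Keying every page by one docid string is what would let a second ranker be scored against the same gold page without any remapping, which is presumably the point of the shared docid space mentioned in the commit message.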
1 parent a43a856 commit cf11be8

1 file changed

Lines changed: 566 additions & 0 deletions
