Commit a5e3434

rex993 and claude committed
Fix score preservation in chat sources
Preserve similarity scores from ChunkSource objects when enriching sources
via batch_retrieve_chunks. Previously, scores were lost when
get_chunks_by_id returned hardcoded 0.0 scores.

- Create score mapping from original ChunkSource objects
- Apply preserved scores to retrieved chunks
- Sort chunks by score descending for consistent relevance ordering

Fixes issue where chat page showed 0.0 scores while search page showed
correct similarity scores.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent c9d4381 commit a5e3434

1 file changed

Lines changed: 18 additions & 0 deletions

core/services/document_service.py

@@ -563,6 +563,24 @@ async def batch_retrieve_chunks(
             logger.error(f"Error during parallel chunk retrieval: {e}", exc_info=True)
             return []
 
+        # Create a mapping of original scores from ChunkSource objects (O(n) time)
+        score_map = {
+            (source.document_id, source.chunk_number): source.score
+            for source in authorized_sources
+            if source.score is not None
+        }
+
+        # Apply original scores to the retrieved chunks (O(m) time with O(1) lookups)
+        for chunk in chunks:
+            key = (chunk.document_id, chunk.chunk_number)
+            if key in score_map:
+                chunk.score = score_map[key]
+                logger.debug(f"Restored score {chunk.score} for chunk {key}")
+
+        # Sort chunks by score in descending order (highest score first)
+        chunks.sort(key=lambda x: x.score, reverse=True)
+        logger.debug(f"Sorted {len(chunks)} chunks by score")
+
         # Convert to chunk results
         results = await self._create_chunk_results(auth, chunks)
         logger.info(f"Batch retrieved {len(results)} chunks out of {len(chunk_ids)} requested")
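
The score-restoration step in this hunk can be tried in isolation. The sketch below is not the project's code: the `ChunkSource` and `Chunk` dataclasses and the `restore_scores` helper are hypothetical stand-ins that assume only the three attributes the diff touches (`document_id`, `chunk_number`, `score`), and that retrieved chunks arrive with a placeholder score of 0.0 as the commit message describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChunkSource:
    """Hypothetical stand-in for the source objects that carry search scores."""
    document_id: str
    chunk_number: int
    score: Optional[float] = None


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk with a placeholder score."""
    document_id: str
    chunk_number: int
    content: str
    score: float = 0.0


def restore_scores(sources: List[ChunkSource], chunks: List[Chunk]) -> List[Chunk]:
    """Copy known scores onto chunks, then sort highest score first."""
    # Build the lookup once so each chunk needs only a dictionary access.
    score_map = {
        (s.document_id, s.chunk_number): s.score
        for s in sources
        if s.score is not None
    }
    for chunk in chunks:
        key = (chunk.document_id, chunk.chunk_number)
        if key in score_map:
            chunk.score = score_map[key]
    # Chunks without a known score keep 0.0 and therefore sort last.
    chunks.sort(key=lambda c: c.score, reverse=True)
    return chunks


sources = [
    ChunkSource("doc-a", 0, 0.42),
    ChunkSource("doc-b", 3, 0.91),
    ChunkSource("doc-c", 1, None),  # no score available for this source
]
chunks = [
    Chunk("doc-a", 0, "alpha"),
    Chunk("doc-b", 3, "beta"),
    Chunk("doc-c", 1, "gamma"),
]
ranked = restore_scores(sources, chunks)
print([(c.document_id, c.score) for c in ranked])
# prints [('doc-b', 0.91), ('doc-a', 0.42), ('doc-c', 0.0)]
```

Keying the map on the `(document_id, chunk_number)` tuple mirrors the diff. Note that the sort relies on every chunk having a numeric score; a chunk whose `score` is `None` would raise a `TypeError` when compared, so this sketch gives `Chunk.score` a 0.0 default.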
