fix(mirror-scan): don't cache zero when scoring data unavailable #837

Open
YB0y wants to merge 1 commit into entrius:test from YB0y:fix/mirror-scan-scoring-data-unavailable-no-cache


YB0y commented Apr 28, 2026

Summary

Closes #836.

_resolve_solving_pr_score in gittensor/validator/issue_discovery/mirror_scan.py previously treated a mirror file response with scoring_data_stored=False the same as a real scored response: it tokenized the (typically empty) file list, computed (base_score=0, token_score=0), and wrote that result into the per-cycle cross-miner solving-PR cache.

That cached zero is not a legitimate score — scoring_data_stored=False is the mirror's explicit signal that file scoring data is unavailable (file ingestion / backfill pending). Caching it poisons every later issue in the same validator round that references the same solving PR: each one hits the cache, fails the MIN_TOKEN_SCORE_FOR_BASE_SCORE gate, and earns no discovery score even after mirror backfill completes.

This also contradicted the cache contract documented at mirror_scan.py:73-76, which says failed file lookups are not cached so later lookups can retry.
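For reviewers who want to see the poisoning in isolation, here is a minimal sketch of the pre-fix behavior as described above. Everything in it is a stand-in: `FilesResponse`, `score_files`, `resolve_old`, the toy scoring formula, the gate value, and the `>=` comparison are assumptions for illustration, not the code in mirror_scan.py.

```python
from dataclasses import dataclass, field

# Stand-in for MIN_TOKEN_SCORE_FOR_BASE_SCORE; the real value is not shown in this PR.
MIN_TOKEN_SCORE = 5.0


@dataclass
class FilesResponse:
    """Hypothetical stand-in for the mirror's PR files response."""
    scoring_data_stored: bool
    files: list = field(default_factory=list)


def score_files(files):
    """Toy scorer (assumption): one token point per file, base = 2x tokens."""
    token_score = float(len(files))
    return token_score * 2.0, token_score


def resolve_old(cache, key, response):
    """Pre-fix behavior as described: score whatever came back and cache it."""
    if key in cache:
        return cache[key]
    cache[key] = score_files(response.files)
    return cache[key]


cache = {}
key = ("owner/repo", 123)  # hypothetical (repo, solving PR) cache key

# First issue in the round: the mirror has not ingested the files yet.
pending = FilesResponse(scoring_data_stored=False)
first = resolve_old(cache, key, pending)

# Second issue, same solving PR: backfill is done, but the cached zero wins.
backfilled = FilesResponse(scoring_data_stored=True, files=["a.py"] * 10)
second = resolve_old(cache, key, backfilled)

passes_gate = second[1] >= MIN_TOKEN_SCORE
print(first, second, passes_gate)  # (0.0, 0.0) (0.0, 0.0) False
```

The second lookup never sees the backfilled response because the zero written by the first lookup is already in the cache.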

Fix

After client.get_pr_files(...) returns, check the response's scoring_data_stored flag. When it's False, treat the result the same way as the existing MirrorRequestError branch: return None, increment cache_stats.fetch_failures, log enough context to identify the repo / solving PR / issue, and leave the cache untouched.
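The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the diff itself: `resolve_fixed`, `FakeClient`, `CacheStats.hits`, the `get_pr_files(repo, pr_number)` signature, the cache key shape, and the toy scoring are all hypothetical. Only the names `MirrorRequestError`, `scoring_data_stored`, and `fetch_failures` come from this PR.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("mirror_scan_sketch")


class MirrorRequestError(Exception):
    """Stand-in for the real exception of the same name."""


@dataclass
class FilesResponse:
    """Hypothetical stand-in for the mirror's PR files response."""
    scoring_data_stored: bool
    files: list = field(default_factory=list)


@dataclass
class CacheStats:
    hits: int = 0            # assumption: not necessarily a real field
    fetch_failures: int = 0


class FakeClient:
    """Test double that returns queued responses in order."""
    def __init__(self, responses):
        self._responses = list(responses)

    def get_pr_files(self, repo, pr_number):  # signature is an assumption
        return self._responses.pop(0)


def resolve_fixed(client, cache, stats, repo, pr_number, issue_number):
    key = (repo, pr_number)
    if key in cache:
        stats.hits += 1
        return cache[key]
    try:
        response = client.get_pr_files(repo, pr_number)
    except MirrorRequestError:
        stats.fetch_failures += 1
        return None
    if not response.scoring_data_stored:
        # Availability signal: count it as a failed fetch and do NOT cache.
        stats.fetch_failures += 1
        logger.warning(
            "scoring data unavailable: repo=%s solving_pr=%s issue=%s",
            repo, pr_number, issue_number,
        )
        return None
    token_score = float(len(response.files))  # toy scorer, not the real tokenizer
    cache[key] = (token_score * 2.0, token_score)
    return cache[key]


client = FakeClient([FilesResponse(False), FilesResponse(True, ["a.py"] * 3)])
cache, stats = {}, CacheStats()

first = resolve_fixed(client, cache, stats, "owner/repo", 123, issue_number=1)
cache_after_first = dict(cache)  # still empty, so the next lookup retries
second = resolve_fixed(client, cache, stats, "owner/repo", 123, issue_number=2)
third = resolve_fixed(client, cache, stats, "owner/repo", 123, issue_number=3)
```

In this sketch the first lookup returns `None` and leaves the cache empty, the second refetches and caches the real score, and the third is served from the cache.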

This is a narrow cache-correctness fix:

Test plan

Two regression tests added to TestCacheStats in tests/validator/issue_discovery/test_mirror_scan.py:

…lable

When the mirror returns MirrorPullRequestFilesResponse with
scoring_data_stored=False, _resolve_solving_pr_score previously tokenized
the (typically empty) file list and wrote a CachedSolvingPR(0, 0) into
the per-cycle cross-miner cache. That poisons every later issue in the
same round that references the same solving PR — they hit the cached
zero, fail the MIN_TOKEN_SCORE_FOR_BASE_SCORE gate, and earn no
discovery score even after mirror backfill completes.

Treat scoring_data_stored=False as an availability signal (matching the
existing MirrorRequestError branch and the sibling OSS scoring path):
return None, increment cache_stats.fetch_failures, log enough context
to identify repo/PR/issue, and leave the cache untouched so a later
miner in the same cycle can retry.

Closes entrius#836
xiao-xiao-mao (bot) added the bug label Apr 28, 2026

Development

Successfully merging this pull request may close these issues.

[Bug] Mirror issue discovery caches unavailable solving-PR file data as a zero score
