fix: use GraphQL-only issue PR lookup #774
3e45c5b to e2c4356
0fa849b to 21f48b6
_search_issue_referencing_prs_rest built its query with the issue number as a bare token, so GitHub's search matched any PR whose title or body contained that number for unrelated reasons (version bumps, percentages, counts), and every returned item was mapped straight into PRInfo. Users saw unrelated PRs in 'gitt issues submissions'.

Add _pr_references_issue(title, body, issue_number, repo) and call it in the REST item loop. The predicate matches only GitHub-rendered reference forms that point at the searched repo:

- bare hash: entrius#42
- qualified: owner/repo#42
- URL proto: https://github.com/owner/repo/issues/42 (also www./m. hosts)
- URL bare: github.com/owner/repo/pull/42 (no protocol)

Qualified cross-repo refs (other/project#42) and URLs to different owners are rejected. Lookbehinds exclude word chars, dots, and hyphens so suffix-owner-name ambiguities (foo-owner/repo#42, sub-github.qkg1.top/...) do not false-match; GitHub owner names may contain hyphens. The compiled pattern is cached via functools.lru_cache since the filter runs per item, up to 50 items per search.

The GraphQL path (CROSS_REFERENCED_EVENT timeline) was already semantically verified and is untouched. The REST path now converges on the same guarantee. find_prs_for_issue cascade, retry/backoff, and the PRInfo shape are unchanged. Validator scoring is unaffected: find_solver_from_cross_references never touches the REST fallback.
21f48b6 to 7719ec7
…-search-false-positives
# Conflicts:
# gittensor/utils/github_api_tools.py
anderdc left a comment
The premise is right (REST bare-token search returns false positives) but the fix keeps the broken path and adds a regex on top. Cleaner: delete _search_issue_referencing_prs_rest and have find_prs_for_issue short-circuit on GraphQL alone.
- No-PAT CLI users get an empty list instead of a wrong one — wrong data is worse than no data for an inspection command.
- PAT users on GraphQL-empty get correct empty (the cross-reference set is genuinely empty, which is what GraphQL just told us).
- PAT users on GraphQL-error get empty after the existing retry/backoff in execute_graphql_query/get_github_graphql_query (max_attempts=8 in the latter); no need for a second data source.
Issue body confirms scoring isn't affected (find_solver_from_cross_references uses GraphQL only and never falls through), which bounds the blast radius. Drop the regex helper, the filter, and the two new test classes; keep the existing find_prs_for_issue tests and confirm they still pass under the simpler cascade.
Optional UX improvement: surface a "set GITTENSOR_MINER_PAT" hint in gitt issues submissions when no PAT is set, so the empty-list case is actionable instead of silent.
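The three cases above, plus the optional hint, could be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: find_prs_for_issue_sketch, missing_pat_hint, and the graphql_lookup callable are hypothetical names, and treating None as "GraphQL failed after its own retries" is an assumed convention.

```python
from typing import Callable, List, Mapping, Optional

PAT_ENV_VAR = "GITTENSOR_MINER_PAT"


def find_prs_for_issue_sketch(
    repo: str,
    issue_number: int,
    token: Optional[str],
    graphql_lookup: Callable[[str, int, str], Optional[List[dict]]],
) -> List[dict]:
    """GraphQL-only lookup: no REST fallback, empty list when nothing is known."""
    if not token:
        # No PAT: return nothing rather than possibly wrong search results.
        return []
    prs = graphql_lookup(repo, issue_number, token)
    # None (assumed to mean "failed after retries") and [] both yield an empty list.
    return prs or []


def missing_pat_hint(env: Mapping[str, str]) -> Optional[str]:
    """Return an actionable hint when no PAT is configured, else None."""
    if env.get(PAT_ENV_VAR):
        return None
    return f"No PAT configured; set {PAT_ENV_VAR} to list PRs that reference this issue."
```

A CLI command could call missing_pat_hint(os.environ) before the lookup and print the result when it is not None, so the empty list is explained instead of silent.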
just pushed the update, can you please review again?
Summary
REST issue/PR search matched the issue number as a bare text token, which can surface unrelated PRs whose title/body happen to contain that number. Rather than layering regex filtering on top of that imprecise source, this PR removes _search_issue_referencing_prs_rest and makes find_prs_for_issue rely only on GraphQL cross-reference timeline data.
Behavior now:
GITTENSOR_MINER_PAT.
Validator scoring remains unaffected because find_solver_from_cross_references already uses GraphQL-only cross-reference data and never used the REST fallback.
Related Issues
Closes #773
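For reference, the GraphQL cross-reference lookup that the summary relies on might look roughly like the sketch below. It uses GitHub's timelineItems field with the CROSS_REFERENCED_EVENT item type; the query text, the page size of 50, and the helper name prs_from_timeline are assumptions for illustration, and the actual query in gittensor/utils/github_api_tools.py may differ.

```python
from typing import Any, Dict, List

# Illustrative query: cross-reference events on an issue's timeline.
CROSS_REFERENCE_QUERY = """
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    issue(number: $number) {
      timelineItems(itemTypes: [CROSS_REFERENCED_EVENT], first: 50) {
        nodes {
          ... on CrossReferencedEvent {
            source {
              __typename
              ... on PullRequest { number title url state }
            }
          }
        }
      }
    }
  }
}
"""


def prs_from_timeline(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Extract pull-request sources from a response to CROSS_REFERENCE_QUERY."""
    issue = ((response.get("data") or {}).get("repository") or {}).get("issue") or {}
    nodes = (issue.get("timelineItems") or {}).get("nodes") or []
    prs = []
    for node in nodes:
        source = (node or {}).get("source") or {}
        # Cross-references can also come from other issues; keep pull requests only.
        if source.get("__typename") == "PullRequest":
            prs.append({k: source.get(k) for k in ("number", "title", "url", "state")})
    return prs
```

Because each node is an actual cross-reference event recorded by GitHub, a PR that merely contains the number in its text never appears here, which is the guarantee the bare-token REST search could not give.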
Type of Change
Testing
Ran:
uv run pytest tests/utils/test_github_api_tools.py tests/cli/test_issue_submission.py -> 82 passed
uv run ruff check gittensor/utils/github_api_tools.py gittensor/cli/issue_commands/helpers.py tests/utils/test_github_api_tools.py tests/cli/test_issue_submission.py -> passed
Checklist