Skip to content

fix: use GraphQL-only issue PR lookup#774

Open
MkDev11 wants to merge 6 commits intoentrius:testfrom
MkDev11:fix/tighten-rest-pr-search-false-positives
Open

fix: use GraphQL-only issue PR lookup#774
MkDev11 wants to merge 6 commits intoentrius:testfrom
MkDev11:fix/tighten-rest-pr-search-false-positives

Conversation

@MkDev11
Copy link
Copy Markdown
Contributor

@MkDev11 MkDev11 commented Apr 24, 2026

Summary

REST issue/PR search matched the issue number as a bare text token, which can surface unrelated PRs whose title/body happen to contain that number. Rather than layering regex filtering on top of that imprecise source, this PR removes _search_issue_referencing_prs_rest and makes find_prs_for_issue rely only on GraphQL cross-reference timeline data.

Behavior now:

  • No PAT: returns an empty list instead of risking wrong REST search data, and the human CLI output tells users to set GITTENSOR_MINER_PAT.
  • PAT + GraphQL empty: returns the correct empty cross-reference set.
  • PAT + GraphQL error: returns empty after the existing GraphQL retry/backoff path.

Validator scoring remains unaffected because find_solver_from_cross_references already uses GraphQL-only cross-reference data and never used the REST fallback.

Related Issues

Closes #773

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • Other (describe below)

Testing

  • Tests added/updated
  • Manually tested

Ran:

  • uv run pytest tests/utils/test_github_api_tools.py tests/cli/test_issue_submission.py -> 82 passed
  • uv run ruff check gittensor/utils/github_api_tools.py gittensor/cli/issue_commands/helpers.py tests/utils/test_github_api_tools.py tests/cli/test_issue_submission.py -> passed

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Changes are documented (if applicable)

@xiao-xiao-mao xiao-xiao-mao Bot added the bug Something isn't working label Apr 24, 2026
@MkDev11 MkDev11 force-pushed the fix/tighten-rest-pr-search-false-positives branch 2 times, most recently from 3e45c5b to e2c4356 Compare April 24, 2026 19:49
@MkDev11 MkDev11 changed the title fix: verify REST PR search items reference the issue before returning fix: verify REST PR search items reference the searched repo's issue Apr 24, 2026
@MkDev11 MkDev11 force-pushed the fix/tighten-rest-pr-search-false-positives branch from 0fa849b to 21f48b6 Compare April 24, 2026 20:46
_search_issue_referencing_prs_rest built its query with the issue number
as a bare token, so GitHub's search matched any PR whose title or body
contained that number for unrelated reasons (version bumps, percentages,
counts) and every returned item was mapped straight into PRInfo. Users
saw unrelated PRs in 'gitt issues submissions'.

Add _pr_references_issue(title, body, issue_number, repo) and call it in
the REST item loop. The predicate matches only GitHub-rendered reference
forms that point at the searched repo:

  - bare hash:  entrius#42
  - qualified:  owner/repo#42
  - URL proto:  https://github.com/owner/repo/issues/42 (also www./m. hosts)
  - URL bare:   github.com/owner/repo/pull/42 (no protocol)

Qualified cross-repo refs (other/project#42) and URLs to different owners
are rejected. Lookbehinds exclude word chars, dots, and hyphens so
suffix-owner-name ambiguities (foo-owner/repo#42, sub-github.qkg1.top/...)
do not false-match — GitHub owner names may contain hyphens. The
compiled pattern is cached via functools.lru_cache since the filter
runs per item, up to 50 items per search.

The GraphQL path (CROSS_REFERENCED_EVENT timeline) was already
semantically verified and is untouched. The REST path now converges on
the same guarantee. find_prs_for_issue cascade, retry/backoff, and the
PRInfo shape are unchanged. Validator scoring is unaffected —
find_solver_from_cross_references never touches the REST fallback.
@MkDev11 MkDev11 force-pushed the fix/tighten-rest-pr-search-false-positives branch from 21f48b6 to 7719ec7 Compare April 24, 2026 21:16
Copy link
Copy Markdown
Collaborator

@anderdc anderdc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The premise is right (REST bare-token search returns false positives) but the fix keeps the broken path and adds a regex on top. Cleaner: delete _search_issue_referencing_prs_rest and have find_prs_for_issue short-circuit on GraphQL alone.

  • No-PAT CLI users get an empty list instead of a wrong one — wrong data is worse than no data for an inspection command.
  • PAT users on GraphQL-empty get correct empty (the cross-reference set is genuinely empty, which is what GraphQL just told us).
  • PAT users on GraphQL-error get empty after the existing retry/backoff in execute_graphql_query / get_github_graphql_query (max_attempts=8 in the latter) — no need for a second data source.

Issue body confirms scoring isn't affected (find_solver_from_cross_references uses GraphQL only and never falls through), which bounds the blast radius. Drop the regex helper, the filter, and the two new test classes; keep the existing find_prs_for_issue tests and confirm they still pass under the simpler cascade.

Optional UX improvement: surface a "set GITTENSOR_MINER_PAT" hint in gitt issues submissions when no PAT is set, so the empty-list case is actionable instead of silent.

@MkDev11 MkDev11 changed the title fix: verify REST PR search items reference the searched repo's issue fix: use GraphQL-only issue PR lookup Apr 29, 2026
@MkDev11 MkDev11 requested a review from anderdc April 29, 2026 21:11
@MkDev11
Copy link
Copy Markdown
Contributor Author

MkDev11 commented Apr 29, 2026

The premise is right (REST bare-token search returns false positives) but the fix keeps the broken path and adds a regex on top. Cleaner: delete _search_issue_referencing_prs_rest and have find_prs_for_issue short-circuit on GraphQL alone.

  • No-PAT CLI users get an empty list instead of a wrong one — wrong data is worse than no data for an inspection command.
  • PAT users on GraphQL-empty get correct empty (the cross-reference set is genuinely empty, which is what GraphQL just told us).
  • PAT users on GraphQL-error get empty after the existing retry/backoff in execute_graphql_query / get_github_graphql_query (max_attempts=8 in the latter) — no need for a second data source.

Issue body confirms scoring isn't affected (find_solver_from_cross_references uses GraphQL only and never falls through), which bounds the blast radius. Drop the regex helper, the filter, and the two new test classes; keep the existing find_prs_for_issue tests and confirm they still pass under the simpler cascade.

Optional UX improvement: surface a "set GITTENSOR_MINER_PAT" hint in gitt issues submissions when no PAT is set, so the empty-list case is actionable instead of silent.

just pushed the update, can you please review again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] _search_issue_referencing_prs_rest surfaces false-positive PRs — bare-number query with no #N verification

2 participants