Skip to content

harden review readiness skill gates#600

Merged
bmdhodl merged 1 commit into
mainfrom
codex/skill-progression-hardening-20260612
Jun 12, 2026
Merged

harden review readiness skill gates#600
bmdhodl merged 1 commit into
mainfrom
codex/skill-progression-hardening-20260612

Conversation

@bmdhodl

@bmdhodl bmdhodl commented Jun 12, 2026

Copy link
Copy Markdown
Owner

Summary

  • Adds a review-readiness guard that keeps five recent PR-review skill gaps executable: source/fact ledgers, cross-platform concurrency proof, API collector data-shape tests, proof artifact integrity, and CI/workflow cost routing.
  • Expands the PR template with those five conditional gates and wires the guard into preflight plus make check via make review-readiness.
  • Hardens the Claude PR review workflow with pinned checkout/CLI version, shallow checkout, Python diff truncation, explicit untrusted-diff boundaries, and a 300s model-call timeout.
  • Saves the work/validation log under proof/skill-progression-2026-06-12/LOG.md.

Review Readiness

  • Public positioning claims have a source/fact ledger. N/A for this PR's copy; the gate is now enforced for future positioning PRs.
  • State, lock, file, or process-concurrency changes include cross-platform failure proof. N/A for runtime state; the gate is now enforced for future state changes.
  • External API collectors include response-shape, pagination, null, and partial-failure tests. N/A for collectors; the gate is now enforced for future collectors.
  • Proof artifacts include command, exit code, platform, and regenerated-after-review status. See proof/skill-progression-2026-06-12/LOG.md.
  • Workflow changes explain trigger scope, timeouts, concurrency, artifacts, and spend impact. Claude review now uses shallow checkout, bounded diff/model time, and pinned tools.

Proof

  • python scripts/review_readiness_guard.py -> passed
  • python -m pytest sdk/tests/test_review_readiness_guard.py sdk/tests/test_sdk_preflight.py sdk/tests/test_ci_guardrails.py -q -> 16 passed
  • python scripts/sdk_preflight.py -> passed
  • python -m ruff check sdk/agentguard/ scripts/generate_pypi_readme.py scripts/sdk_preflight.py scripts/sdk_release_guard.py scripts/ci_tools_requirements_guard.py scripts/review_readiness_guard.py -> passed
  • python scripts/ci_tools_requirements_guard.py -> passed
  • python scripts/generate_pypi_readme.py --check -> passed
  • python scripts/sdk_release_guard.py -> passed
  • python -m pytest sdk/tests/ -q -> 812 passed
  • python -m pytest sdk/tests/test_architecture.py -v -> 9 passed
  • python -m bandit -r sdk/agentguard/ -s B101,B110,B112,B311 -q -> passed with no findings
  • python -m pytest sdk/tests/ -v --cov=agentguard --cov-report=term-missing --cov-fail-under=80 -> 812 passed, 92.36% coverage
  • npm --prefix mcp-server ci && npm --prefix mcp-server test -> 10 passed, using a temporary npm cache because the default npm cache hit ENOSPC earlier
  • python -m pip install -e ./agentguard-mcp && cd agentguard-mcp && python -m ruff check agentguard_mcp tests && python -m pytest -> 15 passed

Risk And Rollback

Low. This is repo-process/workflow hardening plus tests and a proof log. Rollback is a straight revert of this PR.

Scope

  • Related ops/doc(s): NORTHSTAR, ROADMAP, DEFINITION_OF_DONE, PR template, Claude review workflow
  • Does this change the public API? No
  • Does this shift roadmap priority? No

Notes

@bmdhodl bmdhodl merged commit f98ba92 into main Jun 12, 2026
15 checks passed

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3046994629

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

# Pull the diff (cap size so we never blow the context window).
gh pr diff "$PR" --repo "$REPO" | head -c 200000 > /tmp/pr.diff
gh pr diff "$PR" --repo "$REPO" \
| python -c 'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[:200000])' \

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid buffering the entire PR diff before truncating

For very large PRs, this read()[:200000] still consumes the complete gh pr diff stream before applying the 200 KB cap, so the workflow can spend runner time/memory downloading and buffering a huge diff even though only the first chunk is sent to Claude. This regresses the previous streaming behavior of head -c; use a bounded read/streaming truncation that stops after the limit while still handling SIGPIPE cleanly.

Useful? React with 👍 / 👎.

@bmdhodl bmdhodl deleted the codex/skill-progression-hardening-20260612 branch June 12, 2026 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant