[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment #4663

2026-06-10T06:12:16Z

github-actions[bot]
Bot Jun 10, 2026

📊 Current CI/CD Pipeline Status

The repository has a comprehensive and mature CI/CD system with 19 non-agentic workflows and 40+ agentic workflows. At a high level:

- All actions are SHA-pinned (supply chain security ✅)
- Multi-tier test architecture: unit → integration → smoke → build-test
- Both classic YAML workflows and AI-driven agentic (.md) workflows co-exist
- CodeQL, npm audit (SARIF), and AI Security Guard run on every PR
- Performance benchmarking infrastructure exists but is schedule-only

Recent workflow success rates (from last 20 runs per workflow):

| Workflow | Success Rate | Notes |
| --- | --- | --- |
| Build Verification | ~100% | Stable |
| Integration Tests | ~100% | Stable |
| Test Coverage | ~100% | Stable |
| Smoke Claude | ~50% | Flaky |
| Smoke Gemini | 0% | Broken |
| Smoke Copilot BYOK AOAI | ~50% | Flaky |
| Documentation Maintainer | 0% | Broken |

✅ Existing Quality Gates

On Every PR

| Check | Workflow | Type |
| --- | --- | --- |
| TypeScript build (Node 20 + 22) | `build.yml` | Required |
| ESLint linting | `build.yml` + `lint.yml` | Required |
| Markdownlint | `lint.yml` | Required |
| TypeScript type checking | `test-integration.yml` | Required |
| Unit tests + coverage report | `test-coverage.yml` | Required |
| CodeQL (JS/TS + Actions) | `codeql.yml` | Required |
| npm audit (high/critical) | `dependency-audit.yml` | Required |
| API proxy unit tests | `build.yml` | Required |
| CLI proxy unit tests | `build.yml` | Required |
| Integration tests (all 5 jobs) | `test-integration-suite.yml` | Required |
| Chroot integration tests (4 jobs) | `test-chroot.yml` | Required |
| Setup action tests | `test-action.yml` | Required |
| Example scripts test | `test-examples.yml` | Required |
| Semantic PR title | `pr-title.yml` | Required |
| AI Security Guard | `security-guard.md` | Agentic (required) |
| AI Contribution Check | `contribution-check.md` | Agentic (required) |
| Link check | `link-check.yml` | Only on `.md` changes |
| Smoke tests (Claude/Copilot/Codex/Gemini) | `smoke-*.md` | Reaction-triggered |
| Multi-language build tests | `build-test.md` | Agentic, all PRs |

Scheduled / Post-Merge Only

- Performance benchmarking (daily)
- Security review & threat modelling (daily)
- Dependency security monitoring (daily)
- Config consistency auditing (weekdays)
- Duplicate code detection (daily)
- Refactoring scanner (daily)
- Token usage analysis (daily)
- Docs deployment (on main push)

🔍 Identified Gaps

🔴 High Priority

1. Critically low unit test coverage on core modules
cli.ts has 0% coverage and docker-manager.ts has 18% coverage — yet these are the two most critical files in the codebase. The overall coverage threshold of 38% statements / 30% branches is too low for a security-critical tool.

2. Integration test CI coverage gaps
The INTEGRATION-TESTS.md heat map shows that several integration test categories have no CI job running them on PRs:

- Domain/Network tests (50 tests) → ❌ no CI job
- Protocol/Security tests (100 tests) → ❌ no CI job
- Container/Ops tests (45 tests) → ❌ no CI job

These tests exist in tests/integration/ but run only when triggered manually or when matched by the integration suite's specific job patterns.

3. No container image vulnerability scanning on PR path
Docker images (squid, agent, api-proxy) are built for integration tests but never scanned for vulnerabilities (CVEs) during PR checks. A vulnerable base image could ship undetected.

4. No enforcement of required status checks (branch protection)
The assessment found no evidence of branch protection rules requiring specific checks to pass before merge, so the "Required" labels in the table above may not be enforced. Without required status checks, a failing CI run does not block merging.

5. Smoke tests are not automatic PR gates
Smoke tests for Claude, Copilot, Codex, and Gemini require emoji reaction triggers — they are not automatic PR gates. A breaking change to the firewall can merge without any live smoke test passing.

🟡 Medium Priority

6. --block-domains feature has zero test coverage
The domain deny-list feature shows ❌ across all test types (unit, integration, CI, smoke, build-test). This is a critical security feature with no automated validation.

7. --env-all flag has no test coverage
Listed as ❌ across all test types in the heat map. No test verifies this flag works or doesn't leak credentials.

8. Performance regression not gated on PRs
The performance benchmark runs daily on schedule but is not part of the PR pipeline. A 2× slowdown in container startup time could merge undetected.

9. Documentation build not validated on PRs
deploy-docs.yml only triggers on push to main. A docs-site breaking change can merge without catching build failures.

10. Docker warning stub tests are skipped
The heat map marks the Docker warning stub as ❌ **: tests exist but are skipped. They should either be fixed or explicitly recorded as known gaps.

11. Smoke Gemini is broken (0% success rate)
Smoke Gemini has failed in all recent runs. Until it is fixed, the Gemini engine path has no working integration validation.

12. No mutation testing
With 38% coverage, it is unclear how much of that coverage is meaningful. Mutation testing (e.g., Stryker) would reveal whether tests actually catch bugs vs. merely executing code paths.
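The difference between executing code and actually checking it (item 12) can be illustrated with a small sketch. Everything here is hypothetical: `isAllowed` is a toy allow-list check, not the project's code, and the "mutant" is written by hand to mimic the kind of single-operator change a tool such as Stryker generates automatically.

```typescript
// Function under test: a toy allow-list check (hypothetical, not project code).
function isAllowed(host: string, allowed: string[]): boolean {
  return allowed.indexOf(host) !== -1;
}

// A hand-written "mutant": the same function with one operator flipped.
function isAllowedMutant(host: string, allowed: string[]): boolean {
  return allowed.indexOf(host) === -1;
}

type Impl = (host: string, allowed: string[]) => boolean;

// Weak test: runs the code, so it counts toward coverage, but asserts nothing.
function weakTest(impl: Impl): boolean {
  impl("example.com", ["example.com"]);
  return true; // always passes
}

// Strong test: checks the result for both an allowed and a denied host.
function strongTest(impl: Impl): boolean {
  return impl("example.com", ["example.com"]) === true
    && impl("evil.test", ["example.com"]) === false;
}

// A mutant "survives" when a test still passes against it.
const mutantSurvivesWeak = weakTest(isAllowedMutant);     // true: bug goes unnoticed
const mutantSurvivesStrong = strongTest(isAllowedMutant); // false: bug is caught
```

Both tests give the same line coverage for `isAllowed`, but only the strong one fails when the logic is inverted. A mutation score reports how often that happens across many generated mutants.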

🟢 Low Priority

13. Link check only triggers on .md file changes
Broken links in code comments or documentation embedded in non-.md files won't be caught.

14. No IPv6 integration test in CI
IPv6 tests exist in tests/integration/ipv6.test.ts but the CI matrix does not include a job running them.

15. No artifact size monitoring
There is no check to prevent dist/ bundle size from growing unexpectedly.

16. No changelog/release notes validation on PRs
Automated release notes generation exists (update-release-notes.md) but there is no PR-level check to ensure commits are well-formed for release automation.

17. No docs site accessibility checks
The Astro Starlight docs site has no automated accessibility (a11y) checks in CI.

📋 Actionable Recommendations

1. Raise coverage thresholds and add per-file minimums

Issue: cli.ts (0%) and docker-manager.ts (18%) are untested
Solution: Add per-file Jest coverage thresholds in jest.config.js for the two critical files; raise global thresholds to 50% statements / 40% branches
Complexity: Low
Impact: High — forces incremental test writing for core modules
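A minimal sketch of what the per-file thresholds could look like, written as a typed object that `jest.config.js` could export. `coverageThreshold` and its `global` entry are standard Jest options, and Jest also accepts file paths or globs as keys. The `./src/...` paths and the per-file numbers are assumptions for illustration; the report names only the two files and the proposed global targets. Since current coverage is 38.39% statements / 31.78% branches, the global values below would fail until more tests land, so raising them in steps may be more practical.

```typescript
// Subset of Jest's per-entry threshold shape used in this sketch.
type Threshold = { statements?: number; branches?: number };

// Keys other than "global" are file paths or globs that Jest checks separately.
const coverageThreshold: Record<string, Threshold> = {
  // Global targets proposed in the recommendation above.
  global: { statements: 50, branches: 40 },
  // Hypothetical paths and starting minimums for the two critical files.
  "./src/cli.ts": { statements: 20 },
  "./src/docker-manager.ts": { statements: 30 },
};

// The Jest config would carry this object next to its existing options.
const jestConfig = { collectCoverage: true, coverageThreshold };
```

When a path key is present, Jest fails the run if that file falls below its own minimum, even when the global numbers pass.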

2. Add container image scanning to PR pipeline

Issue: Docker images are built but not scanned
Solution: Add a job using docker/scout-action or aquasecurity/trivy-action after the Build local containers step in test-integration-suite.yml
Complexity: Low

Solution snippet:

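```
- name: Scan agent image
  # Replace xxx with a full commit SHA, in line with the repository's SHA-pinning practice
  uses: aquasecurity/trivy-action@xxx
  with:
    image-ref: ghcr.io/github/gh-aw-firewall/agent:latest
    severity: HIGH,CRITICAL
    exit-code: '1'  # fail the job when HIGH or CRITICAL findings are reported
```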

Impact: High — catches base image CVEs before release

3. Add `--block-domains` and `--env-all` integration tests

Issue: Critical security features have no test coverage
Solution: Create tests/integration/block-domains.test.ts and tests/integration/env-all.test.ts; include in the appropriate CI job pattern
Complexity: Medium
Impact: High — closes security coverage gap
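One building block an `--env-all` test may need is a check that captured output does not contain credential values. The helper below is a hypothetical sketch: `findLeakedSecrets` and the sentinel values are illustrative names, and how the CLI is invoked and where its output is captured are left to the actual test.

```typescript
// Returns the names of secrets whose values appear verbatim in the captured output.
function findLeakedSecrets(output: string, secrets: Record<string, string>): string[] {
  const leaked: string[] = [];
  for (const key of Object.keys(secrets)) {
    const value = secrets[key];
    // Skip empty values: an empty string is a substring of every output.
    if (value && output.indexOf(value) !== -1) {
      leaked.push(key);
    }
  }
  return leaked;
}

// Sentinel values that a test would place in the environment before running the tool.
const sentinels = { API_KEY: "sentinel-key-123", DB_PASSWORD: "sentinel-pw-456" };
const capturedOutput = "starting agent\nusing key sentinel-key-123\ndone";
const leakedNames = findLeakedSecrets(capturedOutput, sentinels);
```

Using unique sentinel strings rather than real credentials keeps the test safe to log and makes a match unambiguous.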

4. Add docs site build check to PR pipeline

Issue: Docs-site breaks can only be caught post-merge
Solution: Add a pull_request trigger to deploy-docs.yml that builds (but does not deploy) the site:
```
pull_request:
  branches: [main]
  paths:
    - 'docs-site/**'
```
Complexity: Low
Impact: Medium

5. Add performance regression gate on PRs

Issue: Startup time regressions can merge silently
Solution: Add a lightweight perf gate job in test-integration-suite.yml that runs npm run benchmark with a single iteration and fails if startup time exceeds a threshold (e.g., 120% of baseline stored in benchmarks/history.json)
Complexity: Medium
Impact: Medium
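The threshold comparison itself is small. The sketch below assumes the baseline startup time has already been read from benchmarks/history.json, whose format the report does not describe; `exceedsBaseline` and its parameter names are hypothetical.

```typescript
// Returns true when the measured time is above baseline * tolerance.
// A tolerance of 1.2 corresponds to the "120% of baseline" example above.
function exceedsBaseline(measuredMs: number, baselineMs: number, tolerance = 1.2): boolean {
  if (!(baselineMs > 0)) {
    throw new Error("baseline must be a positive number");
  }
  return measuredMs > baselineMs * tolerance;
}

// Example: with a 1000 ms baseline, 1300 ms fails the gate and 1100 ms passes.
const regressed = exceedsBaseline(1300, 1000); // true
const withinBudget = exceedsBaseline(1100, 1000); // false
```

A single-iteration measurement on shared CI runners can be noisy, so the tolerance may need tuning before the gate is made blocking.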

6. Fix or formally disable Smoke Gemini

Issue: 0% success rate makes it noise rather than signal
Solution: Investigate the Gemini integration failure, fix the root cause, or disable the workflow until Gemini support is restored
Complexity: Low (to disable), Medium (to fix)
Impact: Medium — restores confidence in smoke test suite

7. Enable branch protection rules with required checks

Issue: CI failures may not block merging
Solution: Enable branch protection on main with required status checks for: Build Verification, TypeScript Type Check, Test Coverage, Integration Tests, CodeQL
Complexity: Low (configuration change)
Impact: High — prevents broken code from merging
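If the rules are applied through the GitHub REST API rather than the settings UI, the request body for `PUT /repos/{owner}/{repo}/branches/{branch}/protection` could be built as in the sketch below. The check names are copied from the recommendation and would need to match the names the workflows actually report. `buildProtectionBody`, `strict: true`, and `enforce_admins: true` are illustrative choices, not settings taken from the repository.

```typescript
// Top-level fields the branch protection endpoint expects in its body.
type BranchProtectionBody = {
  required_status_checks: { strict: boolean; contexts: string[] } | null;
  enforce_admins: boolean | null;
  required_pull_request_reviews: object | null;
  restrictions: object | null;
};

// Builds a body that requires the given status checks and leaves other rules unset.
function buildProtectionBody(checks: string[]): BranchProtectionBody {
  return {
    // strict: true also requires the branch to be up to date before merging (assumption).
    required_status_checks: { strict: true, contexts: checks },
    enforce_admins: true,
    required_pull_request_reviews: null,
    restrictions: null,
  };
}

const protectionBody = buildProtectionBody([
  "Build Verification",
  "TypeScript Type Check",
  "Test Coverage",
  "Integration Tests",
  "CodeQL",
]);
```

The resulting object would be sent as JSON with an authenticated request; the HTTP call is omitted here so the sketch stays free of network access and tokens.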

8. Fix or remove skipped Docker warning stub tests

Issue: Skipped tests give false coverage signal
Solution: Investigate why docker-warning-stub tests are skipped; either fix them or delete them and file a tracking issue
Complexity: Low
Impact: Medium

📈 Metrics Summary

| Metric | Value |
| --- | --- |
| Total workflows (YAML) | 19 |
| Total agentic workflows (.md) | 40+ |
| Workflows triggering on PRs | 17 |
| Unit test files | 60 |
| Integration test files | 35 |
| Unit tests (approx.) | ~200 |
| Integration tests (approx.) | ~265 |
| Statement coverage | 38.39% |
| Branch coverage | 31.78% |
| Statement threshold | 38% |
| Branch threshold | 30% |
| `cli.ts` coverage | 0% |
| `docker-manager.ts` coverage | 18% |
| Smoke Gemini success rate | 0% |
| Documentation Maintainer success rate | 0% |
| Smoke Claude success rate | ~50% |
| Smoke Copilot BYOK AOAI success rate | ~50% |
| All other workflow success rates | ~100% |

The pipeline is mature with strong breadth (multi-language, multi-engine, multi-tier), but depth is lacking in core module unit testing, container image scanning, and automatic enforcement of passing CI before merge.

Generated by CI/CD Pipelines and Integration Tests Gap Assessment · 201.1 AIC · ⊞ 37.2K · ◷

expires on Jun 17, 2026, 6:12 AM UTC
