feat: v4.0.0 — Novel R&D + Tool Coverage Expansion#19

Merged
zakirkun merged 2 commits into main from feat/v4-rd-roadmap
May 29, 2026

@zakirkun

Owner

Summary

v4.0.0 ships 14 items across 2 tracks: Novel AI/Agent R&D + Tool Coverage Expansion. 296 tests pass (+93% from v3 baseline of 153). All v3 hardening preserved.

Track A — AI / Agent R&D (7)

| ID | Item |
|----|------|
| A1 | RAG knowledge base: SQLite + FTS5 + optional embeddings; analyst grounding |
| A2 | Multi-agent debate triage: red/blue/judge over MEDIUM-fp findings |
| A3 | Vision-LLM screenshot analysis: playwright + OpenAI/Claude vision |
| A4 | Plugin contract + Ollama / OpenAI-compatible providers |
| A5 | Learned tool selection (offline ranker + telemetry) |
| A6 | Eval harness: 3 tiers (parser fixtures, workflow integration, agent grounding) |
| A7 | Judge model routing for think_deeply: ~10x cost reduction |
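
For readers unfamiliar with A1, here is a minimal sketch of keyword retrieval over SQLite FTS5. It is not Guardian's actual schema or API: the table name `kb` and the helpers `build_kb` and `query_kb` are hypothetical, embeddings are left out, and it assumes the local SQLite build ships with FTS5 enabled.

```python
import sqlite3


def build_kb(rows):
    """Create an in-memory FTS5 index from (title, body) pairs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE kb USING fts5(title, body)")
    conn.executemany("INSERT INTO kb (title, body) VALUES (?, ?)", rows)
    return conn


def query_kb(conn, text, limit=3):
    """Return titles of the best matches, ordered by FTS5's built-in BM25 rank."""
    # Quote each term so user input is treated as literal tokens, not FTS5 syntax.
    terms = " ".join('"' + t.replace('"', '""') + '"' for t in text.split())
    if not terms:
        return []
    cur = conn.execute(
        "SELECT title FROM kb WHERE kb MATCH ? ORDER BY rank LIMIT ?",
        (terms, limit),
    )
    return [row[0] for row in cur]
```

Space-separated terms in an FTS5 query are combined with an implicit AND, so a query such as "log4j JNDI" only returns entries containing both tokens.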

Track B — Tool Coverage Expansion (7)

| ID | Category | Tools |
|----|----------|-------|
| B8 | Active Directory | crackmapexec, bloodhound, kerbrute, impacket-secretsdump |
| B9 | Mobile Android | mobsf, apkleaks, objection |
| B10 | API fuzzers | schemathesis, restler, cariddi |
| B11 | SAST + secrets | semgrep, trufflehog, dependency-check |
| B12 | LLM red-team | garak, pyrit, prompt_fuzz |
| B13 | Burp/ZAP bridge | zap, burp |
| B14 | Output exporters | SARIF v2.1.0, DefectDojo, Slack |
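
To make B14 concrete, the sketch below shows roughly what a SARIF v2.1.0 document looks like when built from a list of findings. The finding dictionary shape (`rule_id`, `title`, `severity`, `uri`), the severity mapping, and the function `to_sarif` are assumptions for illustration, not the exporter shipped in this PR.

```python
import json

# Hypothetical severity names mapped onto the SARIF `level` values.
_LEVELS = {
    "critical": "error",
    "high": "error",
    "medium": "warning",
    "low": "note",
    "info": "note",
}


def to_sarif(findings, tool_name="guardian"):
    """Build a minimal SARIF 2.1.0 log (as a dict) from finding dicts."""
    rules = {}
    results = []
    for f in findings:
        # One rule entry per distinct rule id.
        rules.setdefault(
            f["rule_id"],
            {"id": f["rule_id"], "shortDescription": {"text": f["title"]}},
        )
        results.append({
            "ruleId": f["rule_id"],
            "level": _LEVELS.get(f["severity"].lower(), "warning"),
            "message": {"text": f["title"]},
            "locations": [
                {"physicalLocation": {"artifactLocation": {"uri": f["uri"]}}}
            ],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name, "rules": list(rules.values())}},
            "results": results,
        }],
    }
```

Serializing the result with `json.dumps(to_sarif(findings), indent=2)` gives a file in the general shape that SARIF consumers expect; a real exporter would likely add regions, fingerprints, and richer rule metadata.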

Numbers

  • 50 tools registered (was 31)
  • 6 AI providers (was 4) — added Ollama + OpenAI-compatible
  • 296 tests pass (was 153)
  • 8 new workflow YAMLs
  • New CLI surfaces: guardian kb, guardian telemetry

Security posture

All v3 hardening preserved:

  • Prompt-injection delimiters on every external input (incl. KB references, vision-derived text)
  • API key scrubbing at log/report write time
  • DNS-resolve scope validation (closes SSRF-class bypass)
  • Atomic-checkpointed session JSON
  • Confirmation gate for active+ tools
  • Lazy tool loading (startup <500ms despite 50 tools)
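
As an illustration of the DNS-resolve scope check listed above, the sketch below resolves a hostname first and accepts it only if every resolved address falls inside an allowed CIDR range. The function names and the `resolver` parameter are hypothetical and exist so the check can be exercised without network access; the real implementation in this repository may differ.

```python
import ipaddress
import socket


def resolve_ips(host):
    """Resolve a hostname to the set of IP addresses it currently maps to."""
    infos = socket.getaddrinfo(host, None)
    # Drop any IPv6 zone suffix (e.g. "%eth0") before parsing.
    return {ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos}


def in_scope(host, allowed_cidrs, resolver=resolve_ips):
    """Return True only if every address `host` resolves to is in an allowed network."""
    nets = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    try:
        ips = resolver(host)
    except OSError:
        # Resolution failures (socket.gaierror is an OSError) are treated as out of scope.
        return False
    return bool(ips) and all(any(ip in net for net in nets) for ip in ips)
```

Checking the resolved addresses rather than the hostname string is what blocks the case where an in-scope-looking name points at an internal or loopback address. A stricter variant would also connect to the vetted IP directly so a second DNS lookup cannot return a different answer.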

Test plan

  • pytest tests/ evals/ — 296 pass
  • No regression on v3 baseline (153 tests still green)
  • guardian --help startup time <500ms
  • guardian kb seed && guardian kb query "log4j JNDI" — retrieval works
  • guardian telemetry export ./reports --out t.jsonl && guardian telemetry train t.jsonl — round-trip works
  • End-to-end manual: web_pentest_with_debate against authorized DVWA instance
  • End-to-end manual: web_visual_pentest with --provider openai
  • End-to-end manual: SARIF export against GitHub code scanning
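
The startup-time item in the test plan can be checked with a small timing helper like the one below. This is only a sketch: `startup_ms` is a hypothetical helper, and taking the best of several runs is one reasonable way to reduce noise from a cold filesystem cache.

```python
import subprocess
import time


def startup_ms(cmd, runs=5):
    """Return the best wall-clock time in milliseconds over `runs` executions of `cmd`."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output; raise if the command exits with a non-zero status.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        best = min(best, (time.perf_counter() - start) * 1000)
    return best
```

With Guardian installed, `startup_ms(["guardian", "--help"]) < 500` would express the target stated in the test plan.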

Docs

  • CHANGELOG.md — full v4.0.0 entry
  • docs/V4_FEATURES.md — feature reference per item
  • docs/EVAL_GUIDE.md — eval harness usage
  • docs/PLUGIN_GUIDE.md — third-party provider/tool authoring
  • README.md — features section, latest updates, v4 quick-start
  • QUICKSTART.md — v4 quick-wins added

…ated security tools, AI providers, and workflow management, including Docker deployment.
Track A — AI/Agent R&D (7 items)
- A1 RAG knowledge base (SQLite + FTS5 + optional embeddings)
- A2 Multi-agent debate triage (red/blue/judge)
- A3 Vision-LLM screenshot analysis (playwright + visual triage)
- A4 Plugin contract + Ollama/OpenAI-compat providers
- A5 Learned tool selection (offline ranker + telemetry)
- A6 Eval harness (3 tiers: parser/workflow/agent)
- A7 Judge model upgrade for think_deeply

Track B — Tool Coverage Expansion (7 items)
- B8 Active Directory toolkit (4 tools)
- B9 Mobile Android toolkit (3 tools)
- B10 API-spec fuzzers (3 tools)
- B11 SAST + secrets at scale (3 tools)
- B12 LLM red-team toolkit (3 tools)
- B13 Burp/ZAP automation bridge (2 tools)
- B14 SARIF + DefectDojo + Slack exporters

50 tools registered. 6 AI providers. 296 tests pass (+93% from v3).
All v3 hardening preserved (key scrub, prompt-injection delimiters,
DNS-resolve scope, atomic checkpoints, lazy tool loading).

New CLI surfaces: guardian kb, guardian telemetry.
New workflows shipped: web_pentest_with_debate, web_visual_pentest,
ad_assessment, mobile_android, llm_redteam, sast_review, api_pentest_v2.
Copilot AI review requested due to automatic review settings May 29, 2026 03:16
@zakirkun zakirkun merged commit 68f79f8 into main May 29, 2026
1 of 6 checks passed

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
