Composable second-agent review for agentsys — scope creep detection, hallucination flagging, safety gates
A lightweight, self-hosted judge protocol that routes agent output through a configurable review before finalizing. No cloud platform required.
Before an agent's output reaches the user (or triggers irreversible action), agent-judge runs it through a specialized judge that evaluates:
- Scope creep: Did the agent do more than asked? Did it modify files outside its scope?
- Hallucination: Does the output contain claims not grounded in the provided context?
- Reversibility: Is the action reversible? Should it require explicit approval?
- Safety: Does the output contain credentials, PII, or dangerous shell commands?
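To make two of these checks concrete, here is a minimal Python sketch. It is not agent-judge's actual implementation: it assumes an agent's scope can be written as glob patterns and that safety problems can be approximated with a few regular expressions. The function names, the patterns, and the sample paths are all hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration; a real judge would need a far larger set.
SAFETY_PATTERNS = {
    "aws-access-key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "dangerous-shell": re.compile(r"\brm\s+-rf\s+/(?:\s|$)"),
}


def out_of_scope_files(changed_files, allowed_globs):
    """Return the changed files that match none of the allowed glob patterns."""
    return [
        path for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_globs)
    ]


def safety_findings(text):
    """Return the sorted names of the safety patterns that match the text."""
    return sorted(name for name, rx in SAFETY_PATTERNS.items() if rx.search(text))


# Example: the task was limited to src/auth/, but README.md was also touched.
changed = ["src/auth/login.py", "README.md"]
print(out_of_scope_files(changed, ["src/auth/*"]))  # ['README.md']
```

Hallucination and reversibility are harder to reduce to pattern matching, which is presumably why they are delegated to a judge rather than a static rule set.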
- /judge - Run judge review on current agent output or a specific artifact
- /judge-config - Configure judge thresholds and categories for this project
npm install -g agentsys
agentsys # select agent-judge from the marketplace

Or install directly:

agentsys install agent-judge

/judge --category scope-creep --threshold warn
/judge --input path/to/diff.txt --category all
/judge-config --block-on safety --warn-on scope-creep,hallucination
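When the artifact passed with `--input` is a diff, a scope-creep check first needs the list of files the diff touches. The sketch below shows one way to extract that list from unified diff text. It is an illustration only: `changed_files_from_diff` is a hypothetical helper, not part of agent-judge, and it naively assumes that no added content line itself begins with `+++ ` or `--- `.

```python
def _clean_path(raw, prefix):
    """Drop a trailing timestamp field and git's a/ or b/ path prefix."""
    path = raw.split("\t")[0].strip()
    return path[len(prefix):] if path.startswith(prefix) else path


def changed_files_from_diff(diff_text):
    """Collect the file paths named in the ---/+++ headers of a unified diff."""
    files = []
    old_path = None
    for line in diff_text.splitlines():
        if line.startswith("--- "):
            old_path = _clean_path(line[4:], "a/")
        elif line.startswith("+++ "):
            new_path = _clean_path(line[4:], "b/")
            # Deleted files have /dev/null as the new path, so fall back to the old one.
            path = old_path if new_path == "/dev/null" else new_path
            if path and path != "/dev/null" and path not in files:
                files.append(path)
            old_path = None
    return files
```

The resulting list can then be compared against whatever scope the task description implies.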
Each judge run produces a verdict: PASS | FLAG | BLOCK
- PASS: Output is clean, proceed
- FLAG: Issue detected but not blocking - annotate and continue with warning
- BLOCK: Critical issue - stop and require explicit human approval
Verdicts include a structured rationale explaining exactly what triggered the verdict.
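The verdict logic can be pictured as follows. This is a sketch under stated assumptions, not the JUDGE protocol itself: the `Finding` and `Verdict` shapes, their field names, and the rule that a blocking category always wins are hypothetical. The `block_on` and `warn_on` sets simply mirror the `--block-on` and `--warn-on` options of `/judge-config`.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    category: str  # e.g. "safety", "scope-creep", "hallucination"
    message: str   # human-readable explanation of what was detected


@dataclass
class Verdict:
    status: str                                    # "PASS", "FLAG", or "BLOCK"
    rationale: list = field(default_factory=list)  # findings that triggered the status


def decide(findings, block_on, warn_on):
    """Map findings to a verdict: block-on categories block, warn-on categories flag."""
    blocking = [f for f in findings if f.category in block_on]
    if blocking:
        return Verdict("BLOCK", blocking)
    flagged = [f for f in findings if f.category in warn_on]
    if flagged:
        return Verdict("FLAG", flagged)
    return Verdict("PASS")


# Example: a scope-creep finding under a policy that only blocks on safety.
findings = [Finding("scope-creep", "modified README.md, outside src/auth/")]
verdict = decide(findings, block_on={"safety"}, warn_on={"scope-creep", "hallucination"})
print(verdict.status)  # prints FLAG
```

Keeping the triggering findings on the verdict is what lets a reviewer see why a run was flagged or blocked without re-running the judge.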
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "agentsys judge --threshold warn --category scope-creep,safety" }]
      }
    ]
  }
}

agentsys judge --input "$(git diff main...HEAD)" --task "$TASK_DESCRIPTION" --threshold block

See JUDGE.md for the full JUDGE protocol v1.0 specification.
- Part of the agentsys ecosystem
- https://agentskills.io