Release Authority for AI Agents
AgentGate is a CI/CD-style release gate for AI agents. It analyzes trace evidence, blocks risky tool behavior, generates audit reports, and turns failures into controls for the next version.
AI agents are no longer just answering questions. They call tools, route user intent, read internal systems, and can trigger high-risk operational workflows.
Many teams still ship agent changes as if they were simple prompt edits: change the prompt, swap the model, add a tool, and hope nothing dangerous happens. That creates a release gap. A candidate version may improve normal answers while quietly regressing a dangerous capability, such as allowing the wrong role to reach a critical tool.
AgentGate brings software-style release control to AI agents. Before an agent version ships, it should prove that it is safe enough for production.
AgentGate reads Phoenix traces, evals, and annotations, applies release policy, detects risky tool behavior, and returns a clear release decision: APPROVED or BLOCKED.
In the reference demo:
| Agent version | Release decision | Why |
|---|---|---|
| v2 | BLOCKED | A policy-denied dangerous action still appears in the evidence. |
| v2.1 | APPROVED | The blocking issue was fixed and inherited release controls pass. The report can still surface non-blocking follow-up. |
The key idea is simple: failures should not just be observed. They should become release requirements for the next version.
flowchart TD
A["Production agent<br/>UX, RAG, routing, tools, side effects"]
B["OpenTelemetry / OpenInference spans"]
C["Arize Phoenix<br/>traces, evals, annotations"]
D["AgentGate release check<br/>Phoenix MCP or local JSONL"]
E["Metrics + effective policy<br/>deterministic gate"]
F["Dangerous evidence classifier<br/>high-risk sessions"]
G["Gemini diagnosis<br/>optional explanation only"]
H["Release decision<br/>APPROVED or BLOCKED"]
I["Audit bundle<br/>JSON artifacts + HTML report"]
J["Future controls<br/>regression gates and dataset candidates"]
A --> B --> C --> D --> E --> H --> I --> J
D --> F --> G --> I
The final decision is deterministic:
metrics_summary.json + effective policy thresholds -> APPROVED or BLOCKED
Review agents may explain, recommend, and plan, but they do not decide release outcomes.
AgentGate is an external safety layer. Production agents stay in their own repositories and emit trace evidence. AgentGate reads that evidence, applies policy, and writes an auditable release verdict.
| Component | Role in AgentGate |
|---|---|
| Arize Phoenix | Primary evidence backend for traces, spans, evals, eval labels, and annotations. |
| OpenTelemetry | Captures agent behavior, tool calls, and execution traces. |
| Model Context Protocol (MCP) | Evidence access layer for Phoenix span and trace retrieval. |
| Google ADK | Powers AgentGate's review agents for investigation and planning within the Google Cloud Agent Builder ecosystem. |
| Vertex AI Gemini | Supports risk explanation, pattern analysis, and follow-up planning. |
Current review agents:
| Review agent | Purpose |
|---|---|
| Pattern Finder | Identifies recurring safety failure patterns from trace evidence. |
| Dataset Planner | Proposes follow-up dataset items and future release-control candidates. |
Requirements: Python 3.11+ and uv.
uv sync
uv run pytest
uv run agentgate configs validateRun the local reference gate without Phoenix:
uv run agentgate release check --source local \
--evidence configs/agents/stability_ops/seed/v2_evidence.jsonl \
--output-dir artifacts/release/reference-v2
uv run agentgate release check --source local \
--evidence configs/agents/stability_ops/seed/v21_evidence.jsonl \
--output-dir artifacts/release/reference-v21Start the dashboard:
uv run uvicorn backend.agentgate.main:app --reloadOpen http://127.0.0.1:8000/.
For Phoenix-backed release checks, configure .env.example and follow docs/getting-started/RELEASE_PIPELINE.md.
AgentGate is agent-agnostic because agent-specific configuration lives in an AgentPack.
cp -r configs/agents/_template configs/agents/my_agent
export AGENTGATE_AGENT_PACK=configs/agents/my_agent
uv run agentgate configs validate --agent-pack configs/agents/my_agentThen emit spans according to the trace contract and run:
uv run agentgate release check --source phoenix \
--agent-version <candidate> \
--output-dir artifacts/release/<candidate>Integration guide: docs/integration/CONNECT_YOUR_AGENT.md
Every release check writes an offline audit bundle:
release_decision.json- deterministic release verdictmetrics_summary.json- gate metrics and provenancedangerous_sessions.json- selected high-risk sessionsregression_gates.json- future control candidatesaudit_manifest.json- artifact hashes and reproducibility reciperelease_report.html- offline release certificate and evidence dossier
Details: docs/integration/RELEASE_OUTPUT.md
Sample artifacts: examples/artifacts/
| Goal | Start here |
|---|---|
| Understand the product and demo | docs/PRD_PRODUCT.md |
| Watch the demo workflow | docs/REFERENCE_WORKFLOW.md |
| Connect another agent | docs/integration/README.md |
| Inspect architecture | docs/ARCHITECTURE.md |
| Run the full pipeline | docs/getting-started/RELEASE_PIPELINE.md |
Full documentation index: docs/README.md
AgentGate is distributed under the terms of the Apache License 2.0. See LICENSE for details.

