Summary
AgentThreatBench is the first evaluation suite that operationalizes the OWASP Top 10 for Agentic Applications (2026) into executable tasks. It was merged into UKGovernmentBEIS/inspect_evals, the official evaluation suite of the UK AI Security Institute (AISI).
What it measures
The benchmark comprises three tasks, each targeting an agentic attack surface that is directly relevant to GPT-4o and o3 agent deployments:
| Task | OWASP ID | Attack |
|---|---|---|
| Memory Poison | ASI06 | Adversarial entries in agent memory/RAG store |
| Autonomy Hijack | ASI01 | Indirect prompt injection via email tool output |
| Data Exfiltration | ASI01 | Indirect injection → PII leak via `send_message` tool |
Why this belongs in openai/evals
- Measures agentic security, a gap in the current openai/evals suite
- Dual-metric scoring (utility + security) shows whether security improvements come at the cost of capability
- Maintained by UK AISI, which makes it a credible, independent benchmark
- Directly relevant to OpenAI's Preparedness Framework and safety commitments
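To make the dual-metric idea concrete, here is a minimal sketch of how utility and security could be reported side by side. It is an illustration only: `SampleResult`, `dual_metric`, and the field names are hypothetical and are not taken from the AgentThreatBench or inspect_evals code, and the real scorers may define both metrics differently.

```python
from dataclasses import dataclass


@dataclass
class SampleResult:
    """Outcome of one agent run (hypothetical structure, not the benchmark's schema)."""
    task_completed: bool    # did the agent finish the legitimate user task?
    attack_succeeded: bool  # did the injected instruction take effect?


def dual_metric(results: list[SampleResult]) -> dict[str, float]:
    """Return the utility rate and the security rate over a batch of runs."""
    if not results:
        raise ValueError("results must be non-empty")
    n = len(results)
    # Utility: fraction of runs where the legitimate task was completed.
    utility = sum(r.task_completed for r in results) / n
    # Security: fraction of runs where the attack did NOT succeed.
    security = sum(not r.attack_succeeded for r in results) / n
    return {"utility": utility, "security": security}


# Made-up example data, for illustration only.
runs = [
    SampleResult(task_completed=True, attack_succeeded=False),
    SampleResult(task_completed=True, attack_succeeded=True),
    SampleResult(task_completed=False, attack_succeeded=False),
    SampleResult(task_completed=True, attack_succeeded=True),
]
scores = dual_metric(runs)
print(scores)  # {'utility': 0.75, 'security': 0.5}
```

Reporting the two numbers separately is the point: an agent that refuses everything would score perfectly on security and zero on utility, which a single blended score could hide.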
Proposal
Add AgentThreatBench as a reference eval in the openai/evals registry, or reference it in the evals documentation as a recommended agentic security benchmark.
Benchmark docs: https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/
Source: https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/agent_threat_bench
OWASP reference: https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/