Add AgentThreatBench: OWASP Agentic Top 10 security evaluation suite (UK AISI)

## Summary

[AgentThreatBench](https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/) is the first evaluation suite that operationalizes the **OWASP Top 10 for Agentic Applications (2026)** into executable tasks. It was merged into [UKGovernmentBEIS/inspect_evals](https://github.com/UKGovernmentBEIS/inspect_evals/pull/1037) — the official evaluation suite of the **UK AI Safety Institute (AISI)**.

## What it measures

Three tasks targeting distinct agentic attack surfaces that are directly relevant to GPT-4o and o3 agent deployments:

| Task | OWASP ID | Attack |
|---|---|---|
| Memory Poison | ASI06 | Adversarial entries in agent memory/RAG store |
| Autonomy Hijack | ASI01 | Indirect prompt injection via email tool output |
| Data Exfiltration | ASI01 | Indirect injection → PII leak via `send_message` tool |

## Why this belongs in openai/evals

- Measures **agentic security** — a gap in the current openai/evals suite
- Dual-metric scoring (utility + security) measures whether safety improvements come at the cost of capability
- Maintained by UK AISI — credible, independent benchmark
- Directly relevant to OpenAI's Preparedness Framework and safety commitments

## Proposal

Add AgentThreatBench as a reference eval in the openai/evals registry, or reference it in the evals documentation as a recommended agentic security benchmark.

**Benchmark docs:** https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/
**Source:** https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/agent_threat_bench
**OWASP reference:** https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add AgentThreatBench: OWASP Agentic Top 10 security evaluation suite (UK AISI) #1667

Summary

What it measures

Why this belongs in openai/evals

Proposal

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Task	OWASP ID	Attack
Memory Poison	ASI06	Adversarial entries in agent memory/RAG store
Autonomy Hijack	ASI01	Indirect prompt injection via email tool output
Data Exfiltration	ASI01	Indirect injection → PII leak via `send_message` tool

Add AgentThreatBench: OWASP Agentic Top 10 security evaluation suite (UK AISI) #1667

Description

Summary

What it measures

Why this belongs in openai/evals

Proposal

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions