Skip to content

Add AgentThreatBench: OWASP Agentic Top 10 security evaluation suite (UK AISI) #1667

@vgudur-dev

Description

@vgudur-dev

Summary

AgentThreatBench is the first evaluation suite that operationalizes the OWASP Top 10 for Agentic Applications (2026) into executable tasks. It was merged into UKGovernmentBEIS/inspect_evals — the official evaluation suite of the UK AI Safety Institute (AISI).

What it measures

Three tasks targeting distinct agentic attack surfaces that are directly relevant to GPT-4o and o3 agent deployments:

Task OWASP ID Attack
Memory Poison ASI06 Adversarial entries in agent memory/RAG store
Autonomy Hijack ASI01 Indirect prompt injection via email tool output
Data Exfiltration ASI01 Indirect injection → PII leak via send_message tool

Why this belongs in openai/evals

  • Measures agentic security — a gap in the current openai/evals suite
  • Dual-metric scoring (utility + security) measures whether safety improvements come at the cost of capability
  • Maintained by UK AISI — credible, independent benchmark
  • Directly relevant to OpenAI's Preparedness Framework and safety commitments

Proposal

Add AgentThreatBench as a reference eval in the openai/evals registry, or reference it in the evals documentation as a recommended agentic security benchmark.

Benchmark docs: https://ukgovernmentbeis.github.io/inspect_evals/evals/safeguards/agent_threat_bench/
Source: https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/agent_threat_bench
OWASP reference: https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions