Skip to content

Aditya-creator173/IPI

Repository files navigation

IPIBench: A Cross-Model Indirect Prompt Injection Benchmark

IPIBench is an empirical benchmark for studying indirect prompt injection (IPI) in LLM-based agent workflows. It focuses on how models behave when they are asked to process external content that may contain malicious embedded instructions.

The repository is designed for reproducible evaluation: fixed attack scenarios, explicit success criteria, and a scriptable execution pipeline.

What Is Indirect Prompt Injection?

Indirect prompt injection happens when an attacker places instructions inside content the model is asked to read, such as:

  • webpages
  • retrieved documents
  • tool output
  • copied text blocks

If the model treats those embedded instructions as valid commands, it can deviate from the user goal without obvious signs.

Benchmark Scope

This benchmark includes:

  • 100 hand-crafted attack scenarios in benchmark.json
  • multiple attack goals and evasion styles per scenario
  • per-scenario success indicators for automatic scoring
  • defense-mode comparisons in the same evaluation run

The current runner script in this repository evaluates four defense modes:

  • none
  • prompt_warning
  • spotlighting
  • input_filter

Key Findings

Work-in-Progress

Repository Structure

Setup

1) Create a virtual environment

Windows PowerShell:

python -m venv .venv
.\.venv\Scripts\Activate.ps1

macOS/Linux:

python3 -m venv .venv
source .venv/bin/activate

2) Install dependencies

pip install -r requirements.txt

3) Configure API keys

The benchmark runner reads API keys from environment variables. Use .env.example as a local template and keep your real .env file private.

Minimum providers currently used by the script:

  • Groq
  • Google GenAI

The requirements file also includes Mistral and OpenAI packages for future extension.

Windows PowerShell (current session):

$env:GROQ_API_KEY="your_groq_key"
$env:GOOGLE_API_KEY="your_google_key"

macOS/Linux (current session):

export GROQ_API_KEY="your_groq_key"
export GOOGLE_API_KEY="your_google_key"

If you also run scripts under benchmark_scripts/, you may need:

  • ANTHROPIC_API_KEY
  • OPENROUTER_API_KEY
  • GITHUB_TOKEN

Security Before Open-Sourcing

Use this checklist before pushing to GitHub:

  1. Confirm only template config is committed: .env.example is tracked, but real .env is not.
  2. Run a quick tracked-file secret scan:
git grep -nE '(api[_-]?key|apikey|secret|token|password)[[:space:]]*[:=][[:space:]]*["\x27][^"\x27]{8,}["\x27]|sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,}|AIza[0-9A-Za-z_-]{20,}|-----BEGIN [A-Z ]+PRIVATE KEY-----' -- .
  1. If anything looks like a real credential, remove it and rotate that key before publishing.
  2. Review notebook outputs and CSV artifacts for copied credentials before commit.

How To Run

Dry run (small sanity check):

python run_benchmark.py --dry-run

Full run:

python run_benchmark.py

Output Files

Model scripts write outputs directly to:

To merge all per-model CSV files into one final table:

python merge_results.py

This produces results/results_final.csv.

Reproducibility Checklist

  • pin package versions before final reporting
  • document model names and API versions used
  • log run date and hardware/network context
  • keep the exact benchmark.json used for each reported result
  • keep raw per-model files in results/csv/ and results/jsonl/ for auditability

Dataset

The benchmark dataset can also be published on Hugging Face.

Placeholder:

huggingface.co/datasets/your-username/IPIBench

Citation

If you use this benchmark in research, cite this repository and add your formal citation after publication.

Example placeholder:

@misc{ipibench2026,
	title={IPIBench: A Cross-Model Indirect Prompt Injection Benchmark},
	author={Aditya L},
	year={2026},
	howpublished={GitHub repository}
}

Author

Aditya L

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors