Awesome AI Security

A curated list of awesome AI security related frameworks, standards, learning resources and open source tools.

If you want to contribute, create a PR or contact me @ottosulin.

Learning Resources

Reading & Guides

OWASP ML TOP 10
OWASP LLM TOP 10
OWASP AI Security and Privacy Guide
NIST AIRC - NIST Trustworthy & Responsible AI Resource Center
The MLSecOps Top 10 by Institute for Ethical AI & Machine Learning
OWASP Multi-Agentic System Threat Modeling
OWASP: CheatSheet – A Practical Guide for Securely Using Third-Party MCP Servers 1.0

Courses, Labs & CTFs

Damn Vulnerable MCP Server - A deliberately vulnerable implementation of the Model Context Protocol (MCP) for educational purposes.
OWASP WrongSecrets LLM exercise
vulnerable-mcp-servers-lab - A collection of servers which are deliberately vulnerable to learn Pentesting MCP Servers.
FinBot Agentic AI Capture The Flag (CTF) Application - FinBot is an Agentic Security Capture The Flag (CTF) interactive platform that simulates real-world vulnerabilities in agentic AI systems using a simulated Financial Services-focused application.
AI-Red-Teaming-Playground-Labs - AI Red Teaming playground labs to run AI Red Teaming trainings including infrastructure.
Damn Vulnerable LLM Agent - Intentionally vulnerable LLM agent for learning about prompt injection, tool misuse, and agent security
AI GOAT - Damn Vulnerable LLM application with 10 challenges covering OWASP LLM Top 10 risks. Educational CTF-style lab.

Podcasts

MLSecOps podcast
AI Security Podcast
AI Security Ops - Weekly podcasts from Black Hills Information Security exploring how AI transforms cybersecurity—covering emerging threats, tools, and trends with practical, actionable knowledge.
GenAI Security podcast

Governance & Risk Management

Frameworks

Standards & Verification

Taxonomies, Terminology & Risk Databases

NIST AI 100-2e2023 - Adversarial machine learning taxonomy and terminology
MITRE ATLAS
AVIDML
MIT AI Risk Repository
AI Incident Database
ISO/IEC 22989:2022 Information technology — Artificial intelligence — Artificial intelligence concepts and terminology
NIST AI Glossary
The Arcanum Prompt Injection Taxonomy - Comprehensive prompt injection attack classification system covering attack intents, techniques, evasions, and input vectors. Categorizes goals, methods, obfuscation techniques, and attack surfaces for prompt injection attacks.
CSA LLM Threats Taxonomy

Checklists & Practical Guidance

Attack Techniques & Red Teaming

Adversarial ML & Classical Models

Adversarial Robustness Toolkit - ART focuses on the threats of Evasion (change the model behavior with input modifications), Poisoning (control a model with training data modifications), Extraction (steal a model through queries) and Inference (attack the privacy of the training data)
cleverhans - An adversarial example library for constructing attacks, building defenses, and benchmarking both
foolbox - A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
TextAttack - A Python framework for adversarial attacks, data augmentation, and model training in NLP
DeepFool - A simple and accurate method to fool deep neural networks
Adversarial Machine Learning Library (Ad-lib) - Game-theoretic adversarial machine learning library providing a set of learner and adversary modules
Exploring the Space of Adversarial Images
Deep-pwning - a lightweight framework for experimenting with machine learning models with the goal of evaluating their robustness against a motivated adversary
Counterfit - generic automation layer for assessing the security of machine learning systems
Malware Env for OpenAI Gym - makes it possible to write agents that learn to manipulate PE files (e.g., malware) to achieve some objective (e.g., bypass AV) based on a reward provided by taking specific manipulation actions
Charcuterie - code execution techniques for ML or ML adjacent libraries
OffsecML Playbook - A collection of offensive and adversarial TTPs with proofs of concept
BadDiffusion - Official repo to reproduce the paper "How to Backdoor Diffusion Models?" published at CVPR 2023
Snaike-MLFlow - MLflow red team toolsuite
secml-torch - SecML-Torch: A Library for Robustness Evaluation of Deep Learning Models
awesome-ai-safety

LLM & GenAI Red Teaming

garak - security probing tool for LLMs
ai-scanner - Open-source web application for AI model security assessments, built on NVIDIA garak. Features 179 probes, multi-target scanning, scheduled scans, ASR scoring, and SIEM integration.
promptfoo - Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
PyRIT - The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.
PurpleLlama - Set of tools to assess and improve LLM security.
augustus - LLM security testing framework for detecting prompt injection, jailbreaks, and adversarial attacks. 190+ probes, 28 providers, single Go binary. Production-ready with concurrent scanning, rate limiting, and retry logic.
FuzzyAI - A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
LLMFuzzer - LLMFuzzer is the first open-source fuzzing framework specifically designed for Large Language Models (LLMs), especially for their integrations in applications via LLM APIs.
EasyJailbreak - An easy-to-use Python framework to generate adversarial jailbreak prompts.
gptfuzz - Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
llamator - Framework for testing vulnerabilities of large language models (LLM).
agentic_security - Agentic LLM Vulnerability Scanner / AI red teaming kit
Agentic Radar - Open-source CLI security scanner for agentic workflows.
whistleblower - Offensive security tool for testing against system prompt leakage and capability discovery of an AI application exposed through API
promptmap - a prompt injection scanner for custom LLM applications
HouYi - The automated prompt injection framework for LLM-integrated applications.
spikee - Simple Prompt Injection Kit for Evaluation and Exploitation
ps-fuzz - Make your GenAI Apps Safe & Secure — Test & harden your system prompt
EasyEdit - Modify an LLM's ground truths
llm-attacks - Universal and Transferable Attacks on Aligned Language Models
Dropbox llm-security - Dropbox LLM Security research code and results
llm-security - New ways of breaking app-integrated LLMs
Plexiglass - A toolkit for detecting and protecting against vulnerabilities in Large Language Models (LLMs).
Prompt Hacking Resources - A list of curated resources for people interested in AI Red Teaming, Jailbreaking, and Prompt Injection
Giskard - Open-Source Evaluation & Testing for AI & LLM systems
blackice - BlackIce is an open-source containerized toolkit designed for red teaming AI models, including Large Language Models (LLMs) and classical machine learning (ML) models. Inspired by the convenience and standardization of Kali Linux in traditional penetration testing, BlackIce simplifies AI security assessments by providing a reproducible container image preconfigured with specialized evaluation tools.
vigil-llm - Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
ai-best-practices - Semgrep Pro Rules to ensure code using LLMs is following best practices. 58 rules, 102 sub-rules covering 6 providers + MCP + Claude Code & Cursor hooks + LangChain. Detects hardcoded API keys, prompt injection risks, missing safety checks, and unhandled errors across 7 languages.
G0DM0D3 - Open-source multi-model chat interface for red teaming with 50+ models via OpenRouter. Features GODMODE CLASSIC (5 jailbreak combos), ULTRAPLINIAN multi-model evaluation, Parseltongue input perturbation engine with 33 red team techniques, and AutoTune adaptive sampling for AI safety research.
claude-secure-coding-rules - Open-source security rules that guide Claude Code to generate secure code by default.

Agentic AI & MCP Attack Tools

mcp-injection-experiments - Code snippets to reproduce MCP tool poisoning attacks.
AI-Infra-Guard - A comprehensive, intelligent, and easy-to-use AI Red Teaming platform developed by Tencent Zhuque Lab. Integrates modules for Infra Scan, MCP Scan, and Jailbreak Evaluation, providing a one-click web UI, REST APIs, and Docker-based deployment for comprehensive AI security evaluation.
OpenPromptInjection - A benchmark for prompt injection attacks and defenses

AI-Assisted Offensive Security

PentestGPT - A GPT-empowered penetration testing tool
HackingBuddyGPT - Helping Ethical Hackers use LLMs in 50 Lines of Code or less
cai - Cybersecurity AI (CAI), an open Bug Bounty-ready Artificial Intelligence (paper)
shannon - Fully autonomous AI pentester for web apps and APIs by Keygraph. White-box security testing that analyzes source code, identifies attack vectors, and executes real exploits. 96.15% success rate (100/104 exploits) on XBOW benchmark.
strix - Strix are autonomous AI agents that act just like real hackers - they run your code dynamically, find vulnerabilities, and validate them through actual proof-of-concepts
redamon - AI-powered agentic red team framework that automates offensive security operations from reconnaissance to exploitation to post-exploitation with zero human intervention.
CyberStrikeAI - AI-native security testing platform built in Go. Integrates 100+ security tools with an intelligent orchestration engine, role-based testing with predefined security roles, skills system, and comprehensive lifecycle management. Uses MCP protocol and AI agents for end-to-end automation from conversational commands to vulnerability discovery.
HexStrikeAI - HexStrike AI MCP Agents is an advanced MCP server that lets AI agents (Claude, GPT, Copilot, etc.) autonomously run 150+ cybersecurity tools for automated pentesting, vulnerability discovery, bug bounty automation, and security research.
mcp-for-security - A collection of Model Context Protocol servers for popular security tools like SQLMap, FFUF, NMAP, Masscan and more. Integrate security testing and penetration testing into AI workflows.
mcp-security-hub - A growing collection of MCP servers bringing offensive security tools to AI assistants. Nmap, Ghidra, Nuclei, SQLMap, Hashcat and more.
Burp MCP Server - MCP Server for Burp
burpgpt - A Burp Suite extension that integrates OpenAI's GPT to perform an additional passive scan for discovering highly bespoke vulnerabilities and enables running traffic-based analysis of any type.
guardian-cli - AI-Powered Security Testing & Vulnerability Scanner. Guardian CLI is an intelligent security testing tool that leverages AI to automate penetration testing, vulnerability assessment, and security auditing.
AutoPentestX - AutoPentestX – Linux Automated Pentesting & Vulnerability Reporting Tool
HackGPT - A tool using ChatGPT for hacking

Benchmarks & Evaluations

ISC-Bench - Internal Safety Collapse: jailbreaks any frontier LLM (Claude Opus 4.6, GPT-5.4) in pass@3 via normal task completion — no adversarial prompting. Black-box, cross-domain, cross-science. Novel failure mode. [Paper]
jailbreakbench - JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
AgentDojo - A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
AIRTBench - Code Repository for: AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models
HackingBuddyGPT benchmark dataset - Benchmark dataset for automated pentesting
AgentDoG - AgentDoG is a risk-aware evaluation and guarding framework for autonomous agents. It focuses on trajectory-level risk assessment, aiming to determine whether an agent's execution trajectory contains safety risks under diverse application scenarios.

Defense & Security Controls

Input/Output Guardrails

NeMo-Guardrails - NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
LlamaFirewall - LlamaFirewall is a framework designed to detect and mitigate AI centric security risks, supporting multiple layers of inputs and outputs, such as typical LLM chat and more advanced multi-step agentic operations.
llm-guard - LLM Guard by Protect AI is a comprehensive tool designed to fortify the security of Large Language Models (LLMs).
Guardrails.ai - Guardrails is a Python package that lets a user add structure, type and quality guarantees to the outputs of large language models (LLMs)
TrustGate - Generative Application Firewall (GAF) to detect, prevent and block attacks against GenAI Applications
ZenGuard AI - The fastest Trust Layer for AI Agents
vibraniumdome - Full blown, end to end LLM WAF for Agents, allowing security teams governance, auditing, policy driven control over Agents usage of language models.
LocalMod - Self-hosted content moderation API with prompt injection detection, toxicity filtering, PII detection, and NSFW classification. Runs 100% offline.
DynaGuard - A Dynamic Guardrail Model With User-Defined Policies
AprielGuard - 8B parameter safety–security safeguard model
Safe Zone - Safe Zone is an open-source PII detection and guardrails engine that prevents sensitive data from leaking to LLMs and third-party APIs.
superagent - Superagent provides purpose-trained guardrails that make AI-agents secure and compliant.
ShellWard - AI Agent Security Middleware with 8-layer defense against prompt injection, data exfiltration & dangerous commands. Zero dependencies.
rebuff - Prompt Injection Detector
langkit - LangKit is an open-source text metrics toolkit for monitoring language models. The toolkit provides various security related metrics that can be used to detect attacks
CodeGate - An open-source, privacy-focused project that acts as a layer of security within a developer's Code Generation AI workflow

Agent Runtime Security & Sandboxing

OpenShell - _OpenShell is the safe, private runtime for autonomous AI agents. It provides sandboxed execution environments governed by declarative YAML policies that prevent unauthorized file access, data exfiltration, and uncontrolled network activity.
OpenSandbox - Secure, Fast, and Extensible Sandbox runtime for AI agents. Multi-language SDKs, Docker/Kubernetes runtimes, gVisor/Kata Containers/Firecracker isolation. CNCF Landscape project.
Aegis - Open-source EDR for AI agents by Antropos. Monitor processes, files, network, and behavior of autonomous AI agents in real time. No telemetry, no cloud, everything stays local.
Microsoft Agent Governance Toolkit - AI Agent Governance Toolkit from Microsoft — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.
agentfield - Open-source control plane for agent systems with cryptographic identity, policy enforcement, and audit-friendly observability.
leash - Leash wraps AI coding agents in containers and monitors their activity.
vibekit - Run Claude Code, Gemini, Codex — or any coding agent — in a clean, isolated sandbox with sensitive data redaction and observability baked in.
pipelock - Security harness for AI agents — egress proxy with DLP scanning, SSRF protection, MCP response scanning, and workspace integrity monitoring
skill-scanner - A security scanner for AI Agent Skills that detects prompt injection, data exfiltration, and malicious code patterns. Combines pattern-based detection (YAML + YARA), LLM-as-a-judge, and behavioral dataflow analysis for comprehensive threat detection.
Project CodeGuard - CoSAI Open Source Project for securing AI-assisted development workflows. CodeGuard provides security controls and guardrails for AI coding assistants to prevent vulnerabilities from being introduced during AI-generated code development.
AgentLens - Agent observability and replay tooling for AI safety & interpretability research. Harness for running multi-session agent trajectories, capturing them in ATIF format, and tracking file state changes across sessions. Built for studying LLM agent behavior across multi-turn, multi-session, multi-agent interactions.
claude-code-devcontainer - Sandboxed devcontainer for running Claude Code in bypass mode safely. Built for security audits and untrusted code review.
claude-code-safety-net - A Claude Code plugin that acts as a safety net, catching destructive git and filesystem commands before they execute
OneCLI - Open-source credential vault for AI agents. Rust HTTP gateway intercepts agent requests and injects API credentials transparently so agents never hold raw keys. AES-256-GCM encryption, per-agent scoping, full audit trail.
OpenSandbox - Secure, Fast, and Extensible Sandbox runtime for AI agents. Multi-language SDKs, Docker/Kubernetes runtimes, gVisor/Kata Containers/Firecracker isolation. CNCF Landscape project.
openclaw-shield - Security plugin for OpenClaw agents - prevents secret leaks, PII exposure, and destructive command execution
clawsec - Security scanner and hardening tool for OpenClaw deployments. Provides security assessments, configuration auditing, and vulnerability detection specifically for OpenClaw gateway and agent configurations.
nanoclaw - Lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, has memory, scheduled jobs, and runs directly on Anthropic's Agents SDK. First AI assistant to support Agent Swarms for collaborative agent teams.
secureclaw - Automated security hardening for OpenClaw AI agents by Adversa AI. 51 audit checks, 12 behavioral rules, 9 scripts, 4 pattern databases. Full OWASP ASI Top 10 coverage. Protects against prompt injection, credential theft, supply chain attacks, and privacy leaks.
defenseclaw - Enterprise governance layer for OpenClaw from Cisco AI Defense. Scans skills, MCP servers, and plugins with built-in CodeGuard SAST, tool call inspection engine, LLM guardrail proxy, and SIEM integration. Auto-blocks HIGH/CRITICAL findings.

MCP Security

MCP-Security-Checklist - A comprehensive security checklist for MCP-based AI tools. Built by SlowMist to safeguard LLM plugin ecosystems.
Awesome-MCP-Security - Everything you need to know about Model Context Protocol (MCP) security.
secure-mcp-gateway - This Secure MCP Gateway is built with authentication, automatic tool discovery, caching, and guardrail enforcement.
mcp-context-protector - context-protector is a security wrapper for MCP servers that addresses risks associated with running untrusted MCP servers, including line jumping, unexpected server configuration changes, and other prompt injection attacks
mcp-guardian - MCP Guardian manages your LLM assistant's access to MCP servers, handing you realtime control of your LLM's activity.
MCP Audit VSCode Extension - Audit and log all GitHub Copilot MCP tool calls in VSCode centrally with ease.
MCP-Scan - A security scanning tool for MCP servers

Model & Artifact Scanning

modelscan - ModelScan is an open source project from Protect AI that scans models to determine if they contain unsafe code.
picklescan - Security scanner detecting Python Pickle files performing suspicious actions
fickling - A Python pickling decompiler and static analyzer
medusa - AI-first security scanner with 74+ analyzers, 180+ AI agent security rules, intelligent false positive reduction. Supports all languages. CVE detection for React2Shell, mcp-remote RCE.
julius - LLM service fingerprinting tool for security professionals. Detects 32+ AI services (Ollama, vLLM, LiteLLM, Hugging Face TGI, etc.) during penetration tests and attack surface discovery. Uses HTTP-based service fingerprinting to identify server infrastructure.
a2a-scanner - Scan A2A agents for potential threats and security issues

AI-Assisted Defensive Security

Claude Code Security Review - An AI-powered security review GitHub Action using Claude to analyze code changes for security vulnerabilities.
GhidraGPT - Integrates GPT models into Ghidra for automated code analysis, variable renaming, vulnerability detection, and explanation generation.
IDAssist - AI-Powered Reverse Engineering Plugin for IDA Pro. Integrates LLM-powered analysis into IDA's interface with semantic knowledge graphs, RAG document search, and support for multiple LLM providers (OpenAI, Anthropic, Ollama, LiteLLM). Analyzes functions, suggests renames, answers questions about code.
ThreatForest - Agentic threat modeling platform built on Strands framework. Autonomously generates attack trees from repositories, maps attack steps to MITRE ATT&CK techniques, and produces actionable mitigation recommendations.
claude-grc-plugin - Claude Code plugin that turns Claude into a senior GRC analyst. 72+ reference files covering 15 frameworks (NIST 800-53, FedRAMP, ISO 27001, SOC 2, etc.), 24 slash commands, and deep domain knowledge for federal and commercial compliance work.
Vigil SOC - A comprehensive open-source security operations platform for AI agents, enabling real-time monitoring, threat detection, and incident response for AI-powered environments.

Privacy & Confidential Computing

Python Differential Privacy Library
Diffprivlib - The IBM Differential Privacy Library
TenSEAL - A library for doing homomorphic encryption operations on tensors
SyMPC - A Secure Multiparty Computation companion library for Syft
PyVertical - Privacy Preserving Vertical Federated Learning
Cloaked AI - Open source property-preserving encryption for vector embeddings
dstack - Open-source confidential AI framework for secure ML/LLM deployment with hardware-enforced isolation and data privacy
PrivacyRaven - privacy testing library for deep learning systems
PLOT4ai - Privacy Library Of Threats 4 Artificial Intelligence — A threat modeling library to help you build responsible AI

Data & Supply Chain Security

datasig - Dataset fingerprinting for AIBOM
OWASP AIBOM - AI Bill of Materials
Trusera ai-bom - AI Bill of Materials — discover every AI agent, model, and API in your infrastructure

Agentic AI Security Skills

Elastic Agent Skills - Collection of skills for Elastic's AI assistant, enabling natural language security investigations across logs, traces, and threat intelligence
Ghost Security Skills - Agent application security (appsec) skills and tools for Claude Code
tm_skills - Agent skills to help with Continuous Threat Modeling
Trail of Bits Skills Marketplace - Trail of Bits Claude Code skills for security research, vulnerability detection, and audit workflows
Semgrep Skills - Official Semgrep skills for Claude Code and other AI coding assistants. Provides security scanning, code analysis, and vulnerability detection capabilities directly in your AI-assisted development workflow.
claude-bug-bounty - Claude Code skill for AI-assisted bug bounty hunting. Automates reconnaissance, IDOR, XSS, SSRF, OAuth, GraphQL, and LLM injection testing with 4-gate validation checklist and report generation.
Anthropic Cybersecurity Skills - 734+ structured cybersecurity skills for AI agents. MITRE ATT&CK mapped, agentskills.io standard. Compatible with Claude Code, Copilot, Codex CLI, Cursor, and Gemini CLI.
llm-sast-scanner - SAST skill for AI coding agents (Claude Code, Codex, etc.) with structured vulnerability detection across 34 classes. Features source-to-sink taint analysis, Judge verification for false positive reduction, and 99%+ precision/recall on benchmarks.
sast-skills - Collection of agent skills that turn your AI coder into a SAST scanner

Security-Focused AI Models

VulnLLM-R-7B - Specialized reasoning LLM for vulnerability detection. Uses Chain-of-Thought reasoning to analyze data flow, control flow, and security context. Outperforms Claude-3.7-Sonnet and CodeQL on vulnerability detection benchmarks. Only 7B parameters making it efficient and fast.
Foundation-Sec-8B-Reasoning - Llama-3.1-FoundationAI-SecurityLLM-8B-Reasoning is an open-weight, 8-billion parameter instruction-tuned language model specialized for cybersecurity applications. It extends the Foundation-Sec-8B base model with instruction-following and reasoning capabilities.

Safety Classifiers & Prompt Injection Detection

Llama-Guard-4-12B - Meta's latest multimodal safety classifier for detecting harmful content in LLM inputs and outputs across text and image modalities.
Llama-Prompt-Guard-2-86M - Lightweight 86M parameter model from Meta for detecting prompt injection and jailbreak attempts in production LLM pipelines.
ShieldGemma-2B - Google's 2B parameter text safety classifier for detecting harmful content, built on the Gemma architecture.
DeBERTa Prompt Injection Detector v2 - Protect AI's DeBERTa-v3-base fine-tuned for prompt injection detection, widely used in production LLM guardrail pipelines.
Prompt Injection Sentinel - ModernBERT-large model fine-tuned for prompt injection and jailbreak classification with low false-positive rate.

Datasets

SafetyPrompts - Curated collection of safety-relevant prompts for evaluating LLM safety and security properties.
Do-Not-Answer - Dataset of prompts that responsible LLMs should not answer, for safety evaluation and red teaming.
JailBreakV-28K - Large-scale dataset of 28,000 jailbreak prompts for benchmarking LLM safety.
Leaked System Prompts - Collection of leaked system prompts from commercial AI tools — useful for understanding real-world prompt engineering and attack surfaces.

Name		Name	Last commit message	Last commit date
Latest commit History 251 Commits
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Awesome AI Security

Table of Contents

Learning Resources

Reading & Guides

Courses, Labs & CTFs

Podcasts

Governance & Risk Management

Frameworks

Standards & Verification

Taxonomies, Terminology & Risk Databases

Checklists & Practical Guidance

Attack Techniques & Red Teaming

Adversarial ML & Classical Models

LLM & GenAI Red Teaming

Agentic AI & MCP Attack Tools

AI-Assisted Offensive Security

Benchmarks & Evaluations

Defense & Security Controls

Input/Output Guardrails

Agent Runtime Security & Sandboxing

MCP Security

Model & Artifact Scanning

AI-Assisted Defensive Security

Privacy & Confidential Computing

Data & Supply Chain Security

Agentic AI Security Skills

Security-Focused AI Models

Safety Classifiers & Prompt Injection Detection

Datasets

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Awesome AI Security

Table of Contents

Learning Resources

Reading & Guides

Courses, Labs & CTFs

Podcasts

Governance & Risk Management

Frameworks

Standards & Verification

Taxonomies, Terminology & Risk Databases

Checklists & Practical Guidance

Attack Techniques & Red Teaming

Adversarial ML & Classical Models

LLM & GenAI Red Teaming

Agentic AI & MCP Attack Tools

AI-Assisted Offensive Security

Benchmarks & Evaluations

Defense & Security Controls

Input/Output Guardrails

Agent Runtime Security & Sandboxing

MCP Security

Model & Artifact Scanning

AI-Assisted Defensive Security

Privacy & Confidential Computing

Data & Supply Chain Security

Agentic AI Security Skills

Security-Focused AI Models

Safety Classifiers & Prompt Injection Detection

Datasets

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages