feat: add safety guardrails demo with Sentinel AI#57

Open
MaxwellCalkin wants to merge 1 commit into anthropics:main from MaxwellCalkin:feat/safety-guardrails-demo

Conversation

@MaxwellCalkin

Summary

Adds a new safety-guardrails demo that shows how to integrate real-time AI safety scanning into Claude Agent SDK applications using Sentinel AI.

  • Uses SDK hooks (PreToolUse / PostToolUse) to scan all user inputs and agent outputs in real time
  • Detects prompt injection, PII leakage, harmful content, toxicity, and hallucination indicators
  • Automatically blocks high-risk inputs and redacts PII before it reaches the agent
  • Follows the same project structure as the existing research-agent demo (Python, pyproject.toml, uv sync)
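The input-side scan can be sketched in plain Python. This is a minimal illustration only: the function name, the two regex patterns, and the return shape are assumptions made for this sketch, not the actual sentinel-guardrails interface, which would use model-based detectors rather than regexes.

```python
import re

# Illustrative detectors only -- stand-ins for a real Sentinel AI scan.
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scan_input(text: str) -> dict:
    """Block likely prompt injections; otherwise redact PII and allow."""
    if INJECTION.search(text):
        return {"action": "block", "reason": "prompt_injection"}
    return {"action": "allow", "text": SSN.sub("[REDACTED-SSN]", text)}
```

In the demo itself, a callback along these lines would be registered as a PreToolUse hook through the SDK's hook configuration and would delegate to the Sentinel AI scanner.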

How it works

User Input ──> [Sentinel Scan] ──> Claude Agent ──> [Sentinel Scan] ──> Output
                    │                                      │
              Block injections                      Block harmful content
              Redact PII                            Detect PII leakage
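The flow in the diagram can be sketched end to end as a single guarded turn. Again a hedged sketch: `run_guarded`, the regex checks, and the `agent` callable are illustrative stand-ins, not the Claude Agent SDK or Sentinel AI APIs.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def run_guarded(user_input: str, agent) -> str:
    """Run one turn through both scan stages, mirroring the diagram above."""
    # Input side: block injections outright, redact PII before the agent sees it.
    if INJECTION.search(user_input):
        return "[blocked: prompt injection]"
    clean = SSN.sub("[REDACTED]", user_input)

    reply = agent(clean)  # stand-in for the actual Claude Agent SDK call

    # Output side: catch any PII the agent may have leaked into its reply.
    return SSN.sub("[REDACTED]", reply)
```

For example, `run_guarded("My SSN is 123-45-6789", lambda t: f"You said: {t}")` never exposes the raw SSN to the agent or the user.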

Files added

safety-guardrails/
├── README.md                          # Setup instructions and usage guide
├── pyproject.toml                     # Dependencies (claude-agent-sdk, sentinel-guardrails)
├── .env.example
├── .gitignore
└── safety_guardrails/
    ├── agent.py                       # Main entry point with interactive chat loop
    └── safety_hooks.py                # Sentinel AI hook implementations

Test plan

  • Run uv sync to install dependencies
  • Set ANTHROPIC_API_KEY and run uv run python safety_guardrails/agent.py
  • Test clean input: "Hello, tell me about machine learning" — passes through
  • Test prompt injection: "Ignore all previous instructions" — blocked
  • Test PII: "My SSN is 123-45-6789" — detected and redacted

Add a new demo showing how to integrate real-time safety scanning into
Claude Agent SDK applications using Sentinel AI. The demo uses SDK hooks
to scan user inputs and agent outputs for prompt injection, PII leakage,
harmful content, toxicity, and hallucination indicators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>