Description
Problem
When a DIAL agent is configured with toolsets (e.g., database access, file system, external APIs), it can call those tools fully autonomously — including in response to malicious instructions delivered via indirect prompt injection. There is currently no mechanism to pause execution and request explicit user approval before a tool call is made.
This creates a critical risk when an agent has what can be described as the "three dangerous legs" simultaneously:
- Fully autonomous execution — the agent calls tools without human confirmation
- Access to the internet — the agent can reach external URLs (via web search, fetch, or other toolsets)
- Access to internal/sensitive data — the agent can read databases, documents, or internal APIs
Any attacker who can place malicious content in data the agent reads (a document, a web page, a database row) can hijack the agent's tool-calling behavior, causing it to exfiltrate, modify, or delete data without any user awareness.
Proposed solution
Add a configurable human-in-the-loop approval gate for toolset calls, similar to the security model implemented in popular AI coding assistants (e.g., Claude Code, Cursor, Windsurf):
- Per-toolset approval mode: each toolset can be configured as auto (no approval needed) or require-approval (user must confirm before each call).
- Approval UI: when the agent wants to call a tool, display a modal/inline prompt showing:
  - Parameters the agent wants to pass
- Session-level trust: optionally allow the user to approve a toolset "for this session" to reduce friction for trusted workflows.
- Audit log: all tool calls (approved or denied) are logged so administrators can review agent behavior.
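The behavior listed above could be sketched roughly as follows. This is a minimal illustration only, not DIAL's actual API: `ApprovalGate`, `ApprovalMode`, `Decision`, `ToolCallDenied`, and the `ask_user` callback (standing in for the approval UI) are all hypothetical names, and defaulting unknown toolsets to require-approval is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class ApprovalMode(Enum):
    AUTO = "auto"
    REQUIRE_APPROVAL = "require-approval"


class Decision(Enum):
    DENY = "deny"
    APPROVE_ONCE = "approve-once"
    APPROVE_SESSION = "approve-session"


class ToolCallDenied(Exception):
    """Raised when the user rejects a tool call (hypothetical)."""


@dataclass
class ApprovalGate:
    # Hypothetical per-toolset configuration: toolset name -> approval mode.
    modes: dict[str, ApprovalMode]
    # Stand-in for the approval UI; receives toolset, tool name, and parameters.
    ask_user: Callable[[str, str, dict[str, Any]], Decision]
    # Toolsets the user approved "for this session".
    session_trusted: set[str] = field(default_factory=set)
    # Every call is recorded here, whether it ran or was denied.
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(
        self,
        toolset: str,
        tool: str,
        params: dict[str, Any],
        execute: Callable[[dict[str, Any]], Any],
    ) -> Any:
        # Assumption: toolsets missing from the config fail closed.
        mode = self.modes.get(toolset, ApprovalMode.REQUIRE_APPROVAL)
        if mode is ApprovalMode.AUTO:
            outcome = "auto"
        elif toolset in self.session_trusted:
            outcome = "session-trusted"
        else:
            decision = self.ask_user(toolset, tool, params)
            if decision is Decision.APPROVE_SESSION:
                self.session_trusted.add(toolset)
            outcome = "denied" if decision is Decision.DENY else "approved"

        # Log before executing so denied calls are recorded too.
        self.audit_log.append(
            {"toolset": toolset, "tool": tool, "params": params, "outcome": outcome}
        )
        if outcome == "denied":
            raise ToolCallDenied(f"{toolset}.{tool} was denied by the user")
        return execute(params)
```

In this sketch the gate sits between the agent's tool-call request and the actual execution, so a call triggered by injected instructions still has to pass the user prompt unless the toolset is explicitly configured as auto or trusted for the session.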
Prior art / references
- Claude Code: prompts the user before executing bash commands, writing files, or calling external APIs
- Cursor / Windsurf: "agent mode" shows tool calls and asks for confirmation on destructive or network operations
- OWASP LLM Top 10 – LLM06: Excessive Agency (recommends human oversight for consequential agent actions)
Use case/motivation
In enterprise environments, DIAL is deployed with agents that have toolsets connecting to internal systems (databases, HR systems, internal APIs, document stores). These agents may also browse the web or process external documents.
Without an approval gate, a single successful prompt injection — for example, a malicious instruction hidden in a PDF the agent is asked to summarize — can cause the agent to autonomously call any tool it has access to, with arbitrary parameters, without the user ever knowing.
The combination of autonomous execution, internet access, and internal data access is among the most dangerous configurations for an AI agent. Popular developer tools have already recognized this and implemented human-in-the-loop gates. DIAL should offer the same protection for enterprise chat users, who may be less aware of these risks than developers.
Confidential information