
Add pending message queue and background tool execution#4980

Open
DouweM wants to merge 3 commits into main from background-tools

Conversation

Collaborator

@DouweM DouweM commented Apr 4, 2026

Supersedes #4979 (which had merge conflicts due to branch history).

Summary

  • Adds a pending message queue (ctx.enqueue_message() / agent_run.enqueue_message()) with 'steering' and 'follow_up' priorities, powered by an auto-injected PendingMessageDrainCapability
  • Adds background tool execution (@agent.tool(background=True) / Tool(background=True) / ToolDefinition.background), powered by an auto-injected BackgroundToolCapability that spawns asyncio tasks, returns immediate acks, and delivers results as follow-up messages
  • Both features are framework-level with clean user-facing APIs, internally implemented via auto-injected capabilities that dogfood the capability abstraction

Test plan

  • 8 new tests covering steering drain, follow-up drain preventing End, external enqueue via AgentRun, background tool execution, error handling, ack messages, non-background tools unaffected, RunContext queue access
  • All existing capability, agent, tool, toolset, and logfire tests pass
  • Full pyright: 0 errors
  • ruff lint + format: clean
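The queue semantics described above can be sketched standalone. This is a minimal illustration only: `PendingMessage` fields and the drain helper here are simplified stand-ins, not the PR's actual implementation.

```python
from collections import deque
from dataclasses import dataclass
from typing import Literal

Priority = Literal['steering', 'follow_up']


@dataclass
class PendingMessage:
    # Stand-in for the PR's PendingMessage: parts simplified to strings.
    parts: list[str]
    priority: Priority


def drain_by_priority(queue: deque[PendingMessage], priority: Priority) -> list[PendingMessage]:
    """Remove and return all messages of one priority, preserving FIFO order."""
    drained = [m for m in queue if m.priority == priority]
    remaining = [m for m in queue if m.priority != priority]
    queue.clear()
    queue.extend(remaining)
    return drained


queue: deque[PendingMessage] = deque()
queue.append(PendingMessage(['steer left'], 'steering'))
queue.append(PendingMessage(['background result'], 'follow_up'))
queue.append(PendingMessage(['steer right'], 'steering'))

# Steering messages drain before a model request; follow-ups stay queued
# until the agent would otherwise end.
steering = drain_by_priority(queue, 'steering')
print([m.parts for m in steering])  # [['steer left'], ['steer right']]
print(len(queue))  # 1
```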

🤖 Generated with Claude Code

Introduces two framework-level features, both powered by auto-injected capabilities:

**Pending Message Queue:**
- `PendingMessage` type with `'steering'` and `'follow_up'` priorities
- `enqueue_message()` on `RunContext` (from tools/hooks) and `AgentRun` (external code)
- `PendingMessageDrainCapability` drains steering messages before model requests
  and follow-up messages when the agent would otherwise end

**Background Tools:**
- `ToolDefinition.background` field, `Tool(background=True)`, `@agent.tool(background=True)`
- `BackgroundToolCapability` spawns asyncio tasks, returns immediate ack,
  delivers results as follow-up messages, waits before End, cleans up on run end

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the labels `size: L` (Large PR, 501-1500 weighted lines) and `feature` (New feature request, or PR implementing a feature) on Apr 4, 2026
Contributor

github-actions bot commented Apr 4, 2026

Docs Preview

commit: d12f52e
Preview URL: https://32f85f24-pydantic-ai-previews.pydantic.workers.dev

Contributor

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 13 new potential issues.

View 3 additional findings in Devin Review.


🚩 async for iteration path does NOT fire after_node_run hooks, breaking pending message drain

The AgentRun.__anext__ method at run.py:193-210 directly calls anext(self._graph_run) without firing any capability hooks (before_node_run, after_node_run, etc.). The existing warning in __aiter__ (run.py:183-189) only checks for has_wrap_node_run and warns about wrap_node_run hooks. However, the new PendingMessageDrainCapability and BackgroundToolCapability rely on after_node_run hooks, which also don't fire through async for. Users who iterate via async for node in agent_run: instead of agent_run.next(node) will silently miss pending message draining and background task waiting. The warning should be updated to mention after_node_run hooks as well, or these capabilities should be checked in __aiter__.

(Refers to lines 183-189)


Comment on lines +98 to +100
# Wait for at least one task to complete
self._completion_event.clear()
await self._completion_event.wait()

🔴 Race condition: _completion_event.clear() can drop already-set signals from completed tasks

In BackgroundToolCapability.after_node_run, between the check not self._tasks and self._completion_event.clear(), a background task could have completed during an earlier await in the agent loop (e.g. during the model request that produced the End result). If a task completed and called self._completion_event.set() but there are still OTHER pending tasks in self._tasks, the code enters the wait path, calls clear() (discarding the signal from the already-completed task), and then wait()s. If the remaining tasks also happened to complete (and remove themselves from _tasks) before the event loop yields back into _run(), the event will never be re-set, and wait() will block indefinitely.

While asyncio's cooperative scheduling reduces the window, the await self._completion_event.wait() itself is a yield point — if ALL remaining tasks complete and set the event during the same tick as clear(), the clear-then-wait pattern loses the signal. A safer approach would use asyncio.wait() on the actual task objects rather than a separate Event.

Prompt for agents
In BackgroundToolCapability.after_node_run (lines 95-104), the clear-then-wait pattern on a shared asyncio.Event can lose signals when tasks complete between clear() and wait(). Consider replacing the Event-based approach with asyncio.wait() on the actual task objects, e.g.:

  done, _ = await asyncio.wait(list(self._tasks.values()), return_when=asyncio.FIRST_COMPLETED)

This directly waits for at least one task to finish without risk of losing signals. The _completion_event field could then be removed entirely.
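The `asyncio.wait`-based fix suggested above can be demonstrated in isolation. This is an illustrative standalone sketch, not the PR's code:

```python
import asyncio


async def worker(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return name


async def main() -> list[str]:
    tasks = {
        'fast': asyncio.create_task(worker('fast', 0.01)),
        'slow': asyncio.create_task(worker('slow', 0.5)),
    }
    # Waiting on the task objects themselves cannot lose a completion:
    # a task that finished before this call is returned immediately in `done`,
    # unlike an Event whose set() can be discarded by a subsequent clear().
    done, pending = await asyncio.wait(tasks.values(), return_when=asyncio.FIRST_COMPLETED)
    results = sorted(t.result() for t in done)
    for t in pending:  # clean up the still-running task
        t.cancel()
    return results


print(asyncio.run(main()))  # ['fast']
```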

Comment on lines +58 to +62
drained = _drain_by_priority(ctx.pending_messages, 'steering')
if drained:
    parts = [part for msg in drained for part in msg.parts]
    request_context.messages.append(ModelRequest(parts=parts))
return request_context

🔴 Steering messages are appended to request_context.messages (the model's view) but not to state.message_history (the persistent history)

In PendingMessageDrainCapability.before_model_request (_pending_messages.py:58-61), steering messages are drained and appended to request_context.messages. However, request_context.messages is a copy of state.message_history (created at _agent_graph.py:808 via ctx.state.message_history[:]). After the model request completes, _agent_graph.py:839 writes messages back: ctx.state.message_history[:] = messages. But messages is set from request_context.messages at line 819 (messages = request_context.messages). So the steering messages are written back.

However, the new ModelRequest containing the steering parts is appended as a separate request after the existing messages, so the model sees it as its own request interleaved in the conversation rather than merged into the request that triggered this model call, which may confuse the model about conversation ordering. The steering ModelRequest also has no run_id or timestamp set, unlike the other requests (see _agent_graph.py:830-834), making it inconsistent with the rest of the message history.

Prompt for agents
The PendingMessageDrainCapability.before_model_request appends a new ModelRequest to request_context.messages without setting run_id or timestamp. All other ModelRequests in the message history have these fields set (see _agent_graph.py lines 830-834 which ensure the last request has a timestamp and run_id). The injected steering ModelRequest will be missing these fields, making it inconsistent with the rest of the message history and potentially breaking serialization or tooling that expects these fields.

At minimum, the ModelRequest created in _pending_messages.py:61 should include timestamp and run_id fields. You could pass them via the RunContext (which has run_id) and use now_utc() for the timestamp. For example:

  from pydantic_ai._utils import now_utc
  request_context.messages.append(ModelRequest(parts=parts, run_id=ctx.run_id, timestamp=now_utc()))

Also consider whether appending the steering message as a separate ModelRequest at the end of messages is the right approach, vs merging the parts into the existing last ModelRequest.

Comment on lines +269 to +270
if not isinstance(result, End) and self._graph_run.output is not None:
    self._graph_run._next = [self._node_to_task(result)]  # pyright: ignore[reportPrivateUsage,reportAttributeAccessIssue]

🔴 Private _graph_run._next hack assigns GraphTaskRequest instead of GraphTask, bypassing required conversion

At run.py:270, the code directly assigns [self._node_to_task(result)] to self._graph_run._next. The _node_to_task method returns GraphTaskRequest objects, but GraphRun._next expects EndMarker | Sequence[GraphTask] | None. The normal path through GraphRun.next() converts GraphTaskRequest to GraphTask via GraphTask.from_request() — this direct assignment bypasses that conversion.

While the value is typically overwritten on the next GraphRun.next() call, any intermediate access to self._graph_run.next_task would return objects of the wrong type. More critically, if the underlying graph async generator has already returned (exhausted after yielding EndMarker), subsequent asend() calls will raise StopAsyncIteration, which _advance_graph silently catches — causing it to return the input node unchanged, potentially creating an infinite loop.

Prompt for agents
The hack at run.py:269-270 directly mutates the private _next field of GraphRun, bypassing the GraphTaskRequest → GraphTask conversion that GraphRun.next() normally performs. This also assumes the graph's internal async generator is still alive after yielding EndMarker, but if the generator has returned (exhausted), subsequent asend() calls raise StopAsyncIteration which _advance_graph silently catches, leading to the node being returned unchanged (infinite loop).

A more robust approach would be to extend the GraphRun API to support 'un-ending' a run (e.g. a public method like reset_end(tasks) that properly converts GraphTaskRequests and restarts the iterator), rather than directly manipulating private internals. Alternatively, consider whether after_node_run should be able to convert End results to continuation nodes at all — perhaps the pending message drain should happen at a different point in the lifecycle.

Comment on lines +62 to +77
async def _run() -> None:
    try:
        result = await handler(args)
        ctx.enqueue_message(
            SystemPromptPart(f"Background tool '{tool_name}' (task {task_id}) completed.\nResult: {result}"),
            priority='follow_up',
        )
    except Exception as e:
        ctx.enqueue_message(
            SystemPromptPart(f"Background tool '{tool_name}' (task {task_id}) failed: {e}"),
            priority='follow_up',
        )
    finally:
        self._tasks.pop(task_id, None)
        self._completion_event.set()


🟡 Background task captures ctx (RunContext) closure that may be stale or reused across steps

In BackgroundToolCapability.wrap_tool_execute (_background_tools.py:62-76), the _run() async closure captures ctx (a RunContext) by reference. This ctx object is used later (potentially much later, after the task completes) to call ctx.enqueue_message(...). RunContext instances are created per-step via build_run_context (_agent_graph.py:1223-1247), and pending_messages is a shared deque from state.pending_messages, so the ctx captured here should still reference the correct shared pending_messages deque.

At the same time, ctx also contains messages, run_step, usage, and other step-specific state that will be stale when the background task completes. If the handler(args) call within the background task accesses ctx (which it does via the tool function's RunContext), it will see outdated messages and run_step values. This is a correctness issue for tools that depend on ctx.messages being current.
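The staleness concern can be illustrated with a minimal standalone sketch. `StepContext` and `spawn_background` are hypothetical stand-ins, not the PR's types:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class StepContext:
    # Hypothetical stand-in for a per-step RunContext: run_step is
    # step-scoped, while `pending` is the queue shared across steps.
    run_step: int
    pending: list[str] = field(default_factory=list)


async def spawn_background(ctx: StepContext) -> asyncio.Task[None]:
    async def _run() -> None:
        await asyncio.sleep(0)  # the agent loop advances while we run
        # Delivery via the shared queue is safe (same object every step),
        # but ctx.run_step is frozen at spawn time, so a tool reading it
        # here sees a stale value.
        ctx.pending.append(f'completed; ctx still says step {ctx.run_step}')
    return asyncio.create_task(_run())


async def main() -> list[str]:
    shared: list[str] = []
    task = await spawn_background(StepContext(run_step=1, pending=shared))
    # A later step builds a fresh context over the same shared queue.
    current = StepContext(run_step=2, pending=shared)
    await task
    assert current.run_step == 2  # the run has moved on
    return shared


print(asyncio.run(main()))  # ['completed; ctx still says step 1']
```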


Comment on lines +413 to +417
auto_caps: list[AbstractCapability[AgentDepsT]] = [
    PendingMessageDrainCapability(),
    BackgroundToolCapability(),
]
self._root_capability = CombinedCapability(auto_caps + capabilities)

🚩 Auto-injected capabilities add overhead to every agent, even without background tools

Both PendingMessageDrainCapability and BackgroundToolCapability are unconditionally prepended to every agent's capability list (agent/__init__.py:413-417). This means every agent run — even those that never use background tools or pending messages — pays the cost of:

  • for_run() creating a fresh BackgroundToolCapability instance every run (_background_tools.py:42-44)
  • before_model_request iterating the pending_messages deque (even when empty) on every model request
  • after_node_run checking isinstance(result, End) and self._tasks on every node
  • wrap_run wrapping the entire run with a try/finally
  • wrap_tool_execute checking tool_def.background for every single tool call

For the vast majority of agents that don't use these features, this is unnecessary overhead. A more targeted approach would be to only inject these capabilities when background=True tools are actually registered, or to make the checks essentially free (e.g., skip for_run allocation when no background tools exist).


Comment on lines +413 to +417
auto_caps: list[AbstractCapability[AgentDepsT]] = [
    PendingMessageDrainCapability(),
    BackgroundToolCapability(),
]
self._root_capability = CombinedCapability(auto_caps + capabilities)

🚩 Capability hook ordering relies on position in list, which is fragile

The after_node_run hooks in CombinedCapability (combined.py:184) execute in reversed order. The auto-injected capabilities at agent/__init__.py:413-417 are prepended as [PendingMessageDrainCapability, BackgroundToolCapability, ...user_caps]. In reversed order, BackgroundToolCapability.after_node_run runs before PendingMessageDrainCapability.after_node_run, which is critical: Background waits for task completion and enqueues follow-ups, then PendingMessageDrain converts End → ModelRequestNode. If a user adds a capability via from_spec or override that also manipulates after_node_run, the ordering could break. The coupling between these two auto-injected capabilities via positional ordering in the list is an implicit contract that isn't documented or enforced.
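The positional-ordering contract can be reproduced with toy stand-ins (simplified illustrations, not the real CombinedCapability):

```python
class Capability:
    name = 'base'

    def after_node_run(self, result: str, log: list[str]) -> str:
        return result


class PendingDrain(Capability):
    name = 'drain'

    def after_node_run(self, result: str, log: list[str]) -> str:
        log.append(self.name)
        return result


class BackgroundWait(Capability):
    name = 'background'

    def after_node_run(self, result: str, log: list[str]) -> str:
        log.append(self.name)
        return result


class Combined(Capability):
    def __init__(self, caps: list[Capability]) -> None:
        self.caps = caps

    def after_node_run(self, result: str, log: list[str]) -> str:
        # Mirrors the reversed iteration described above: capabilities
        # listed later run their after-hook first.
        for cap in reversed(self.caps):
            result = cap.after_node_run(result, log)
        return result


order: list[str] = []
Combined([PendingDrain(), BackgroundWait()]).after_node_run('end', order)
print(order)  # ['background', 'drain']
```

Reordering the list silently flips which hook sees the node result first, which is why the review calls the coupling an implicit contract.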


Comment on lines +78 to +84
follow_ups = _drain_by_priority(ctx.pending_messages, 'follow_up')
if not follow_ups:
    return result

parts = [part for msg in follow_ups for part in msg.parts]
request = ModelRequest(parts=parts)
return ModelRequestNode(request=request)

🚩 Follow-up drain creates ModelRequestNode without run_step increment or usage limit check

When PendingMessageDrainCapability.after_node_run redirects End to a ModelRequestNode (_pending_messages.py:83-84), this creates a new model request that bypasses the normal UserPromptNode flow. The new ModelRequestNode will trigger a model call, consuming tokens and incrementing usage. However, there's no explicit usage_limits check before this redirect — the check happens inside ModelRequestNode.run(). If the agent is near its usage limit, this redirect could cause UsageLimitExceeded to be raised during the follow-up model call, which would lose the follow-up context. This is arguably correct behavior (limits should be respected), but users may find it surprising that background tool results are silently lost when usage limits are reached.


Comment on lines +62 to +68
async def _run() -> None:
    try:
        result = await handler(args)
        ctx.enqueue_message(
            SystemPromptPart(f"Background tool '{tool_name}' (task {task_id}) completed.\nResult: {result}"),
            priority='follow_up',
        )

🚩 Background tool results delivered as SystemPromptPart may confuse models

In _background_tools.py:65-67, completed background tool results are delivered as SystemPromptPart messages. This is semantically odd — the result of a tool call is packaged as a system prompt rather than a ToolReturnPart or UserPromptPart. Some models may treat system prompts differently (e.g., with higher weight or special formatting). Since the original tool call already received an acknowledgment ToolReturnPart, there's no corresponding tool_call_id for a follow-up ToolReturnPart. Using SystemPromptPart is a pragmatic choice given the constraints, but it could lead to unexpected model behavior, especially with models that enforce strict system/user/assistant role alternation.


Comment on lines +94 to +95
PendingMessage,
PendingMessagePriority,

🚩 No documentation added for the pending messages and background tools features

The AGENTS.md rules state all changes need to 'update/add all relevant documentation, following the existing voice and patterns'. This PR adds significant new public APIs (PendingMessage, PendingMessagePriority, RunContext.enqueue_message, AgentRun.enqueue_message, background=True tool parameter) but no documentation files were changed. The features are exported in __init__.py and have docstrings, but there are no docs/ changes, no guide pages, and no updates to existing tool documentation to describe background tools or the pending message queue. This is flagged as an analysis rather than a bug since the PR may be WIP or documentation may be planned separately.



@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 5 additional findings in Devin Review.


Comment on lines +267 to +271
# If after_node_run redirected End to a new node (e.g. pending message drain),
# clear the graph run's end state so the run loop continues.
if not isinstance(result, End) and self._graph_run.output is not None:
    self._graph_run._next = [self._node_to_task(result)]  # pyright: ignore[reportPrivateUsage,reportAttributeAccessIssue]


🚩 Direct mutation of private GraphRun._next to override End state is fragile

At pydantic_ai_slim/pydantic_ai/run.py:269-270, the code directly mutates self._graph_run._next to clear the EndMarker when after_node_run transforms an End into a new node. This works because (1) GraphRun.output checks isinstance(self._next, EndMarker) and returns None for non-EndMarker values, and (2) the next call to GraphRun.next() overwrites _next with properly-typed GraphTask objects. However, the value written ([GraphTaskRequest]) doesn't match _next's expected type (Sequence[GraphTask]). This is a private API that could break silently if pydantic_graph changes its internals. The pyright: ignore comment acknowledges this. Consider whether GraphRun should expose a public method for resetting end state.

