Commit da9921e

docs: add agent-loop.md explaining the tool-use loop and completion signals (#1010)
* Add agent-loop.md describing the agent loop as seen from the perspective of an app using the copilot-sdk
* Add new feature 'The Agent Loop' to documentation
1 parent 2ac20f0 commit da9921e

File tree

2 files changed (+189, −0 lines)


docs/features/agent-loop.md

Lines changed: 188 additions & 0 deletions
# The Agent Loop

How the Copilot CLI processes a user message end-to-end: from prompt to `session.idle`.

## Architecture

```mermaid
graph LR
    App["Your App"] -->|send prompt| SDK["SDK Session"]
    SDK -->|JSON-RPC| CLI["Copilot CLI"]
    CLI -->|API calls| LLM["LLM"]
    LLM -->|response| CLI
    CLI -->|events| SDK
    SDK -->|events| App
```

The **SDK** is a transport layer — it sends your prompt to the **Copilot CLI** over JSON-RPC and surfaces events back to your app. The **CLI** is the orchestrator that runs the agentic tool-use loop, making one or more LLM API calls until the task is done.

## The Tool-Use Loop

When you call `session.send({ prompt })`, the CLI enters a loop:

```mermaid
flowchart TD
    A["User prompt"] --> B["LLM API call\n(= one turn)"]
    B --> C{"toolRequests\nin response?"}
    C -->|Yes| D["Execute tools\nCollect results"]
    D -->|"Results fed back\nas next turn input"| B
    C -->|No| E["Final text\nresponse"]
    E --> F(["session.idle"])

    style B fill:#1a1a2e,stroke:#58a6ff,color:#c9d1d9
    style D fill:#1a1a2e,stroke:#3fb950,color:#c9d1d9
    style F fill:#0d1117,stroke:#f0883e,color:#f0883e
```

The model sees the **full conversation history** on each call — system prompt, user message, and all prior tool calls and results.

**Key insight:** Each iteration of this loop is exactly one LLM API call, visible as one `assistant.turn_start` / `assistant.turn_end` pair in the event log. There are no hidden calls.
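
The loop above can be sketched in a few lines of TypeScript. This is an illustration of the control flow only, not the CLI's actual implementation; `callLLM` and `executeTool` are hypothetical stand-ins for the real API and tool plumbing:

```typescript
// Hypothetical types standing in for the real CLI internals.
type ToolRequest = { name: string; args: unknown };
type LLMResponse = { text: string; toolRequests: ToolRequest[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };

async function runAgentLoop(
  history: Message[],
  callLLM: (history: Message[]) => Promise<LLMResponse>,
  executeTool: (req: ToolRequest) => Promise<string>,
): Promise<string> {
  while (true) {
    // One iteration of this loop = one LLM API call = one turn.
    const response = await callLLM(history);
    if (response.toolRequests.length === 0) {
      // No tool requests: final answer, the loop ends, session.idle fires.
      return response.text;
    }
    // Execute every requested tool and feed the results back as the
    // next turn's input.
    for (const req of response.toolRequests) {
      const result = await executeTool(req);
      history.push({ role: "tool", content: result });
    }
  }
}
```

Note how the stop condition lives entirely in the model's response: the loop itself never decides it is done.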

## Turns — What They Are

A **turn** is a single LLM API call and its consequences:

1. The CLI sends the conversation history to the LLM
2. The LLM responds (possibly with tool requests)
3. If tools were requested, the CLI executes them
4. `assistant.turn_end` is emitted

A single user message typically results in **multiple turns**. For example, a question like "how does X work in this codebase?" might produce:

| Turn | What the model does | toolRequests? |
|------|---------------------|---------------|
| 1 | Calls `grep` and `glob` to search the codebase | ✅ Yes |
| 2 | Reads specific files based on search results | ✅ Yes |
| 3 | Reads more files for deeper context | ✅ Yes |
| 4 | Produces the final text answer | ❌ No → loop ends |

The model decides on each turn whether to request more tools or produce a final answer. Each call sees the **full accumulated context** (all prior tool calls and results), so it can make an informed decision about whether it has enough information.
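
From the app side, turn boundaries show up as `assistant.turn_start`/`assistant.turn_end` events, so you can tally them as they stream by. The sketch below uses a minimal stand-in emitter so it is self-contained; with a real SDK session, only the `session.on(...)` line applies:

```typescript
// Minimal stand-in for a session's event emitter, for illustration only.
class FakeSession {
  private handlers: Record<string, Array<() => void>> = {};
  on(type: string, handler: () => void): void {
    (this.handlers[type] ??= []).push(handler);
  }
  emit(type: string): void {
    for (const handler of this.handlers[type] ?? []) handler();
  }
}

// Each turn_start/turn_end pair corresponds to exactly one LLM API call.
const session = new FakeSession();
let turns = 0;
session.on("assistant.turn_start", () => { turns += 1; });

// Simulate the four-turn example from the table above.
for (let i = 0; i < 4; i++) {
  session.emit("assistant.turn_start");
  session.emit("assistant.turn_end");
}
console.log(turns); // 4
```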

## Event Flow for a Multi-Turn Interaction

```mermaid
flowchart TD
    send["session.send({ prompt: 'Fix the bug in auth.ts' })"]

    subgraph Turn1 ["Turn 1"]
        t1s["assistant.turn_start"]
        t1m["assistant.message (toolRequests)"]
        t1ts["tool.execution_start (read_file)"]
        t1tc["tool.execution_complete"]
        t1e["assistant.turn_end"]
        t1s --> t1m --> t1ts --> t1tc --> t1e
    end

    subgraph Turn2 ["Turn 2 — auto-triggered by CLI"]
        t2s["assistant.turn_start"]
        t2m["assistant.message (toolRequests)"]
        t2ts["tool.execution_start (edit_file)"]
        t2tc["tool.execution_complete"]
        t2e["assistant.turn_end"]
        t2s --> t2m --> t2ts --> t2tc --> t2e
    end

    subgraph Turn3 ["Turn 3"]
        t3s["assistant.turn_start"]
        t3m["assistant.message (no toolRequests)\n'Done, here's what I changed'"]
        t3e["assistant.turn_end"]
        t3s --> t3m --> t3e
    end

    idle(["session.idle — ready for next message"])

    send --> Turn1 --> Turn2 --> Turn3 --> idle
```

## Who Triggers Each Turn?

| Actor | Responsibility |
|-------|----------------|
| **Your app** | Sends the initial prompt via `session.send()` |
| **Copilot CLI** | Runs the tool-use loop — executes tools and feeds results back to the LLM for the next turn |
| **LLM** | Decides whether to request tools (continue looping) or produce a final response (stop) |
| **SDK** | Passes events through; does not control the loop |

The CLI is purely mechanical: "model asked for tools → execute → call model again." The **model** is the decision-maker for when to stop.

## `session.idle` vs `session.task_complete`

These are two different completion signals with very different guarantees:

### `session.idle`

- **Always emitted** when the tool-use loop ends
- **Ephemeral** — not persisted to disk, not replayed on session resume
- Means: "the agent has stopped processing and is ready for the next message"
- **Use this** as your reliable "done" signal

The SDK's `sendAndWait()` method waits for this event:

```typescript
// Blocks until session.idle fires
const response = await session.sendAndWait({ prompt: "Fix the bug" });
```
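
If you need the same wait without `sendAndWait()` — for example, to combine it with other logic — an equivalent can be sketched as a promise that resolves on `session.idle`. The minimal `on()` shape below is an assumption based on the listener examples in this doc:

```typescript
// Resolve once session.idle fires (sketch; assumes an on(event, handler) API).
function waitForIdle(
  session: { on(event: string, handler: () => void): void },
): Promise<void> {
  return new Promise((resolve) => {
    session.on("session.idle", () => resolve());
  });
}

// Usage sketch:
//   const idle = waitForIdle(session);   // subscribe first
//   await session.send({ prompt: "Fix the bug" });
//   await idle;
```

Attaching the listener before sending matters: a fast-finishing loop can't slip past a subscription that is already in place.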

### `session.task_complete`

- **Optionally emitted** — requires the model to explicitly signal it
- **Persisted** — saved to the session event log on disk
- Means: "the agent considers the overall task fulfilled"
- Carries an optional `summary` field

```typescript
session.on("session.task_complete", (event) => {
  console.log("Task done:", event.data.summary);
});
```

### Autopilot mode: the CLI nudges for `task_complete`

In **autopilot mode** (headless/autonomous operation), the CLI actively tracks whether the model has called `task_complete`. If the tool-use loop ends without it, the CLI injects a synthetic user message nudging the model:

> *"You have not yet marked the task as complete using the task_complete tool. If you were planning, stop planning and start implementing. You aren't done until you have fully completed the task."*

This effectively restarts the tool-use loop — the model sees the nudge as a new user message and continues working. The nudge also instructs the model **not** to call `task_complete` prematurely:

- Don't call it if you have open questions — make decisions and keep working
- Don't call it if you hit an error — try to resolve it
- Don't call it if there are remaining steps — complete them first

This creates a **two-level completion mechanism** in autopilot:

1. The model calls `task_complete` with a summary → CLI emits `session.task_complete` → done
2. The model stops without calling it → CLI nudges → model continues or calls `task_complete`
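
As pseudocode, the two levels might look like the sketch below. This is a guess at the shape of the logic, not the CLI's source; `runToolUseLoop` and the cap on nudges are invented for illustration:

```typescript
// Illustrative sketch of autopilot's completion check (invented names/limits).
const NUDGE_MESSAGE =
  "You have not yet marked the task as complete using the task_complete tool. " +
  "If you were planning, stop planning and start implementing. " +
  "You aren't done until you have fully completed the task.";

async function autopilot(
  runToolUseLoop: (userMessage: string) => Promise<{ taskComplete: boolean }>,
  prompt: string,
  maxNudges = 3, // hypothetical safety cap, not a documented CLI behavior
): Promise<boolean> {
  // Level 1: run the normal tool-use loop on the user's prompt.
  let outcome = await runToolUseLoop(prompt);
  // Level 2: if the loop ended without task_complete, nudge and rerun.
  for (let i = 0; i < maxNudges && !outcome.taskComplete; i++) {
    outcome = await runToolUseLoop(NUDGE_MESSAGE);
  }
  return outcome.taskComplete;
}
```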

### Why `task_complete` might not appear

In **interactive mode** (normal chat), the CLI does not nudge for `task_complete`. The model may skip it entirely. Common reasons:

- **Conversational Q&A**: The model answers a question and simply stops — there's no discrete "task" to complete
- **Model discretion**: The model produces a final text response without calling the task-complete signal
- **Interrupted sessions**: The session ends before the model reaches a completion point

The CLI emits `session.idle` regardless, because it's a mechanical signal (the loop ended), not a semantic one (the model thinks it's done).

### Which should you use?

| Use case | Signal |
|----------|--------|
| "Wait for the agent to finish processing" | `session.idle` |
| "Know when a coding task is done" | `session.task_complete` (best-effort) |
| "Timeout/error handling" | `session.idle` + `session.error` |
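
For the last row of the table, `session.idle` and `session.error` can be raced against a deadline. The `on()` shape is again an assumption based on the listener examples in this doc, and the timeout policy is application-specific:

```typescript
// Wait for session.idle, but fail fast on session.error or a deadline (sketch).
function waitForIdleWithTimeout(
  session: { on(event: string, handler: (event?: { data?: unknown }) => void): void },
  timeoutMs: number,
): Promise<void> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("agent timed out")), timeoutMs);
    session.on("session.idle", () => {
      clearTimeout(timer);
      resolve();
    });
    session.on("session.error", (event) => {
      clearTimeout(timer);
      reject(new Error(`agent error: ${String(event?.data)}`));
    });
  });
}
```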

## Counting LLM Calls

The number of `assistant.turn_start` / `assistant.turn_end` pairs in the event log equals the total number of LLM API calls made. There are no hidden calls for planning, evaluation, or completion checking.

To inspect turn count for a session:

```bash
# Count turns in a session's event log
grep -c "assistant.turn_start" ~/.copilot/session-state/<sessionId>/events.jsonl
```

## Further Reading

- [Streaming Events Reference](./streaming-events.md) — Full field-level reference for every event type
- [Session Persistence](./session-persistence.md) — How sessions are saved and resumed
- [Hooks](./hooks.md) — Intercepting events in the loop (permissions, tools)

docs/features/index.md

Lines changed: 1 addition & 0 deletions

These guides cover the capabilities you can add to your Copilot SDK application.

| Feature | Description |
|---|---|
| [The Agent Loop](./agent-loop.md) | How the CLI processes a prompt — the tool-use loop, turns, and completion signals |
| [Hooks](./hooks.md) | Intercept and customize session behavior — control tool execution, transform results, handle errors |
| [Custom Agents](./custom-agents.md) | Define specialized sub-agents with scoped tools and instructions |
| [MCP Servers](./mcp.md) | Integrate Model Context Protocol servers for external tool access |