# The Agent Loop

How the Copilot CLI processes a user message end-to-end: from prompt to `session.idle`.

## Architecture

```mermaid
graph LR
    App["Your App"] -->|send prompt| SDK["SDK Session"]
    SDK -->|JSON-RPC| CLI["Copilot CLI"]
    CLI -->|API calls| LLM["LLM"]
    LLM -->|response| CLI
    CLI -->|events| SDK
    SDK -->|events| App
```

The **SDK** is a transport layer — it sends your prompt to the **Copilot CLI** over JSON-RPC and surfaces events back to your app. The **CLI** is the orchestrator that runs the agentic tool-use loop, making one or more LLM API calls until the task is done.

## The Tool-Use Loop

When you call `session.send({ prompt })`, the CLI enters a loop:

```mermaid
flowchart TD
    A["User prompt"] --> B["LLM API call\n(= one turn)"]
    B --> C{"toolRequests\nin response?"}
    C -->|Yes| D["Execute tools\nCollect results"]
    D -->|"Results fed back\nas next turn input"| B
    C -->|No| E["Final text\nresponse"]
    E --> F(["session.idle"])

    style B fill:#1a1a2e,stroke:#58a6ff,color:#c9d1d9
    style D fill:#1a1a2e,stroke:#3fb950,color:#c9d1d9
    style F fill:#0d1117,stroke:#f0883e,color:#f0883e
```

The model sees the **full conversation history** on each call — system prompt, user message, and all prior tool calls and results.

**Key insight:** Each iteration of this loop is exactly one LLM API call, visible as one `assistant.turn_start` / `assistant.turn_end` pair in the event log. There are no hidden calls.

## Turns — What They Are

A **turn** is a single LLM API call and its consequences:

1. The CLI sends the conversation history to the LLM
2. The LLM responds (possibly with tool requests)
3. If tools were requested, the CLI executes them
4. `assistant.turn_end` is emitted

A single user message typically results in **multiple turns**. For example, a question like "how does X work in this codebase?" might produce:

| Turn | What the model does | toolRequests? |
|------|---------------------|---------------|
| 1 | Calls `grep` and `glob` to search the codebase | ✅ Yes |
| 2 | Reads specific files based on search results | ✅ Yes |
| 3 | Reads more files for deeper context | ✅ Yes |
| 4 | Produces the final text answer | ❌ No → loop ends |

The model decides on each turn whether to request more tools or produce a final answer. Each call sees the **full accumulated context** (all prior tool calls and results), so it can make an informed decision about whether it has enough information.
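
The loop can be sketched as a runnable simulation. The `callModel` and `executeTool` stubs below are illustrative stand-ins, not the CLI's actual internals; the stub model requests tools three times and then answers, matching the four-turn example above:

```typescript
// A minimal simulation of the tool-use loop. One iteration of the while
// loop corresponds to one LLM API call, i.e. one turn.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ModelResponse = { text: string; toolRequests: string[] };

let calls = 0;
function callModel(history: Message[]): ModelResponse {
  // Stub: ask for a tool on the first three calls, then answer.
  calls++;
  if (calls <= 3) return { text: "", toolRequests: [`tool_${calls}`] };
  return { text: "Final answer based on gathered context", toolRequests: [] };
}

function executeTool(name: string): string {
  return `result of ${name}`; // stand-in for grep/glob/read results
}

function runLoop(prompt: string): { answer: string; turns: number } {
  const history: Message[] = [{ role: "user", content: prompt }];
  let turns = 0;
  while (true) {
    turns++;
    const res = callModel(history); // the model sees the full history every call
    if (res.toolRequests.length === 0) {
      return { answer: res.text, turns }; // no toolRequests -> loop ends
    }
    for (const name of res.toolRequests) {
      // Tool results are appended so the next call sees them.
      history.push({ role: "tool", content: executeTool(name) });
    }
  }
}

const { answer, turns } = runLoop("how does X work in this codebase?");
console.log(turns, answer);
```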

## Event Flow for a Multi-Turn Interaction

```mermaid
flowchart TD
    send["session.send({ prompt: 'Fix the bug in auth.ts' })"]

    subgraph Turn1 ["Turn 1"]
        t1s["assistant.turn_start"]
        t1m["assistant.message (toolRequests)"]
        t1ts["tool.execution_start (read_file)"]
        t1tc["tool.execution_complete"]
        t1e["assistant.turn_end"]
        t1s --> t1m --> t1ts --> t1tc --> t1e
    end

    subgraph Turn2 ["Turn 2 — auto-triggered by CLI"]
        t2s["assistant.turn_start"]
        t2m["assistant.message (toolRequests)"]
        t2ts["tool.execution_start (edit_file)"]
        t2tc["tool.execution_complete"]
        t2e["assistant.turn_end"]
        t2s --> t2m --> t2ts --> t2tc --> t2e
    end

    subgraph Turn3 ["Turn 3"]
        t3s["assistant.turn_start"]
        t3m["assistant.message (no toolRequests)\n'Done, here's what I changed'"]
        t3e["assistant.turn_end"]
        t3s --> t3m --> t3e
    end

    idle(["session.idle — ready for next message"])

    send --> Turn1 --> Turn2 --> Turn3 --> idle
```

## Who Triggers Each Turn?

| Actor | Responsibility |
|-------|----------------|
| **Your app** | Sends the initial prompt via `session.send()` |
| **Copilot CLI** | Runs the tool-use loop — executes tools and feeds results back to the LLM for the next turn |
| **LLM** | Decides whether to request tools (continue looping) or produce a final response (stop) |
| **SDK** | Passes events through; does not control the loop |

The CLI is purely mechanical: "model asked for tools → execute → call model again." The **model** is the decision-maker for when to stop.

## `session.idle` vs `session.task_complete`

These are two different completion signals with very different guarantees:

### `session.idle`

- **Always emitted** when the tool-use loop ends
- **Ephemeral** — not persisted to disk, not replayed on session resume
- Means: "the agent has stopped processing and is ready for the next message"
- **Use this** as your reliable "done" signal

The SDK's `sendAndWait()` method waits for this event:

```typescript
// Blocks until session.idle fires
const response = await session.sendAndWait({ prompt: "Fix the bug" });
```

### `session.task_complete`

- **Optionally emitted** — requires the model to explicitly signal it
- **Persisted** — saved to the session event log on disk
- Means: "the agent considers the overall task fulfilled"
- Carries an optional `summary` field

```typescript
session.on("session.task_complete", (event) => {
  console.log("Task done:", event.data.summary);
});
```

### Autopilot mode: the CLI nudges for `task_complete`

In **autopilot mode** (headless/autonomous operation), the CLI actively tracks whether the model has called `task_complete`. If the tool-use loop ends without it, the CLI injects a synthetic user message nudging the model:

> *"You have not yet marked the task as complete using the task_complete tool. If you were planning, stop planning and start implementing. You aren't done until you have fully completed the task."*

This effectively restarts the tool-use loop — the model sees the nudge as a new user message and continues working. The nudge also instructs the model **not** to call `task_complete` prematurely:

- Don't call it if you have open questions — make decisions and keep working
- Don't call it if you hit an error — try to resolve it
- Don't call it if there are remaining steps — complete them first

This creates a **two-level completion mechanism** in autopilot:

1. The model calls `task_complete` with a summary → CLI emits `session.task_complete` → done
2. The model stops without calling it → CLI nudges → model continues or calls `task_complete`
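
The two-level mechanism can be sketched as a simulation (not the CLI's actual code): the stub model stops without signalling completion on its first run, receives the nudge as a synthetic user message, and finishes on the second run. A real implementation would presumably also cap the number of nudges.

```typescript
// Simulation of the autopilot completion check.
const NUDGE =
  "You have not yet marked the task as complete using the task_complete tool. " +
  "If you were planning, stop planning and start implementing.";

type Outcome = { calledTaskComplete: boolean; summary?: string };

let runs = 0;
function runToolUseLoop(messages: string[]): Outcome {
  runs++;
  // First run: the stub model stops without signalling completion.
  if (runs === 1) return { calledTaskComplete: false };
  // After the nudge it finishes and calls task_complete with a summary.
  return { calledTaskComplete: true, summary: "Fixed the bug in auth.ts" };
}

function autopilot(prompt: string): { summary?: string; nudges: number } {
  const messages = [prompt];
  let nudges = 0;
  while (true) {
    const outcome = runToolUseLoop(messages);
    if (outcome.calledTaskComplete) {
      // Level 1: the model signalled completion -> emit session.task_complete.
      return { summary: outcome.summary, nudges };
    }
    // Level 2: loop ended without task_complete -> inject the nudge and rerun.
    nudges++;
    messages.push(NUDGE);
  }
}

const result = autopilot("Fix the bug in auth.ts");
console.log(result.nudges, result.summary);
```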

### Why `task_complete` might not appear

In **interactive mode** (normal chat), the CLI does not nudge for `task_complete`. The model may skip it entirely. Common reasons:

- **Conversational Q&A**: The model answers a question and simply stops — there's no discrete "task" to complete
- **Model discretion**: The model produces a final text response without calling the task-complete signal
- **Interrupted sessions**: The session ends before the model reaches a completion point

The CLI emits `session.idle` regardless, because it's a mechanical signal (the loop ended), not a semantic one (the model thinks it's done).

### Which should you use?

| Use case | Signal |
|----------|--------|
| "Wait for the agent to finish processing" | `session.idle` ✅ |
| "Know when a coding task is done" | `session.task_complete` (best-effort) |
| "Timeout/error handling" | `session.idle` + `session.error` ✅ |
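
For the timeout/error row, one way to combine the signals is a helper that races `session.idle`, `session.error`, and a timer. The session below is a plain `EventEmitter` stand-in; only the event names come from this document, and the helper itself is illustrative:

```typescript
import { EventEmitter } from "node:events";

// Resolve on session.idle, reject on session.error or after timeoutMs.
// In real code, `session` would be the SDK session object.
function waitForIdle(session: EventEmitter, timeoutMs: number): Promise<void> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timed out")), timeoutMs);
    session.once("session.idle", () => {
      clearTimeout(timer);
      resolve();
    });
    session.once("session.error", (err) => {
      clearTimeout(timer);
      reject(new Error(String(err)));
    });
  });
}

// Demo with a fake session that goes idle after 10 ms.
const fake = new EventEmitter();
setTimeout(() => fake.emit("session.idle"), 10);
waitForIdle(fake, 1000).then(() => console.log("agent finished"));
```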

## Counting LLM Calls

The number of `assistant.turn_start` / `assistant.turn_end` pairs in the event log equals the total number of LLM API calls made. There are no hidden calls for planning, evaluation, or completion checking.

To inspect the turn count for a session:

```bash
# Count turns in a session's event log
grep -c "assistant.turn_start" ~/.copilot/session-state/<sessionId>/events.jsonl
```
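
The same count can be done programmatically. This sketch assumes each line of `events.jsonl` is a JSON object with a `type` field; that is an assumption about the log format, not a documented schema:

```typescript
// Count turn pairs in an events.jsonl string. Assumes one JSON object per
// line with a "type" field naming the event.
function countTurns(jsonl: string): { starts: number; ends: number } {
  let starts = 0;
  let ends = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    const event = JSON.parse(line);
    if (event.type === "assistant.turn_start") starts++;
    if (event.type === "assistant.turn_end") ends++;
  }
  return { starts, ends };
}

// Example log: two complete turns.
const sample = [
  '{"type":"assistant.turn_start"}',
  '{"type":"assistant.message"}',
  '{"type":"assistant.turn_end"}',
  '{"type":"assistant.turn_start"}',
  '{"type":"assistant.turn_end"}',
].join("\n");

console.log(countTurns(sample)); // starts and ends should match
```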

## Further Reading

- [Streaming Events Reference](./streaming-events.md) — Full field-level reference for every event type
- [Session Persistence](./session-persistence.md) — How sessions are saved and resumed
- [Hooks](./hooks.md) — Intercepting events in the loop (permissions, tools)