Commit 0c40394

docs(ai-agents): foreground the debugger over the REPL
Make robot-debug the primary tool for working with a real test and the REPL the narrower fallback for exploring when no test exists yet. Add debugger examples, a debugger-first core-habit entry, a troubleshooting item for chasing a failing test instead of debugging it, and note that agent output capture and the plain backend apply to robot-debug too.
1 parent f5a3987 commit 0c40394

1 file changed

Lines changed: 12 additions & 6 deletions

docs/03_reference/ai-agents.md

@@ -2,7 +2,7 @@

AI coding agents are good at writing code and bad at guessing how *your* Robot Framework project is actually wired. Which files become suites, which tags a test really ends up with, where a keyword comes from, what a finished run contained — none of that is reliably visible by reading `.robot` files. It is decided at runtime by `robot.toml`, profiles, variables, the installed library versions, and pre-run modifiers.

5-
**RobotCode** closes that gap by teaching the agent to work through the project's own [`robotcode`](cli.md) CLI instead of guessing. The agent discovers tests, runs suites, inspects results, looks up keywords, and explores live in the REPL — all through the same resolved view of the project that the rest of **RobotCode** uses. The result is an agent that behaves less like a generic code model and more like a Robot Framework engineer who knows your setup.
5+
**RobotCode** closes that gap by teaching the agent to work through the project's own [`robotcode`](cli.md) CLI instead of guessing. The agent discovers tests, runs suites, inspects results, looks up keywords, debugs failing tests at a real breakpoint, and explores live in the REPL — all through the same resolved view of the project that the rest of **RobotCode** uses. The result is an agent that behaves less like a generic code model and more like a Robot Framework engineer who knows your setup.

There are two pieces, and they work independently:

@@ -22,10 +22,12 @@ Once the plugin is active, everyday Robot Framework requests are handled through
- *"run the smoke suite with the `ci` profile"* — runs it through the selected profile and reports pass/fail counts
- *"rerun just the tests that failed last time"*
- *"why did `Login Works` fail in last night's run?"* — inspects the existing results, no re-run
25+
- *"this test keeps failing — step through it and tell me what `${response}` is when it breaks"* — pauses the real run in the [debugger](robot-debug.md) and reads the live stack and variables, instead of re-running blindly or guessing
26+
- *"break at `login.robot:42` and show me the variables there"*
- *"what tests and tags exist?"* — resolves the real set with `discover` (paths, profiles, variables, pre-run modifiers), not a file scan
- *"what arguments does our `Create Order` keyword take?"* — looks it up against the installed libraries and local resources
- *"is there already a keyword for waiting until the spinner is gone?"*
28-
- *"try the new login flow against the real app, with a visible browser"* — drives it live in the REPL
30+
- *"try the new login flow against the real app, with a visible browser"* — drives it live in the REPL (no test written yet)
- *"set up a `prod` profile with `BASE_URL=https://prod.example.com`"*
- *"lint only the files I changed today"*

@@ -92,7 +94,8 @@ The plugin currently contains a single *skill* — a set of instructions the age

- **Inventory via [`discover`](discovering-tests.md), never by grepping files.** Which tests, tasks, suites, and tags exist is resolved at runtime — Robot's parsing rules, `robot.toml` paths, profiles, variables, and pre-run modifiers that add, remove, rename, or retag tests. A static file scan gets it wrong; `discover` runs the real resolution with the installed Robot Framework.
- **Keyword and library lookup via [`libdoc`](cli.md#libdoc), before generic knowledge.** `libdoc` reflects the *installed* library versions, the project's import arguments, the Python path, and local `.resource` files — things external documentation can't see.
95-
- **Live exploration via the [REPL](repl.md).** Uncertain locators, keyword sequencing, or "does this work against the running app?" get verified interactively instead of guessed, optionally with a visible browser to watch.
97+
- **Debugging a failing test via [`robot-debug`](robot-debug.md) — the agent's primary tool for any real test.** Whenever an existing test fails, won't run, or needs stepping through, the agent runs it under the command-line debugger: it pauses the real suite at a breakpoint (a `file:line`, a keyword, or the first uncaught failure), then reads the live call stack and per-frame variables and runs keywords in the paused context — capturing the actual state at the point of failure instead of re-running blindly or reasoning from the source. This is the default response to "why does this fail / step through it / what is `${x}` here?", and it is deliberately kept apart from the REPL below: the debugger acts on a test that *exists and runs*. Reaching for the REPL to fix a real test — instead of the debugger — is the single most common way the skill misfires.
98+
- **Live exploration via the [REPL](repl.md) — only when there is *no* test yet.** A narrower tool: uncertain locators or keyword sequencing get tried interactively (optionally against the running app with a visible browser) rather than guessed. The moment a real test is in play, it is the debugger's job, not the REPL's.
- **Result inspection via [`results`](analyzing-results.md), not raw `output.xml`.** Finished runs are queried for bounded summaries, listings, traces, and diffs — including CI artifacts and a colleague's run — rather than loading a potentially huge XML file into the chat.
- **Static checks via [`analyze code`](analyzing-code.md)** before a run, and **runs via [`robot`](cli.md#robot)** honoring the active profile.

@@ -116,7 +119,7 @@ Keep it short and factual; this is standing context the agent reads on every req
Independently of the chat plugins, the `robotcode` CLI detects when it is running inside an AI-agent session and adjusts its presentation defaults so captured output stays clean:

- ANSI **colors** and the **pager** are disabled, so escape sequences and paging controls don't leak into the agent's captured stdout.
119-
- [`robotcode repl`](repl.md) falls back to the **plain input backend**, so completion popups and prompt redraws don't interfere with stdin/stdout.
122+
- The [`robotcode robot-debug`](robot-debug.md) prompt and the [`robotcode repl`](repl.md) shell fall back to the **plain input backend**, so completion popups and prompt redraws don't interfere with stdin/stdout — captured debug output (`.where`, `.vars`, `.print`) stays clean.

Detection is based on environment markers set by popular tools — Claude Code, Cursor, GitHub Copilot (CLI and VS Code agent flow), Codex, OpenCode, Gemini CLI, and others — plus the generic `AI_AGENT` and `AGENT` conventions. A marker counts as active when it is present with any value other than empty or `0`.

@@ -128,7 +131,7 @@ You rarely need to touch this, but every default can be overridden:
| `ROBOTCODE_NO_AI_AGENT=1` | Force detection **off**. Wins over tool markers, loses to `ROBOTCODE_FORCE_AI_AGENT`. |
| `--color` / `--no-color`, `NO_COLOR`, `FORCE_COLOR` | Decide coloring explicitly, regardless of detection. |
| `--pager` / `--no-pager` | Decide paging explicitly. |
131-
| `--plain` / `--backend`, `ROBOTCODE_REPL_PLAIN`, `ROBOTCODE_REPL_BACKEND` | Decide the REPL backend explicitly. |
134+
| `--plain` / `--backend`, `ROBOTCODE_REPL_PLAIN`, `ROBOTCODE_REPL_BACKEND` | Decide the prompt backend explicitly — applies to both `robot-debug` and the REPL. |

Explicit flags and environment variables always win over auto-detection, so you can opt back into colored, paged, or full-featured output inside an agent session when you want it.
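
The detection and override rules described in the last two hunks can be sketched roughly as follows. This is an illustration, not RobotCode's actual implementation: the function names are hypothetical, only the generic `AI_AGENT` and `AGENT` markers are checked (the tool-specific marker names are not listed in the text and are left out), and it assumes the two `ROBOTCODE_*` overrides follow the same "present and not empty or `0`" rule as the markers.

```python
import os
from typing import Mapping

# Only the generic markers named in the docs; tool-specific markers are omitted
# because their names are not given in the text.
GENERIC_MARKERS = ("AI_AGENT", "AGENT")


def marker_active(env: Mapping[str, str], name: str) -> bool:
    """A marker is active when it is present with any value other than empty or '0'."""
    value = env.get(name)
    return value is not None and value not in ("", "0")


def agent_session_detected(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper mirroring the documented precedence of the overrides."""
    # ROBOTCODE_FORCE_AI_AGENT beats everything else (assumed truthiness rule).
    if marker_active(env, "ROBOTCODE_FORCE_AI_AGENT"):
        return True
    # ROBOTCODE_NO_AI_AGENT wins over tool markers but loses to the force switch.
    if marker_active(env, "ROBOTCODE_NO_AI_AGENT"):
        return False
    # Otherwise any active marker switches agent mode on.
    return any(marker_active(env, name) for name in GENERIC_MARKERS)
```

Passing the environment in as a mapping keeps the rule easy to check by hand, for example `agent_session_detected({"AI_AGENT": "1", "ROBOTCODE_NO_AI_AGENT": "1"})` evaluates to `False`. Explicit flags such as `--no-color` or `--plain` would be applied on top of this result and are not modeled here.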

@@ -138,7 +141,9 @@ Explicit flags and environment variables always win over auto-detection, so you

**The agent's answers don't match your project** — libraries or keywords it should see are reported as missing, or argument lists look wrong. It is most likely driving a `robotcode` from the wrong environment rather than the project's (see [Prerequisites](#prerequisites)). Have the agent run `robotcode discover info` to show which interpreter and versions it is using, and compare that against the project's environment.

141-
**The agent writes a `.robot` file when you only wanted it to *do* something** — try a keyword, check a locator, watch a flow against the running app. This is the skill's most common misfire. Tell it not to write a test and to use the REPL instead ("don't write a test, just run it live"); it can save the session as a test afterwards if you ask. See [Interactive Robot Framework REPL](repl.md).
144+
**The agent re-runs a failing test over and over (or pastes it into the REPL) instead of debugging it.** This is the skill's single most common misfire. When a *real* test fails, the right tool is the [debugger](robot-debug.md), not blind re-runs or the REPL — it pauses the actual run at the failure and exposes the live stack and variables. Tell it to debug the test ("step through it with `robot-debug`", "break where it fails and show me the variables"). The REPL is for exploring when there's no test yet; the debugger is for a test that already runs.
145+
146+
**The agent writes a `.robot` file when you only wanted it to *do* something** — try a keyword or check a locator with no test yet. Tell it not to write a test and to use the REPL instead ("don't write a test, just run it live"); it can save the session as a test afterwards if you ask. See [Interactive Robot Framework REPL](repl.md). (If a test already exists, you want the debugger above, not the REPL.)

**A marketplace install behaves differently from the bundled VS Code plugin.** The two copies are versioned independently: the bundled one ships with the extension, the marketplace one updates through your agent's `plugin marketplace update`. If behavior diverges, update the marketplace copy — and make sure you aren't running both at once (see [Avoiding duplicates](#avoiding-duplicates)).

@@ -147,5 +152,6 @@ Explicit flags and environment variables always win over auto-detection, so you
- [Command Line Interface](cli.md) — every `robotcode` command the agent drives.
- [Discovering Tests, Tasks and Suites](discovering-tests.md) — how project inventory is resolved.
- [Analyzing Run Results](analyzing-results.md) — inspecting finished runs.
155+
- [Command-line debugging with `robotcode robot-debug`](robot-debug.md) — pausing a real run at a breakpoint to inspect the live stack and variables.
- [Interactive Robot Framework REPL](repl.md) — the live keyword shell.
- [`robotframework-agent-plugins`](https://github.com/robotcodedev/robotframework-agent-plugins) — the marketplace and plugin source.
