Adds the four biggest things v0.1.0 was missing:
1. `--deep[=N]` critic-driven iterative research. After the first synthesis
an LLM critic reviews the draft, names the gaps, and proposes follow-up
queries; the loop re-runs up to N more rounds. Bare `--deep` = 2 rounds.
Critic can declare done early.
2. `--concurrency=N` parallel page fetches (default 4) via a new worker pool
in src/concurrency.ts. Cuts fetch-phase wall time by up to ~N× on a mixed
batch of fast and slow pages.
3. Per-URL on-disk page cache at ~/.deepdive/cache/ (1h default TTL, atomic
.tmp + rename writes). Cache hits skip the browser entirely — an
all-cached re-run never launches Chromium. Disable with --no-cache.
4. `--json` structured output mode: {question, plan, rounds, sources, answer,
usage}. Pipeable into jq and other tools.
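
The worker-pool idea in item 2 can be sketched roughly as follows. This is an illustration only, not the contents of `src/concurrency.ts`: the name `runPool` and its signature are hypothetical, and the abort handling that the tests mention is left out.

```typescript
// Hypothetical worker-pool helper; the project's real implementation may differ.
// Runs `worker` over `items` with at most `limit` calls in flight at once and
// returns results in input order.
async function runPool<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor: index of the next unclaimed item

  // Each lane claims the next index, processes it, and repeats until none remain.
  async function lane(): Promise<void> {
    while (true) {
      const i = next++;
      if (i >= items.length) return;
      results[i] = await worker(items[i], i);
    }
  }

  const lanes = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results;
}
```

Because each lane pulls a new item as soon as it finishes the previous one, a slow page occupies one lane while the others keep draining the queue, which is where the gain on a mixed batch comes from.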
Bonus surface for library users: AgentConfig.browserFactory for injecting a
mock BrowserLike, AgentResult.rounds with per-round traces + the critic's
verdict, AgentResult.usage.cacheHits, round.start / critique.start /
critique.done events, cached: boolean on fetch events.
Tests: 42 new assertions. Cache round-trip + TTL + atomicity (6), concurrency
cap + ordering + abort (8), parseCritique (8), new CLI flags (8), new config
options (7), full agent integration test with a mock LLM HTTP server + mock
search + mock browser covering single-pass, deep loop, early critic
termination, cache reuse, maxSources cap, and broken-fetch resilience (6).
96 total, all pass.
Co-authored-by: askalf <263217947+askalf@users.noreply.github.com>
+ - `--deep[=N]` iterative research loop. After the first synthesis a "critic" LLM call reviews the draft, names the gaps, and proposes follow-up queries; the loop re-runs up to N more rounds (bare `--deep` defaults to 2). Critic can declare the answer complete to terminate early. `critique()` / `parseCritique()` are exported from `src/plan.ts` for programmatic use.
+ - `--concurrency=N` parallel page fetches (default: 4). The agent now uses a worker-pool pattern (`src/concurrency.ts`) instead of a serial loop, cutting fetch phase wall-time roughly N× on a mix of fast and slow pages.
+ - Per-URL on-disk cache at `~/.deepdive/cache/` (configurable via `DEEPDIVE_CACHE_DIR`). Atomic `.tmp` + `rename` writes. TTL default 1 hour (`--cache-ttl-ms=<ms>`). Disable with `--no-cache` or `DEEPDIVE_NO_CACHE=1`. Cache hits skip the browser entirely: an all-cached run never launches Chromium.
+ - `--json` output mode. Prints `{question, plan, rounds, sources, answer, usage}` instead of markdown, pipeable into `jq` and other tools. `DEEPDIVE_JSON=1` env var also works.
+ - `AgentConfig.browserFactory`: optional factory for injecting a mock `BrowserLike` in tests and custom deployments. The default factory returns a real `BrowserSession`, so existing callers see no behavior change.
+ - `AgentResult.rounds`: per-round trace with queries, candidates found, fetches, kept-count, and the critic's verdict.
+ - `AgentResult.usage.cacheHits`: how many fetches hit the cache this run.
+ - New agent events: `round.start`, `critique.start`, `critique.done`. Existing `fetch.start` / `fetch.done` gained a `cached: boolean` field.
+ - Tests: cache round-trip + TTL + atomicity, concurrency cap + ordering + abort, parseCritique, new CLI flag coverage, full agent loop integration test with a mock LLM HTTP server, mock search, and mock browser. 42 new assertions (96 total, up from ~53).
+
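
The cache behaviour listed above (atomic `.tmp` + `rename` writes, TTL check on read) could look something like the sketch below. It is an assumption-laden illustration: the file layout, the `CacheEntry` shape, and the helper names `entryPath`, `writeEntry`, and `readEntry` are hypothetical and are not taken from the project's cache module.

```typescript
import * as crypto from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical on-disk layout: one JSON file per URL, named by a hash of the URL.
interface CacheEntry {
  url: string;
  fetchedAt: number; // epoch milliseconds
  body: string;
}

function entryPath(dir: string, url: string): string {
  const key = crypto.createHash("sha256").update(url).digest("hex");
  return path.join(dir, `${key}.json`);
}

// Write to a temp file first, then rename over the target. A reader sees
// either the old entry or the complete new one, never a half-written file.
function writeEntry(dir: string, url: string, body: string, now: number = Date.now()): void {
  fs.mkdirSync(dir, { recursive: true });
  const target = entryPath(dir, url);
  const tmp = `${target}.${process.pid}.tmp`;
  const entry: CacheEntry = { url, fetchedAt: now, body };
  fs.writeFileSync(tmp, JSON.stringify(entry));
  fs.renameSync(tmp, target);
}

// Returns the cached body, or null on a miss, a corrupt file, or an expired entry.
function readEntry(dir: string, url: string, ttlMs: number, now: number = Date.now()): string | null {
  let entry: CacheEntry;
  try {
    entry = JSON.parse(fs.readFileSync(entryPath(dir, url), "utf8")) as CacheEntry;
  } catch {
    return null;
  }
  return now - entry.fetchedAt > ttlMs ? null : entry.body;
}

// Usage: a throwaway directory stands in for ~/.deepdive/cache/.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "deepdive-cache-demo-"));
writeEntry(demoDir, "https://example.com/", "<html>hello</html>");
console.log(readEntry(demoDir, "https://example.com/", 60 * 60 * 1000));
```

Writing the temp file into the same directory as the target keeps the rename on a single filesystem, which is what makes the swap atomic on POSIX systems.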
+ ### Changed
+ - README: new flag table covering `--deep`, `--concurrency`, `--no-cache`, `--json`; examples for deep mode and JSON piping; library-mode snippet updated to wire cache + deepRounds. Also: demoted promotional "Claude Max / Pro" phrasing to "Claude Max" to match dario's README after the 2026-04-21 Anthropic Pro/CC incident.
+ - `AgentConfig` is additive: new required fields `deepRounds` and `concurrency` (non-breaking: `resolveConfig` sets sensible defaults so library callers that go through it pick them up for free).
+
  ## [0.1.0] - 2026-04-21

  Initial scaffolding. One-shot research agent: plan → search → fetch → extract → synthesize → cited markdown.
CLAUDE.md: 4 additions & 3 deletions
@@ -7,16 +7,17 @@ This file is the project-local instructions for Claude Code and similar agents w
  A local research agent. CLI entry point `deepdive`. Given a question, it:
  1. Asks an LLM to decompose the question into 3–5 sub-queries.
  2. Runs each sub-query through a pluggable search adapter.
- 3. Fetches each result page through a Playwright-driven headless Chromium.
+ 3. Fetches each result page through a Playwright-driven headless Chromium (parallelized, optionally cached to `~/.deepdive/cache/`).
  4. Extracts main content, caps per-source word count.
  5. Asks an LLM to synthesize a cited markdown answer.
+ 6. (Optional, `--deep`) A critic LLM reviews the draft, names the gaps, and the loop re-runs until the critic says done or N rounds elapse.

  Every LLM call goes to an Anthropic-compat endpoint. Default target is [dario](https://github.com/askalf/dario) at `http://localhost:3456`. Any Anthropic-compat URL works: Anthropic directly, a self-hosted LiteLLM, another proxy.

  ## Architecture principles

  - **One runtime dependency.** `playwright` is required because we actually need to render JS-heavy pages. Nothing else. No `axios`, `node-fetch`, `readability`, `jsdom`, `chalk`, `yargs`, `zod`. Node built-ins and hand-rolled code.
- - **Pure decision functions.** Anything with logic goes in a module that can be tested without a browser or an LLM: `parsePlan`, `parseArgs`, `resolveConfig`, `parsePositiveInt`, `extractContent`, `dedupeByUrl`, `parseDuckDuckGoHTML`, `buildSourceTable`, `renderAnswerMarkdown`, `buildSourcePacket`.
+ - **Pure decision functions.** Anything with logic goes in a module that can be tested without a browser or an LLM: `parsePlan`, `parseCritique`, `parseArgs`, `resolveConfig`, `parsePositiveInt`, `parseNonNegativeInt`, `extractContent`, `dedupeByUrl`, `parseDuckDuckGoHTML`, `buildSourceTable`, `renderAnswerMarkdown`, `buildSourcePacket`, `cacheKey`, `runConcurrent`.
  - **I/O at the edges.** `cli.ts`, `browser.ts`, `llm.ts`, and the individual search adapters touch the network or disk. Everything else is synchronous over strings and objects.
  - **Events, not prints.** The agent emits structured events via `onEvent`; the CLI renders them. Do not `console.log` from inside `src/agent.ts` or any library module.
  - **Hand-rolled regex parsers are fine** for DDG HTML. If the parser breaks, fix the parser. Do not reach for `cheerio`.
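
To illustrate the "pure decision functions" principle, here is a sketch of what a critique parser in that style might look like: a string goes in, a plain object comes out, and no browser or LLM is needed to test it. The reply format (a JSON object with `done`, `gaps`, and `queries`) and the name `parseCritiqueSketch` are assumptions for this example; the diff does not show what the real `parseCritique` accepts.

```typescript
interface Critique {
  done: boolean;
  gaps: string[];
  queries: string[];
}

// Hypothetical reply format: a JSON object somewhere in the critic's text, e.g.
//   {"done": false, "gaps": ["..."], "queries": ["..."]}
function parseCritiqueSketch(reply: string): Critique {
  // If the reply cannot be understood, stop the loop rather than spend more rounds.
  const fallback: Critique = { done: true, gaps: [], queries: [] };

  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return fallback;

  let parsed: unknown;
  try {
    parsed = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return fallback;
  }
  if (typeof parsed !== "object" || parsed === null) return fallback;
  const obj = parsed as Record<string, unknown>;

  // Keep only non-empty strings; drop anything else the model slipped in.
  const strings = (v: unknown): string[] =>
    Array.isArray(v)
      ? v.filter((x): x is string => typeof x === "string" && x.trim() !== "")
      : [];

  const queries = strings(obj.queries);
  // With no follow-up queries there is nothing to search for, so treat that as done.
  return { done: obj.done === true || queries.length === 0, gaps: strings(obj.gaps), queries };
}
```

Treating malformed output as "done" is one possible design choice: it keeps cost bounded when the critic misbehaves, at the price of sometimes stopping early.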
@@ -40,7 +41,7 @@ Every LLM call goes to an Anthropic-compat endpoint. Default target is [dario](h
  ## What not to do

  - Don't add a web UI. The product is a CLI; a UI is a separate product.
- - Don't add a research loop (read answer, decide if more searches needed) without putting it behind a `--deep` flag. v1 is one-shot on purpose.
+ - Don't promote the research loop to the default path: `--deep` is opt-in for a reason (cost / latency predictability).
  - Don't reach for LangChain / LlamaIndex / anything that turns 300 lines into 3,000 and adds 40 deps.
  - Don't bundle SearXNG. Ship it as an adapter that points at a URL the user provides.
README.md: 39 additions & 7 deletions
@@ -1,6 +1,6 @@
  <p align="center">
  <h1 align="center">deepdive</h1>
- <p align="center"><strong>A local research agent. One command, cited answer.</strong><br>Decomposes your question into sub-queries, runs web searches, fetches pages through a real headless browser, and hands everything to an LLM that writes a cited markdown report. Every LLM call goes through your own router: the default target is <a href="https://github.com/askalf/dario">dario</a> at <code>localhost:3456</code>, so synthesis runs on your Claude Max / Pro subscription, your own OpenAI key, or any local model. Any Anthropic-compat endpoint works.</p>
+ <p align="center"><strong>A local research agent. One command, cited answer.</strong><br>Decomposes your question into sub-queries, runs web searches, fetches pages through a real headless browser, and hands everything to an LLM that writes a cited markdown report. Every LLM call goes through your own router: the default target is <a href="https://github.com/askalf/dario">dario</a> at <code>localhost:3456</code>, so synthesis runs on your Claude Max subscription, your own OpenAI key, or any local model. Any Anthropic-compat endpoint works.</p>
  </p>

  <p align="center"><em>Zero hosted dependencies. MIT. Independent, unofficial, third-party; see <a href="DISCLAIMER.md">DISCLAIMER.md</a>.</em></p>
@@ -12,7 +12,7 @@
  ```bash
  # 1. Have dario running (or any Anthropic-compat endpoint at a local URL).
  #    See: https://github.com/askalf/dario
- dario proxy   # http://localhost:3456, routes to Claude Max / OpenAI / etc.
+ dario proxy   # http://localhost:3456, routes to Claude Max, OpenAI, etc.

  1. **Plan.** LLM decomposes your question into 3–5 searchable sub-queries.
  2. **Search.** DuckDuckGo HTML by default (no API key). Pluggable: `--search=searxng|brave|tavily` with your own endpoint or key.
- 3. **Fetch.** Playwright-driven Chromium renders each result page (JS-rendered SPAs included).
+ 3. **Fetch.** Playwright-driven Chromium renders each result page (JS-rendered SPAs included). Parallelized via `--concurrency`, cached to `~/.deepdive/cache/` (1h TTL by default) so a re-run within the TTL skips the browser fetch.
  4. **Extract.** Boilerplate stripped, main content capped to a word budget.
  5. **Synthesize.** LLM writes the answer with inline `[N]` citations referencing the source list.
+ 6. **Critique (optional, `--deep`).** The LLM reviews its own draft, names the gaps, and proposes follow-up queries; the loop re-runs until the critic says done or `--deep=N` rounds elapse.
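
The control flow of steps 1 to 6 with `--deep` enabled can be sketched as below. This is a simplified, synchronous illustration with the I/O injected as plain functions; the `Steps` interface and `deepResearch` are hypothetical names, and the real agent is presumably asynchronous and tracks far more per round.

```typescript
interface Verdict {
  done: boolean;     // critic considers the draft complete
  queries: string[]; // follow-up searches proposed by the critic
}

// Hypothetical seams for search+fetch+extract, synthesis, and the critic call.
interface Steps {
  research(queries: string[]): string[];
  synthesize(sources: string[]): string;
  critique(draft: string): Verdict;
}

function deepResearch(
  initialQueries: string[],
  deepRounds: number, // 0 means one-shot, matching the default path
  steps: Steps,
): { draft: string; rounds: number } {
  let sources = steps.research(initialQueries);
  let draft = steps.synthesize(sources);
  let rounds = 1;

  for (let i = 0; i < deepRounds; i++) {
    const verdict = steps.critique(draft);
    // Stop early when the critic is satisfied or has nothing left to search for.
    if (verdict.done || verdict.queries.length === 0) break;
    sources = sources.concat(steps.research(verdict.queries));
    draft = steps.synthesize(sources);
    rounds++;
  }
  return { draft, rounds };
}
```

With `deepRounds` set to 0 the loop body never runs, which mirrors the point that the one-shot path stays the default and the critic only costs tokens when asked for.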

  ---

  ## Why this exists

  Every hosted research tool (Perplexity, OpenAI Deep Research, Gemini Deep Research) sends your queries to someone else's server, charges per query, and gives you no say in which model synthesizes the answer or which sources get read. deepdive is the self-hosted alternative: your machine, your LLM subscription, your model choice, your search backend.

- Pair it with [dario](https://github.com/askalf/dario) and every research query routes through your Claude Max / Pro subscription instead of per-token API pricing; a single deep query can be 50k–200k tokens, which is exactly the workload subscription billing was built for.
+ Pair it with [dario](https://github.com/askalf/dario) and every research query routes through your Claude Max subscription instead of per-token API pricing; a single deep query can be 50k–200k tokens, which is exactly the workload subscription billing was built for.

  ---

@@ -84,20 +85,47 @@ Run `deepdive --help` for the full list. The ones you'll actually use:
  | `--results-per-query=<n>` | `5` | Candidates pulled per sub-query |
  | `--max-words-per-source=<n>` | `2000` | Per-source content cap before synthesis |
  | `--timeout-ms=<ms>` | `30000` | Per-fetch timeout |
- | `--out=<path>` | — | Also write markdown to file |
+ | `--deep[=<n>]` | off (bare `--deep` = 2) | Critic-driven iterative research: after the initial answer, LLM names gaps and proposes follow-up queries for up to N more rounds |
+ | `--out=<path>` | — | Also write output (markdown or JSON) to file |
  | `--verbose`, `-v` | — | Stream progress events to stderr |

- All flags mirror to `DEEPDIVE_*` env vars (e.g. `DEEPDIVE_MODEL`, `DEEPDIVE_MAX_SOURCES`). CLI flags win over env vars.
+ All flags mirror to `DEEPDIVE_*` env vars (e.g. `DEEPDIVE_MODEL`, `DEEPDIVE_MAX_SOURCES`, `DEEPDIVE_DEEP_ROUNDS`, `DEEPDIVE_CONCURRENCY`). CLI flags win over env vars.
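
The precedence rule stated here (CLI flag, then `DEEPDIVE_*` env var, then built-in default) can be sketched as a small pure function. The helper name `pickPositiveInt` is hypothetical, and throwing on a malformed value is an assumption for this example; the diff does not show how the real `resolveConfig` reports bad input.

```typescript
// Hypothetical helper: resolve one positive-integer option.
// Precedence: CLI value, then env value, then the built-in default.
function pickPositiveInt(
  name: string,
  cli: string | undefined,
  env: string | undefined,
  fallback: number,
): number {
  const raw = cli ?? env;
  if (raw === undefined) return fallback;
  if (!/^[1-9]\d*$/.test(raw)) {
    throw new Error(`${name}: expected a positive integer, got "${raw}"`);
  }
  return Number(raw);
}

// Usage with a stand-in environment object (a real CLI would read process.env).
const exampleEnv: Record<string, string | undefined> = { DEEPDIVE_CONCURRENCY: "2" };

// No CLI flag given, so the env var applies.
const fromEnv = pickPositiveInt("concurrency", undefined, exampleEnv.DEEPDIVE_CONCURRENCY, 4);
// The CLI flag is present, so it overrides the env var.
const fromCli = pickPositiveInt("concurrency", "8", exampleEnv.DEEPDIVE_CONCURRENCY, 4);
console.log(fromEnv, fromCli);
```

The default of 4 for concurrency comes from the changelog entry; the other values are made up for the demonstration.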
+
+ ### Example: deep iterative research
+
+ ```bash
+ deepdive "compare bun's TLS ClientHello to node's" --deep=3 --verbose --out=tls.md
+ ```
+
+ With `--deep=3`, after the first synthesis the critic can run up to three more rounds of "look at the draft, find what's missing, search for it, re-synthesize." Each round can add up to `--max-sources` new pages, so plan the cap. The loop stops early as soon as the critic says the draft is complete.
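
To make "plan the cap" concrete, the sketch below parses the `--deep[=N]` flag as described in the flag table and computes a worst-case page count. Both helper names are hypothetical, and the bound assumes that the first pass and every extra round are each limited to `--max-sources` pages, which the text implies but does not spell out.

```typescript
// Returns the number of extra critic rounds requested, or 0 when --deep is absent.
function parseDeepFlag(argv: readonly string[]): number {
  for (const arg of argv) {
    if (arg === "--deep") return 2; // bare flag: 2 rounds, per the flag table
    const m = /^--deep=(\d+)$/.exec(arg);
    if (m) return Number(m[1]);
  }
  return 0;
}

// Upper bound on fetched pages, assuming the initial pass and each extra
// round add at most `maxSources` pages (an assumption of this sketch).
function worstCasePages(deepRounds: number, maxSources: number): number {
  return (1 + deepRounds) * maxSources;
}

// Example: --deep=3 with a made-up cap of 8 sources per round.
const extraRounds = parseDeepFlag(["compare bun's TLS ClientHello to node's", "--deep=3"]);
console.log(worstCasePages(extraRounds, 8)); // (1 + 3) * 8 = 32
```

Since the critic can stop early, the actual count is often lower; the bound is only useful for sizing cost and latency ahead of time.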
+
+ ### Example: JSON output for piping into other tools
package.json: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
  {
    "name": "@askalf/deepdive",
-   "version": "0.1.0",
+   "version": "0.2.0",
"description": "A local research agent. One command, cited answer. Routes every LLM call through your own proxy (dario, Anthropic-compat, OpenAI-compat). Headless browser + pluggable search + multi-provider LLM — zero hosted dependencies.",