v0.2.0 — --deep loop, concurrent fetch, on-disk cache, --json output #1

Merged: askalf merged 1 commit into master from v0.2.0-deep-concurrent-cache on Apr 23, 2026

askalf (Owner) commented on Apr 23, 2026

Summary

Biggest v0.1.0 gaps, closed:

  1. `--deep[=N]` critic-driven iterative research: after the first synthesis, an LLM critic reviews the draft, names the gaps, and the loop re-runs for up to N more rounds. Bare `--deep` means N = 2. The critic can declare the answer done early.
  2. `--concurrency=N` parallel fetches (default 4) via a new worker pool in `src/concurrency.ts`. Roughly N× shorter wall time on the fetch phase.
  3. Per-URL on-disk cache at `~/.deepdive/cache/` (1h TTL, atomic writes). Cache-hit-only runs never launch Chromium. `--no-cache` to disable.
  4. `--json` output mode — `{question, plan, rounds, sources, answer, usage}`. Pipeable.
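
To make the round budget in item 1 concrete, here is a minimal sketch of a critic-driven loop. It is not the code from this PR: `deepLoop`, `DeepDeps`, `Critique`, and the three callbacks are hypothetical names, and the sketch assumes the critic returns a done flag plus follow-up queries.

```typescript
// Hypothetical shapes for illustration; none of these names come from the PR.
interface Critique {
  done: boolean; // the critic considers the draft complete
  followUps: string[]; // follow-up queries for the next round
}

interface DeepDeps {
  research: (queries: string[]) => Promise<string[]>; // returns new source texts
  synthesize: (sources: string[]) => Promise<string>; // drafts an answer
  critique: (draft: string) => Promise<Critique>; // reviews the draft
}

async function deepLoop(
  initialQueries: string[],
  deepRounds: number,
  deps: DeepDeps,
): Promise<{ answer: string; roundsRun: number }> {
  const sources: string[] = [];
  let queries = initialQueries;
  let answer = "";
  let roundsRun = 0;

  // Round 0 is the ordinary single pass; up to deepRounds extra rounds follow.
  for (let round = 0; round <= deepRounds; round++) {
    sources.push(...(await deps.research(queries)));
    answer = await deps.synthesize(sources);
    roundsRun++;

    if (round === deepRounds) break; // budget spent, no point critiquing again
    const verdict = await deps.critique(answer);
    if (verdict.done || verdict.followUps.length === 0) break; // early exit
    queries = verdict.followUps;
  }
  return { answer, roundsRun };
}
```

With `deepRounds = 0` the critic is never called, which matches a plain single-pass run; with `deepRounds = 2` the sketch performs at most three syntheses.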

New surface (library users)

  • `AgentConfig.browserFactory` — inject a mock `BrowserLike` in tests / custom deployments
  • `AgentResult.rounds` — per-round trace with queries, candidates, fetches, kept count, critic verdict
  • `AgentResult.usage.cacheHits`
  • `AgentConfig.cache` (optional)
  • `AgentConfig.deepRounds`, `AgentConfig.concurrency`
  • New events: `round.start`, `critique.start`, `critique.done`; `fetch.start`/`fetch.done` gained `cached: boolean`

Not yet

  • PDF extraction — deferred to v0.3.0 (deserves its own scope)
  • Streaming output during synthesis — also v0.3.0 material

Test plan

  • `npm run build` — clean under `strict: true`
  • `npm test` — 96 pass, 0 fail (up from ~53)
  • 42 new assertions: cache round-trip/TTL/atomicity (6), concurrency cap/order/abort (8), parseCritique (8), new CLI flags (8), new config options (7), agent integration test (6 — single-pass, deep loop, early-done, cache reuse, maxSources cap, broken-fetch resilience)
  • Manual verification: mock LLM HTTP server + mock search + mock browser run the full agent pipeline through the deep loop in ~50ms

Risk

  • `AgentConfig` gained required `deepRounds` and `concurrency` fields. Anyone constructing the config manually in TS will get a type error. Anyone going through `resolveConfig` picks up defaults for free. `CHANGELOG.md` calls this out.

…-json output

Adds the four biggest things v0.1.0 was missing:

1. `--deep[=N]` critic-driven iterative research. After the first synthesis
   an LLM critic reviews the draft, names the gaps, and proposes follow-up
   queries; the loop re-runs up to N more rounds. Bare `--deep` = 2 rounds.
   Critic can declare done early.
2. `--concurrency=N` parallel page fetches (default 4) via a new worker-pool
   in src/concurrency.ts. Cuts the fetch phase wall-time ~N× on a mixed
   batch of fast and slow pages.
3. Per-URL on-disk page cache at ~/.deepdive/cache/ (1h default TTL, atomic
   .tmp + rename writes). Cache hits skip the browser entirely — an
   all-cached re-run never launches Chromium. Disable with --no-cache.
4. `--json` structured output mode: {question, plan, rounds, sources, answer,
   usage}. Pipeable into jq and other tools.
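
A sketch of how the cache in item 3 might behave, assuming a JSON file per URL. The SHA-256 file naming, the `CacheEntry` shape, and the `cachePut`/`cacheGet` names are assumptions for illustration; only the `.tmp` + rename write and the 1h default TTL come from the description above.

```typescript
import { createHash } from "node:crypto";
import { mkdir, readFile, rename, writeFile } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical on-disk shape; the real cache format is not shown in this PR.
interface CacheEntry {
  url: string;
  fetchedAt: number; // epoch milliseconds
  body: string;
}

const DEFAULT_TTL_MS = 60 * 60 * 1000; // 1h default TTL

// Assumption: one file per URL, named by a hash of the URL.
function cachePath(dir: string, url: string): string {
  const key = createHash("sha256").update(url).digest("hex");
  return join(dir, `${key}.json`);
}

async function cachePut(
  dir: string,
  url: string,
  body: string,
  now: number = Date.now(),
): Promise<void> {
  await mkdir(dir, { recursive: true });
  const finalPath = cachePath(dir, url);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  const entry: CacheEntry = { url, fetchedAt: now, body };
  // Write the whole entry to a temp file, then rename it into place.
  // Readers see either the old file or the new one, never a partial write.
  await writeFile(tmpPath, JSON.stringify(entry), "utf8");
  await rename(tmpPath, finalPath);
}

async function cacheGet(
  dir: string,
  url: string,
  ttlMs: number = DEFAULT_TTL_MS,
  now: number = Date.now(),
): Promise<string | null> {
  try {
    const raw = await readFile(cachePath(dir, url), "utf8");
    const entry = JSON.parse(raw) as CacheEntry;
    return now - entry.fetchedAt <= ttlMs ? entry.body : null; // expired is a miss
  } catch {
    return null; // a missing or unreadable entry is treated as a miss
  }
}
```

A caller would try `cacheGet` first and only start the browser on a `null` result, which is how an all-cached re-run could avoid launching Chromium.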

Bonus surface for library users: AgentConfig.browserFactory for injecting a
mock BrowserLike, AgentResult.rounds with per-round traces + the critic's
verdict, AgentResult.usage.cacheHits, round.start / critique.start /
critique.done events, cached: boolean on fetch events.

Tests: 42 new assertions. Cache round-trip + TTL + atomicity (6), concurrency
cap + ordering + abort (8), parseCritique (8), new CLI flags (8), new config
options (7), full agent integration test with a mock LLM HTTP server + mock
search + mock browser covering single-pass, deep loop, early critic
termination, cache reuse, maxSources cap, and broken-fetch resilience (6).
96 total, all pass.
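
The cap and ordering properties that the concurrency tests assert could be provided by a pool along these lines. This is a sketch, not the contents of src/concurrency.ts: `mapWithConcurrency` is a hypothetical name, and abort handling is left out.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight.
// Results come back in input order, regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  async function runLane(): Promise<void> {
    while (true) {
      const index = next++; // claiming is synchronous, so lanes never collide
      if (index >= items.length) return;
      results[index] = await worker(items[index], index);
    }
  }

  const lanes = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: lanes }, () => runLane()));
  return results;
}
```

Because each lane pulls the next item as soon as it finishes, one slow page occupies a single lane while the others keep draining the queue, which is where the gain on a mixed batch of fast and slow pages would come from.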

askalf merged commit 4a53c7e into master on Apr 23, 2026
3 checks passed
askalf deleted the v0.2.0-deep-concurrent-cache branch on April 23, 2026 00:47
askalf added a commit that referenced this pull request Apr 23, 2026
Clean sweep of the security scan. Each fix is defensive: none of these
were exploitable in deepdive's context (search snippets aren't
HTML-rendered, and URL lengths are not attacker-controlled), but silencing
CodeQL matters for the paste-able health signal, and the new
implementations are genuinely better.

Alerts fixed:
- #4, #5, #6, #7 js/polynomial-redos — `/\/+$/` and `/#.*$/`
  end-anchored patterns replaced with non-regex string walks in a new
  src/url-util.ts (trimTrailingSlashes, stripHashFragment, dedupeKey).
  Provably linear-time, with a pathological-input test that asserts
  sub-100ms runtime on 100k trailing slashes.

- #3 js/incomplete-url-substring-sanitization — in the DDG redirect
  unwrap, `u.hostname.endsWith("duckduckgo.com")` would also match
  `evil-duckduckgo.com`. Tightened to
  `hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com")`.

- #2 js/double-escaping — decodeHtmlEntities was chaining 8 sequential
  `.replace()` calls, so `&amp;#39;` would double-decode: first pass
  produces `&#39;`, second pass expands to `'`. Replaced with a
  single-pass tokenizer that resolves each `&...;` exactly once, so
  `&amp;#39;` now correctly stays as the literal `&#39;`. Also added
  support for `&apos;` and `&#xHEX;` forms while I was in there.

- #1 js/incomplete-multi-character-sanitization — stripTags matched
  `/<[^>]+>/g`, which leaves malformed partials like `<scrip` (no
  closing `>`) in the output. Now does a two-step strip: well-formed
  tags replaced with a space, then any remaining `<` is also replaced
  with a space, so no tag-opener character can ever leak downstream.
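
The fixes for alerts #2 and #1 can be sketched as follows. The function names match the commit message, but the bodies are illustrative assumptions: the entity table is a small subset, and the decoder uses one `replace` call with a callback rather than a hand-written tokenizer. The key property is the same, since the output of a substitution is never rescanned.

```typescript
// Small illustrative subset of named entities (assumption, not the real table).
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["apos", "'"],
]);

// Resolves each &...; token exactly once. Decoded text is emitted as-is,
// so "&amp;#39;" becomes "&#39;" and is not expanded a second time.
function decodeHtmlEntities(input: string): string {
  const token = /&(#[xX][0-9a-fA-F]{1,6}|#[0-9]{1,7}|[a-zA-Z][a-zA-Z0-9]{1,31});/g;
  return input.replace(token, (match: string, body: string): string => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return NAMED_ENTITIES.get(body) ?? match; // unknown names stay literal
  });
}

// Two-step strip: well-formed tags first, then any leftover "<".
function stripTags(input: string): string {
  return input
    .replace(/<[^>]+>/g, " ") // step 1: well-formed tags become a space
    .replace(/</g, " "); // step 2: stray tag openers become a space too
}
```

After `stripTags`, the result cannot contain `<` at all, so a partial such as `<scrip` can no longer reach downstream consumers.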

Test coverage added:
- test/url-util.test.mjs — 10 assertions including a linear-time assert
- test/html-sanitize.test.mjs — 10 assertions; the `<scrip` partial
  and the double-decode regression are both pinned.
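
For reference, the string walks behind alerts #4 to #7 and the host check from alert #3 might look like the sketch below. The helper names `trimTrailingSlashes`, `stripHashFragment`, and `dedupeKey` are taken from the commit message, but their bodies here are assumptions; in particular, `dedupeKey`'s real normalization is not shown, so combining the two helpers is a guess. `isDuckDuckGoHost` is a hypothetical wrapper around the tightened condition.

```typescript
// Walks backwards once over the trailing slashes: linear time, no regex.
function trimTrailingSlashes(s: string): string {
  let end = s.length;
  while (end > 0 && s.charCodeAt(end - 1) === 47 /* "/" */) end--;
  return s.slice(0, end);
}

// Cuts at the first "#": a single indexOf scan, no regex.
function stripHashFragment(s: string): string {
  const i = s.indexOf("#");
  return i === -1 ? s : s.slice(0, i);
}

// Assumption: a dedupe key ignores the fragment and trailing slashes.
function dedupeKey(url: string): string {
  return trimTrailingSlashes(stripHashFragment(url));
}

// Exact host or a true subdomain. The leading dot is what rejects
// look-alikes such as "evil-duckduckgo.com".
function isDuckDuckGoHost(hostname: string): boolean {
  return hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com");
}
```

Each helper touches every character at most once, so an input with 100k trailing slashes costs one pass over the string rather than repeated regex backtracking.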

116 tests total, all pass (up from 96).

Co-authored-by: askalf <263217947+askalf@users.noreply.github.com>
askalf added a commit that referenced this pull request Jun 12, 2026
…never die (#84)

The v0.21.0 baseline bench (bench/results/, committed here) made the
case: with DuckDuckGo rate-limiting the box's IP, 5/6 default-config
questions died with zero sources (exit 3) while the one multi-backend
question passed with 12 sources and 0.86 citation support. A default
run dying on a single backend's throttle is the #1 reliability gap vs
hosted competitors, who never return nothing.

- searchFallback now DEFAULTS to "wikipedia" — keyless, reliable, and
  never shares the primary's failure mode. It only engages when a
  round's primary searches produced ZERO candidates, so healthy runs
  are byte-identical to before. "none"/"off" (or blank) disables.
- The fallback engaging changes where the answer's sources come from,
  so the notice now prints to stderr even WITHOUT --verbose — a
  degraded run can never be mistaken for a normal one.
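
The engage-only-on-zero rule above can be sketched like this. `searchRound` and `SearchFn` are hypothetical names, `null` stands in for the "none"/"off"/blank setting, and treating a thrown primary search as zero results is an assumption rather than something the commit states.

```typescript
// Hypothetical signature: a search backend returns candidate URLs.
type SearchFn = (query: string) => Promise<string[]>;

async function searchRound(
  queries: string[],
  primary: SearchFn,
  fallback: SearchFn | null, // null models a disabled fallback
  warn: (msg: string) => void = (m) => console.error(m), // stderr, no --verbose needed
): Promise<{ candidates: string[]; usedFallback: boolean }> {
  const candidates: string[] = [];
  for (const q of queries) {
    try {
      candidates.push(...(await primary(q)));
    } catch {
      // Assumption: a failing primary search counts as zero results.
    }
  }

  // Healthy rounds return here, so their output is unchanged by the fallback.
  if (candidates.length > 0 || fallback === null) {
    return { candidates, usedFallback: false };
  }

  warn("primary search produced zero candidates; using fallback backend");
  for (const q of queries) {
    try {
      candidates.push(...(await fallback(q)));
    } catch {
      // A failing fallback leaves the round empty, as before.
    }
  }
  return { candidates, usedFallback: true };
}
```

Checking the candidate count once per round, rather than per query, keeps the fallback from mixing into rounds where the primary backend is only partly throttled.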

Live proof under the ongoing DDG rate limit: the comparison question
that exited with code 3 minutes earlier now completes via the fallback,
with 3 sources, 9/9 citations supported, $0.044, and the stderr notice.

682 tests: default resolution, none/off/blank disable, overrides.