v0.2.0: --deep loop, concurrent fetch, on-disk cache, --json output #1
Merged
Adds the four biggest things v0.1.0 was missing:
1. `--deep[=N]` critic-driven iterative research. After the first synthesis
an LLM critic reviews the draft, names the gaps, and proposes follow-up
queries; the loop re-runs up to N more rounds. Bare `--deep` = 2 rounds.
Critic can declare done early.
2. `--concurrency=N` parallel page fetches (default 4) via a new worker-pool
in src/concurrency.ts. Cuts the fetch phase wall-time ~N× on a mixed
batch of fast and slow pages.
3. Per-URL on-disk page cache at ~/.deepdive/cache/ (1h default TTL, atomic
.tmp + rename writes). Cache hits skip the browser entirely — an
all-cached re-run never launches Chromium. Disable with --no-cache.
4. `--json` structured output mode: {question, plan, rounds, sources, answer,
usage}. Pipeable into jq and other tools.
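The worker-pool idea behind `--concurrency=N` can be sketched as follows. This is a minimal illustration, not the contents of src/concurrency.ts: the name `mapWithConcurrency` and its signature are assumptions made for the example. It shows the two properties the tests mention, a cap on in-flight work and results returned in input order.

```typescript
// Hypothetical helper for illustration; the real src/concurrency.ts API may differ.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims the next free index until the list is exhausted.
  // Claiming is safe without locks because JavaScript runs one callback at a time.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i] as T, i); // stored by index, so order is preserved
    }
  }

  // At most `limit` runners exist, so at most `limit` workers are in flight.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

Because a free runner picks up the next item as soon as it finishes, one slow page does not hold up the fast ones behind it. In this sketch a rejected worker rejects the whole call; a real fetch phase would more likely catch per-page errors inside `worker`.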
Bonus surface for library users: AgentConfig.browserFactory for injecting a
mock BrowserLike, AgentResult.rounds with per-round traces + the critic's
verdict, AgentResult.usage.cacheHits, round.start / critique.start /
critique.done events, cached: boolean on fetch events.
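The per-round traces and critic verdict come out of the `--deep` loop. A rough sketch of that control flow is shown here, with research and critique injected as functions. All type and function names (`Critique`, `Round`, `deepLoop`) are hypothetical and do not reflect the shape of AgentResult.rounds.

```typescript
// Hypothetical shapes for illustration only.
interface Critique {
  done: boolean;             // critic declares the draft complete
  gaps: string[];            // what the draft is missing
  followUpQueries: string[]; // queries proposed for the next round
}

interface Round {
  queries: string[];
  draft: string;
  critique?: Critique; // absent on the final budgeted round
}

type Research = (queries: string[], previousDraft: string | null) => Promise<string>;
type Critic = (draft: string) => Promise<Critique>;

// Runs one initial round plus up to `extraRounds` critic-driven follow-ups.
async function deepLoop(
  initialQueries: string[],
  extraRounds: number,
  research: Research,
  critic: Critic,
): Promise<Round[]> {
  const rounds: Round[] = [];
  let queries = initialQueries;
  let draft: string | null = null;

  for (let i = 0; i <= extraRounds; i++) {
    draft = await research(queries, draft);
    const round: Round = { queries, draft };
    rounds.push(round);

    if (i === extraRounds) break; // round budget spent; no point critiquing

    const critique = await critic(draft);
    round.critique = critique;
    // Early termination: the critic is satisfied or has nothing new to ask.
    if (critique.done || critique.followUpQueries.length === 0) break;
    queries = critique.followUpQueries;
  }
  return rounds;
}
```

Under these assumptions, bare `--deep` would correspond to `extraRounds = 2`, giving at most three synthesis passes.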
Tests: 42 new assertions. Cache round-trip + TTL + atomicity (6), concurrency
cap + ordering + abort (8), parseCritique (8), new CLI flags (8), new config
options (7), full agent integration test with a mock LLM HTTP server + mock
search + mock browser covering single-pass, deep loop, early critic
termination, cache reuse, maxSources cap, and broken-fetch resilience (6).
96 total, all pass.
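For readers unfamiliar with the `.tmp` + rename pattern behind the cache's atomicity and TTL tests, here is a minimal sketch. The file layout, the SHA-256 key, the `CacheEntry` shape, and the function names are all assumptions for illustration; only the 1h default TTL and the write-then-rename step come from the description above.

```typescript
import { createHash } from "node:crypto";
import { mkdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical on-disk shape; the real cache format is not shown in this PR.
interface CacheEntry {
  url: string;
  fetchedAt: number; // epoch milliseconds
  body: string;
}

const DEFAULT_TTL_MS = 60 * 60 * 1000; // 1h, matching the stated default

// One file per URL, named by a hash so any URL maps to a safe filename.
function entryPath(dir: string, url: string): string {
  const key = createHash("sha256").update(url).digest("hex");
  return join(dir, `${key}.json`);
}

function writeEntry(dir: string, url: string, body: string, now: number = Date.now()): void {
  mkdirSync(dir, { recursive: true });
  const finalPath = entryPath(dir, url);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  const entry: CacheEntry = { url, fetchedAt: now, body };
  writeFileSync(tmpPath, JSON.stringify(entry));
  // rename within one directory is atomic on POSIX filesystems, so a reader
  // sees either the old entry or the complete new one, never a partial write.
  renameSync(tmpPath, finalPath);
}

function readEntry(
  dir: string,
  url: string,
  ttlMs: number = DEFAULT_TTL_MS,
  now: number = Date.now(),
): string | null {
  try {
    const entry = JSON.parse(readFileSync(entryPath(dir, url), "utf8")) as CacheEntry;
    if (entry.url !== url) return null;             // guard against a corrupt entry
    if (now - entry.fetchedAt > ttlMs) return null; // expired
    return entry.body;
  } catch {
    return null; // a missing or unreadable file is simply a cache miss
  }
}
```

Treating every read failure as a miss keeps the cache strictly optional: the worst case is a normal browser fetch.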
askalf added a commit that referenced this pull request on Apr 23, 2026
Clean sweep of the security scan. Each fix is defensive: none of these were exploitable in deepdive's context (search snippets aren't HTML-rendered; URL sizes are not attacker-controlled), but silencing CodeQL matters for the paste-able health signal, and the new implementations are genuinely better.

Alerts fixed:
- #4, #5, #6, #7 js/polynomial-redos: `/\/+$/` and `/#.*$/` end-anchored patterns replaced with non-regex string walks in a new src/url-util.ts (trimTrailingSlashes, stripHashFragment, dedupeKey). Provably linear-time, with a pathological-input test that asserts sub-100ms runtime on 100k trailing slashes.
- #3 js/incomplete-url-substring-sanitization: in the DDG redirect unwrap, `u.hostname.endsWith("duckduckgo.com")` would also match `evil-duckduckgo.com`. Tightened to `hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com")`.
- #2 js/double-escaping: decodeHtmlEntities was chaining 8 sequential `.replace()` calls, so `&amp;#39;` would double-decode: first pass produces `&#39;`, second pass expands to `'`. Replaced with a single-pass tokenizer that resolves each `&...;` exactly once, so `&amp;#39;` now correctly stays as the literal `&#39;`. Also added support for `&apos;` and `&#xHEX;` forms while I was in there.
- #1 js/incomplete-multi-character-sanitization: stripTags matched `/<[^>]+>/g`, which leaves malformed partials like `<scrip` (no closing `>`) in the output. Now does a two-step strip: well-formed tags replaced with a space, then any remaining `<` is also replaced with a space, so no tag-opener character can ever leak downstream.

Test coverage added:
- test/url-util.test.mjs: 10 assertions including a linear-time assert
- test/html-sanitize.test.mjs: 10 assertions; the `<scrip` partial and the double-decode regression are both pinned.

116 tests total, all pass (up from 96).

Co-authored-by: askalf <263217947+askalf@users.noreply.github.com>
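Three of the fixes described in this commit can be illustrated with a short sketch. The function names trimTrailingSlashes, stripHashFragment, and decodeHtmlEntities come from the commit message, but the bodies below are assumptions, not the code in src/url-util.ts. `isDuckDuckGoHost` is a hypothetical wrapper around the quoted condition, and the decoder uses one regex pass with a callback where the commit describes a hand-written tokenizer.

```typescript
// Linear-time stand-in for s.replace(/\/+$/, ""): walk back from the end once.
function trimTrailingSlashes(s: string): string {
  let end = s.length;
  while (end > 0 && s.charCodeAt(end - 1) === 47 /* "/" */) end--;
  return s.slice(0, end);
}

// Linear-time stand-in for s.replace(/#.*$/, ""): cut at the first "#".
function stripHashFragment(s: string): string {
  const i = s.indexOf("#");
  return i === -1 ? s : s.slice(0, i);
}

// Exact match or a true subdomain; "evil-duckduckgo.com" fails both tests.
function isDuckDuckGoHost(hostname: string): boolean {
  return hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com");
}

// A small illustrative table; the real decoder may know more names.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"],
]);

// Each &...; token is resolved exactly once. The replacement text is never
// rescanned, so "&amp;#39;" becomes "&#39;" and stops there.
function decodeHtmlEntities(s: string): string {
  return s.replace(/&(#x[0-9a-fA-F]{1,6}|#[0-9]{1,7}|[a-zA-Z]{2,8});/g, (match: string, body: string) => {
    if (body.startsWith("#")) {
      const isHex = body.startsWith("#x");
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    return NAMED_ENTITIES.get(body) ?? match; // unknown names pass through
  });
}
```

The double-decode bug comes from ordering: if `&amp;` is replaced first, its output can form a new entity that a later `.replace()` then expands. A single pass removes that ordering dependency entirely.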
askalf added a commit that referenced this pull request on Jun 12, 2026
…never die (#84)

The v0.21.0 baseline bench (bench/results/, committed here) made the case: with DuckDuckGo rate-limiting the box's IP, 5/6 default-config questions died with zero sources (exit 3) while the one multi-backend question passed with 12 sources and 0.86 citation support. A default run dying on a single backend's throttle is the #1 reliability gap vs hosted competitors, who never return nothing.

- searchFallback now DEFAULTS to "wikipedia": keyless, reliable, and never shares the primary's failure mode. It only engages when a round's primary searches produced ZERO candidates, so healthy runs are byte-identical to before. "none"/"off" (or blank) disables.
- The fallback engaging changes where the answer's sources come from, so the notice now prints to stderr even WITHOUT --verbose; a degraded run can never be mistaken for a normal one.

Live proof under the ongoing DDG rate limit: the comparison question that exited with code 3 minutes earlier now completes via the fallback (3 sources, 9/9 citations supported, $0.044, with the stderr notice).

682 tests: default resolution, none/off/blank disable, overrides.
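The fallback rules in this commit can be sketched in a few lines. Everything here is illustrative: `resolveFallback`, `searchRound`, the `SearchFn` type, and the `notify` callback (standing in for the stderr notice) are hypothetical names, and treating a thrown primary search as zero results is an assumption.

```typescript
// A search backend maps a query to candidate URLs (hypothetical shape).
type SearchFn = (query: string) => Promise<string[]>;

// Unset means the default; "none", "off", or blank disables; anything else overrides.
function resolveFallback(raw: string | undefined): string | null {
  if (raw === undefined) return "wikipedia";
  const value = raw.trim().toLowerCase();
  return value === "" || value === "none" || value === "off" ? null : value;
}

async function collect(queries: string[], search: SearchFn): Promise<string[]> {
  const found: string[] = [];
  for (const q of queries) {
    try {
      found.push(...(await search(q)));
    } catch {
      // Assumption: a failed search (e.g. rate limiting) counts as zero results.
    }
  }
  return found;
}

// The fallback engages only when the primary produced zero candidates for the
// whole round, so a healthy run never touches it.
async function searchRound(
  queries: string[],
  primary: SearchFn,
  fallback: SearchFn | null,
  notify: (message: string) => void,
): Promise<string[]> {
  const candidates = await collect(queries, primary);
  if (candidates.length > 0 || fallback === null) return candidates;
  notify("primary search returned no candidates; using fallback backend");
  return collect(queries, fallback);
}
```

Passing `notify` in, instead of writing to stderr directly, keeps the notice unconditional while still letting a test observe it.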
Summary
Biggest v0.1.0 gaps, closed:
- `--deep[=N]` critic-driven iterative research: after the first synthesis, an LLM critic reviews the draft, names the gaps, and the loop re-runs up to N more rounds. Bare `--deep` = 2. Critic can say done early.
- `--concurrency=N` parallel fetches (default 4) via new `src/concurrency.ts` worker-pool. ~Nx wall-time reduction on the fetch phase.
New surface (library users)
Not yet
Test plan
Risk