docs: rewrite README — sovereignty + cost-arbitrage angle by askalf · Pull Request #2 · askalf/deepdive

askalf · 2026-04-23T00:25:19Z

Problem

The v0.1.0 README read like a CLI manual. `--deep` (the v0.2.0 headline feature) and the dario/Max economics (the reason this exists as a package and not a shell script) were footnotes.

Revised angle

Two pillars, both above the fold:

1. Sovereignty — "What you keep"

Hero paragraph opens with "Your machine. Your LLM subscription. Your search backend. Your cited report." Expands into four things every hosted research tool takes away:

- Your data. No queries, URLs, or page content leave your laptop except the specific searches and fetches the planner chose. Verifiable with `lsof -i`.
- Your model. Swap `--model` between Sonnet / Opus / a LiteLLM-routed local model with one flag.
- Your search backend. DDG / SearXNG / Brave / Tavily, one flag each.
- Your depth. Hosted tools cap `--deep` because unit economics demand it. You don't.

2. Cost-arbitrage — "What you stop paying for"

Explicit per-query / per-month cost table:

| Path | Per-query | Per-month (10 deep queries) | Data local? |
| --- | --- | --- | --- |
| Per-token API (Opus) | ~$2–$8 | ~$20–$80 | ✅ |
| Per-token API (Sonnet) | ~$0.30–$1.20 | ~$3–$12 | ✅ |
| Perplexity Pro | capped | $20/mo | ❌ |
| OpenAI Deep Research | capped | $20/mo | ❌ |
| Gemini Deep Research | capped | $20/mo | ❌ |
| deepdive + dario + Max | $0 | $0 (in Max) | ✅ |

Argument: deep-research workload is exactly the token shape Max was priced for. Paying a hosted tool on top is paying twice for LLM calls you already bought.
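
The per-month column for the per-token rows is just the per-query range scaled by the table's 10 deep queries. A minimal sketch of that arithmetic, using only the per-query figures quoted in the table; the `UsdRange` type and `monthlyCost` helper are hypothetical names introduced for illustration, not part of deepdive:

```typescript
// Per-query USD ranges are copied from the cost table; all names here are illustrative.
type UsdRange = { low: number; high: number };

// The table's assumed monthly volume.
const DEEP_QUERIES_PER_MONTH = 10;

const perQuery: Record<string, UsdRange> = {
  "Per-token API (Opus)": { low: 2, high: 8 },
  "Per-token API (Sonnet)": { low: 0.3, high: 1.2 },
};

// Hypothetical helper: scale a per-query cost range to a monthly range.
function monthlyCost(range: UsdRange, queries: number): UsdRange {
  return { low: range.low * queries, high: range.high * queries };
}

for (const path of Object.keys(perQuery)) {
  const m = monthlyCost(perQuery[path], DEEP_QUERIES_PER_MONTH);
  console.log(`${path}: ~$${m.low.toFixed(0)} to $${m.high.toFixed(0)} per month`);
}
```

The flat-rate rows do not scale this way, which is the point of the comparison: their cost is fixed and their depth is what gets capped.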

3. `--deep` reframed

New paragraph in the loop section: "the critic loop is the axis hosted tools cap on, per-query unit economics force them to ship fixed depth, on your own subscription the only cap is the one you set." Makes `--deep` load-bearing instead of just a flag.

Dropped

The old 7-column comparison table (Perplexity / OpenAI DR / Gemini DR vs deepdive across model choice, search backend, etc.). It did the same work the two pillar sections now do, with less focus.

Test plan

- Markdown renders correctly
- No broken anchors
- Two commits on branch; the first pass with Perplexity-framing is in git history if the sovereignty/cost angle needs revisiting
- Docs-only, no release needed

The old README opened with the mechanical pipeline (plan → search → fetch → extract → synthesize) and treated --deep and the dario pairing as footnotes. Neither the headline feature of v0.2.0 nor the reason deepdive exists at all was above the fold.

The rewrite:

- Opens with an outcome statement ("open-source Perplexity for your own machine, running on your own Claude Max subscription") and names the specific thing the critic loop gives you that hosted tools cap away.
- Adds a "The point" section that frames deepdive as the consumer counterpart to dario's routing layer: same product, two pieces. The deep-research workload is 50k-200k tokens per query, which is the exact shape Claude Max was priced for and the exact shape per-token API pricing makes painful.
- Adds a comparison table vs Perplexity / OpenAI Deep Research / Gemini Deep Research across seven axes (who runs the loop, model choice, search backend, who sees queries, depth cap, billing path, source).
- Adds a "--deep loop" section with an ASCII diagram and a concrete cost breakdown so the feature is load-bearing rather than a caveat.
- Adds a realistic sample output so the thing feels real.
- Cuts the pipeline-description and architecture-principles sections (moved to CLAUDE.md in earlier commits).
- Trims the flag table to "common flags" with a one-line "why" per flag and defers exhaustive docs to --help.

…ming

Revised angle: frame deepdive around what a hosted research tool takes from you and gives in exchange. Two pillars:

1. **Sovereignty.** Hero paragraph starts with "Your machine. Your LLM subscription. Your search backend. Your cited report." These are four "your" things that all hosted tools quietly take away. "What you keep" section expands each one. Trust-and-transparency table gains an lsof -i note for verifiable claims.
2. **Cost-arbitrage.** "What you stop paying for" section with an explicit per-query / per-month cost table comparing: per-token Opus, per-token Sonnet, Perplexity Pro, OpenAI Deep Research (ChatGPT Plus), Gemini Deep Research (AI Advanced), and deepdive + dario + Claude Max. The argument (deep research is exactly the workload Max was priced for, so layering a hosted tool on top is paying twice for LLM calls you already bought) is stated directly.

--deep section explains why the critic loop is the axis hosted tools cap on (unit economics) and why on your own subscription the only cap is the one you set. That reframes --deep as the load-bearing feature instead of a flag in a table.

Dropped the old "vs Perplexity/OpenAI DR/Gemini DR" 7-column comparison table; it was doing the same work the two pillar sections now do, with less focus.

Kept the sample output, the 60-second quickstart, the common-flags table, the search adapter matrix, and the caching note.

Clean sweep of the security scan. Each fix is defensive: none of these were exploitable in deepdive's context (search snippets aren't HTML-rendered; URL lengths are not attacker-controlled), but silencing CodeQL matters for the paste-able health signal, and the new implementations are genuinely better.

Alerts fixed:

- #4, #5, #6, #7 js/polynomial-redos: `/\/+$/` and `/#.*$/` end-anchored patterns replaced with non-regex string walks in a new src/url-util.ts (trimTrailingSlashes, stripHashFragment, dedupeKey). Provably linear-time, with a pathological-input test that asserts sub-100ms runtime on 100k trailing slashes.
- #3 js/incomplete-url-substring-sanitization: in the DDG redirect unwrap, `u.hostname.endsWith("duckduckgo.com")` would also match `evil-duckduckgo.com`. Tightened to `hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com")`.
- #2 js/double-escaping: decodeHtmlEntities was chaining 8 sequential `.replace()` calls, so `&amp;#39;` would double-decode: the first pass produces `&#39;`, and the second pass expands that to `'`. Replaced with a single-pass tokenizer that resolves each `&...;` exactly once, so `&amp;#39;` now correctly stays as the literal `&#39;`. Also added support for `&apos;` and `&#xHEX;` forms while I was in there.
- #1 js/incomplete-multi-character-sanitization: stripTags matched `/<[^>]+>/g`, which leaves malformed partials like `<scrip` (no closing `>`) in the output. Now does a two-step strip: well-formed tags replaced with a space, then any remaining `<` is also replaced with a space, so no tag-opener character can ever leak downstream.

Test coverage added:

- test/url-util.test.mjs: 10 assertions including a linear-time assert
- test/html-sanitize.test.mjs: 10 assertions; the `<scrip` partial and the double-decode regression are both pinned.

116 tests total, all pass (up from 96).

Co-authored-by: askalf <263217947+askalf@users.noreply.github.com>
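
For readers who want to see the shape of these fixes, here is a rough sketch of how each one could look. It is not the code in src/url-util.ts or the sanitizer, which this page does not show. The function names trimTrailingSlashes, stripHashFragment, decodeHtmlEntities, and stripTags come from the commit message; their bodies, the `isDuckDuckGoHost` helper name, and the small `NAMED_ENTITIES` table are assumptions made for illustration. The double-decode case is assumed to be the input `&amp;#39;`.

```typescript
// Illustrative sketches only; the real implementations may differ.

/** Remove trailing "/" characters by walking back from the end (no regex). */
function trimTrailingSlashes(s: string): string {
  let end = s.length;
  while (end > 0 && s.charAt(end - 1) === "/") end--;
  return s.slice(0, end);
}

/** Drop everything from the first "#" onward. */
function stripHashFragment(s: string): string {
  const i = s.indexOf("#");
  return i === -1 ? s : s.slice(0, i);
}

/** Hypothetical helper: exact host or a true subdomain only. */
function isDuckDuckGoHost(hostname: string): boolean {
  return hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com");
}

// Assumed minimal entity table; a real decoder would likely know more names.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"],
]);

/** Resolve each &...; token once; decoded output is never scanned again. */
function decodeHtmlEntities(s: string): string {
  return s.replace(
    /&(#[xX][0-9a-fA-F]{1,6}|#[0-9]{1,7}|[a-zA-Z]{2,8});/g,
    (match: string, body: string): string => {
      if (body.charAt(0) === "#") {
        const isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
        const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
        // Leave out-of-range references untouched rather than throwing.
        return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      const named = NAMED_ENTITIES.get(body);
      return named === undefined ? match : named;
    },
  );
}

/** Two-step strip: well-formed tags first, then any leftover "<". */
function stripTags(html: string): string {
  return html.replace(/<[^>]+>/g, " ").replace(/</g, " ");
}
```

Because the decoder's replacement callback runs once per token and its result is not re-scanned, `&amp;#39;` becomes the text `&#39;` and stops there, which is the behavior the double-escaping fix describes.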

askalf added 2 commits April 22, 2026 20:24

askalf changed the title ~~docs: rewrite README to lead with the product, not the pipeline~~ docs: rewrite README — sovereignty + cost-arbitrage angle Apr 23, 2026

askalf merged commit 1c4208f into master Apr 23, 2026
4 checks passed

askalf deleted the docs/readme-rewrite branch April 23, 2026 00:47

askalf mentioned this pull request Apr 23, 2026

fix: address 7 CodeQL high-severity alerts #3

Merged

4 tasks

askalf mentioned this pull request Jun 15, 2026

synth prompt tuning PARKED: bench can't resolve prompt deltas while factual-lookup hedges nondeterministically (stock 1/3) #97

Closed

askalf merged 2 commits into master from docs/readme-rewrite
