feat(sources): source-authority scorer — P1 foundation (#111) by askalf · Pull Request #112 · askalf/deepdive

askalf · 2026-06-15T21:00:04Z

First commit of #111 (source-authority ranking): a pure, deterministic per-URL trust scorer (no LLM, no network) that answers a question orthogonal to the lexical verifier's. The question is not "does the answer match its source?" but "is that source trustworthy?"

scoreAuthority(url) → { tier: 'primary'|'reputable'|'unknown'|'low', score, reason }

Boost-led (near-zero false positives):
- primary = *.gov/*.edu/*.mil + .ac.* TLDs; academia/standards (arxiv, aclanthology, ietf, w3.org, nature…); official docs (docs.*/developer.* prefixes + curated vendor set).
- reputable = wikipedia/SO/github.
- unknown is neutral, not punished: a niche-but-legit source shouldn't be downranked for being unrecognized.
- low = small, conservative seed denylist of content farms seen in the 2026-06-15 dogfood (precision over recall).
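
A minimal sketch of how a scorer with this shape could work, not the merged implementation. The domain lists are abbreviated stand-ins for the curated sets, the numeric scores are assumed (the PR only states they are monotonic by tier), and treating unparseable input as unknown is also an assumption. The helper names `matchesDomain` and `result` are hypothetical.

```typescript
type Tier = "primary" | "reputable" | "unknown" | "low";

interface AuthorityResult {
  tier: Tier;
  score: number;
  reason: string;
}

// Hypothetical, abbreviated seed lists; the real curated sets are not shown in this PR.
const LOW_DOMAINS = ["aiflashreport.com", "gpt0x.com"];
const PRIMARY_DOMAINS = [
  "arxiv.org", "aclanthology.org", "ietf.org", "w3.org", "nature.com",
  "aws.amazon.com", "redis.io",
];
const REPUTABLE_DOMAINS = ["wikipedia.org", "stackoverflow.com", "github.com"];
const PRIMARY_TLDS = ["gov", "edu", "mil"];
const DOC_PREFIXES = ["docs.", "developer."];

// Assumed values; only the ordering primary > reputable > unknown > low comes from the PR.
const SCORES: Record<Tier, number> = { primary: 1.0, reputable: 0.7, unknown: 0.5, low: 0.1 };

// True when host is the domain itself or one of its subdomains.
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

function result(tier: Tier, reason: string): AuthorityResult {
  return { tier, score: SCORES[tier], reason };
}

function scoreAuthority(url: string): AuthorityResult {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    // Assumption: unparseable input is neutral rather than penalized.
    return result("unknown", "unparseable URL");
  }
  const labels = host.split(".");
  const tld = labels[labels.length - 1] ?? "";

  // The denylist is checked first so it beats every boost below.
  if (LOW_DOMAINS.some((d) => matchesDomain(host, d))) {
    return result("low", "denylisted content farm");
  }
  // Only the final label counts, so "nasa.gov.example.com" is not boosted.
  if (PRIMARY_TLDS.includes(tld)) return result("primary", `.${tld} TLD`);
  // .ac.* academic domains such as ox.ac.uk.
  if (labels.length >= 3 && labels[labels.length - 2] === "ac") {
    return result("primary", "academic .ac.* domain");
  }
  if (PRIMARY_DOMAINS.some((d) => matchesDomain(host, d))) {
    return result("primary", "academia, standards body, or curated vendor");
  }
  if (DOC_PREFIXES.some((p) => host.startsWith(p))) {
    return result("primary", "official docs prefix");
  }
  if (REPUTABLE_DOMAINS.some((d) => matchesDomain(host, d))) {
    return result("reputable", "well-known community source");
  }
  return result("unknown", "unrecognized domain");
}
```

Checking the denylist before any boost is what makes a farm hosted under a `docs.` subdomain still score low, and matching only the last hostname label is one simple way to guard against gov/edu look-alikes.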

Pure + unit-tested (9 cases: TLDs, subdomain matching, denylist-beats-boost, gov/edu look-alike false-positive guards, unparseable input, monotonic scores). Validated on real dogfood domains: aiflashreport.com/gpt0x.com → low, aws.amazon.com/federalreserve.gov/redis.io → primary, slingacademy.com → unknown.

Not yet wired into the keep loop or output — this is the foundational unit. Next: keep-stage ranking (rank candidates by authority before the maxSources cut) and the source-trust signal in --json/footer. No version bump (not user-facing until wired).

Pure, deterministic per-URL trust scorer — the foundation for ranking which fetched sources to keep and for a source-trust signal distinct from citation support. Boost-led: gov/edu/standards/academia/official-docs = primary, wikipedia/SO/github = reputable; unknown is neutral (niche-but-legit not punished); low is a conservative seed denylist of observed content farms. Not yet wired into the keep loop — next commit. Validated on the 2026-06-15 dogfood domains: content farms -> low, AWS/Fed/redis docs -> primary.

…(P1 of #111) (#113)

Wires the #112 scoreAuthority module (until now a pure, unused export) into the fetch keep-stage so authoritative/primary sources win the limited fetch slots ahead of whatever search happened to rank first.

- source-authority.ts: add rankByAuthority(items, urlOf, mode) + SourceAuthorityMode. "prefer" (default) stable-sorts by authority and drops nothing; "strict" also drops known content farms, with a min-keep floor so a niche/recency round that only surfaces farms isn't zeroed out; "off" is identity.
- agent.ts: rank this round's candidates before the slot-limited slice.
- config.ts / cli.ts: --source-authority / DEEPDIVE_SOURCE_AUTHORITY (prefer|strict|off), default prefer; flag validated, env falls back.
- 5 unit tests for rankByAuthority (prefer ordering + stability, strict drop + min-keep floor, off identity).

Full build + config/agent/cli suites green (117 pass / 0 fail). Default is "prefer" per the issue: a reorder-only boost that drops nothing, so it cannot starve a query. P2 (trust signal in output/--json) and P3 (bench metric) follow.
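
The three modes described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the `scoreAuthority` here is a tiny stand-in with made-up scores, the `minKeep` parameter and its default of 2 are hypothetical (the commit message mentions a min-keep floor but not its value or how it is passed), and the `Candidate` shape is invented for the usage lines.

```typescript
type Tier = "primary" | "reputable" | "unknown" | "low";
type SourceAuthorityMode = "prefer" | "strict" | "off";

// Stand-in scorer with assumed scores, only so this sketch runs on its own.
const STAND_IN = new Map<string, { tier: Tier; score: number }>([
  ["federalreserve.gov", { tier: "primary", score: 1.0 }],
  ["redis.io", { tier: "primary", score: 1.0 }],
  ["github.com", { tier: "reputable", score: 0.7 }],
  ["aiflashreport.com", { tier: "low", score: 0.1 }],
  ["gpt0x.com", { tier: "low", score: 0.1 }],
]);

function scoreAuthority(url: string): { tier: Tier; score: number } {
  const neutral = { tier: "unknown" as Tier, score: 0.5 };
  try {
    const host = new URL(url).hostname.toLowerCase().replace(/^www\./, "");
    return STAND_IN.get(host) ?? neutral;
  } catch {
    return neutral;
  }
}

function rankByAuthority<T>(
  items: T[],
  urlOf: (item: T) => string,
  mode: SourceAuthorityMode = "prefer",
  minKeep = 2, // hypothetical floor value
): T[] {
  // "off" is identity: the input is returned untouched.
  if (mode === "off") return items;

  // Carry the original index so equal scores keep search order (explicit stability).
  const scored = items.map((item, index) => ({ item, index, ...scoreAuthority(urlOf(item)) }));
  scored.sort((a, b) => b.score - a.score || a.index - b.index);

  // "prefer" reorders only and drops nothing.
  if (mode === "prefer") return scored.map((s) => s.item);

  // "strict" drops low-tier items unless that would leave fewer than the floor.
  const trusted = scored.filter((s) => s.tier !== "low");
  const floor = Math.min(minKeep, scored.length);
  const kept = trusted.length >= floor ? trusted : scored.slice(0, floor);
  return kept.map((s) => s.item);
}

// Usage: rank this round's candidates, then apply the slot-limited slice.
interface Candidate { url: string }
const candidates: Candidate[] = [
  { url: "https://gpt0x.com/a" },
  { url: "https://www.slingacademy.com/b" },
  { url: "https://redis.io/docs" },
  { url: "https://example.org/c" },
  { url: "https://www.federalreserve.gov/d" },
];
const fetchSlots = 3; // hypothetical slot limit
const toFetch = rankByAuthority(candidates, (c) => c.url, "prefer").slice(0, fetchSlots);
```

Because low-tier items sort last, falling back to the top `floor` entries in strict mode keeps every trusted item first and only then readmits farms, so a round that surfaces nothing but farms still yields something to fetch.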

Ships the source-authority layer (#112/#113/#114): deterministic, no-LLM domain scoring (scoreAuthority), keep-stage ranking that gives authoritative sources the limited fetch slots (--source-authority prefer|strict|off), and a source-trust signal in the footer + --json — orthogonal to citation support, so a fully-cited answer built on content farms is flagged instead of silently confident. Version bump + CHANGELOG + README doctor sample only.
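
To illustrate why a trust signal is separate from citation support, here is a hedged sketch of how per-source tiers might be rolled up into a footer line and a JSON-friendly object. Everything in it is an assumption: the `TrustSignal` shape, the `summarizeTrust` name, the footer wording, and the "at least half the sources are low tier" flagging rule are hypothetical, since the release notes do not spell out the actual output format.

```typescript
type Tier = "primary" | "reputable" | "unknown" | "low";

interface TrustSignal {
  counts: Record<Tier, number>;
  flagged: boolean;
  footer: string;
}

function summarizeTrust(tiers: Tier[]): TrustSignal {
  const counts: Record<Tier, number> = { primary: 0, reputable: 0, unknown: 0, low: 0 };
  for (const tier of tiers) counts[tier] += 1;

  // Hypothetical rule: flag when at least half of the kept sources are low tier.
  const flagged = tiers.length > 0 && counts.low * 2 >= tiers.length;

  const footer =
    `sources: ${counts.primary} primary, ${counts.reputable} reputable, ` +
    `${counts.unknown} unknown, ${counts.low} low` +
    (flagged ? " (low-authority sources dominate)" : "");

  return { counts, flagged, footer };
}

// An answer can be fully cited and still rest on weak sources.
const signal = summarizeTrust(["low", "low", "unknown"]);
const asJson = JSON.stringify({ sourceTrust: { counts: signal.counts, flagged: signal.flagged } });
```

The point of the sketch is only that the flag depends on where the sources come from, not on whether each claim has a citation.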

askalf merged commit 953414c into master Jun 15, 2026
5 checks passed

askalf deleted the feat/source-authority-scorer branch June 15, 2026 21:01

This was referenced Jun 16, 2026

feat(sources): keep-stage authority ranking (P1 of #111) #113

Merged

feat(sources): source-trust signal in output + --json (P2 of #111) #114

Merged

askalf mentioned this pull request Jun 18, 2026

release: 0.26.0 — source authority, a second trust axis (#111) #115

Merged

askalf mentioned this pull request Jun 18, 2026

bench: source-authority distribution metric (P3 of #111) #116

Merged

askalf merged 1 commit into master from feat/source-authority-scorer
