fix: address 7 CodeQL high-severity alerts #3

Merged
askalf merged 1 commit into master from fix/codeql-alerts on Apr 23, 2026

askalf (Owner) commented on Apr 23, 2026

Summary

Clean sweep of the 7 open CodeQL alerts. None were exploitable in context (search snippets aren't HTML-rendered, and URL lengths aren't attacker-controlled), but silencing CodeQL matters for the paste-able health signal, and the new implementations are genuinely better.

Alert-by-alert

| # | Rule | File | Fix |
|---|------|------|-----|
| #4-7 | js/polynomial-redos | llm.ts, search.ts, searxng.ts | End-anchored `/\/+$/` and `/#.*$/` replaced with non-regex string walks in a new src/url-util.ts (trimTrailingSlashes, stripHashFragment, dedupeKey). Provably linear-time. |
| #3 | js/incomplete-url-substring-sanitization | search/duckduckgo.ts | `hostname.endsWith('duckduckgo.com')` would match `evil-duckduckgo.com`. Tightened to `=== 'duckduckgo.com'` or `endsWith('.duckduckgo.com')`. |
| #2 | js/double-escaping | search/duckduckgo.ts | Chained 8 sequential `.replace()` calls meant `&amp;#39;` would double-decode to `'`. Replaced with a single-pass tokenizer; each `&...;` is resolved exactly once. |
| #1 | js/incomplete-multi-character-sanitization | search/duckduckgo.ts | stripTags left malformed partials like `<scrip` (no closing `>`) intact. Now strips well-formed tags, then drops any remaining `<` so no tag-opener can leak. |
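
For readers who want to see the shape of the URL-side fixes (#3 through #7), here is a minimal sketch. It is not the code from src/url-util.ts or search/duckduckgo.ts: the function bodies are assumptions reconstructed from the descriptions in the table, `isDuckDuckGoHost` is a hypothetical name, and `dedupeKey` is left out because its exact behaviour isn't described here. `stripHashFragment` as written matches `/#.*$/` only for single-line input, which is the expected case for URLs.

```typescript
// Hypothetical sketches; the real helpers in the PR may differ.

/** Remove every trailing "/" by walking back from the end (no regex). */
function trimTrailingSlashes(s: string): string {
  let end = s.length;
  // 47 is the char code of "/". Each character is inspected at most once.
  while (end > 0 && s.charCodeAt(end - 1) === 47) end--;
  return end === s.length ? s : s.slice(0, end);
}

/** Drop the first "#" and everything after it; indexOf is a single scan. */
function stripHashFragment(s: string): string {
  const i = s.indexOf("#");
  return i === -1 ? s : s.slice(0, i);
}

/** Exact host or a true subdomain; rejects look-alikes like evil-duckduckgo.com. */
function isDuckDuckGoHost(hostname: string): boolean {
  return hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com");
}
```

Under these assumptions, `trimTrailingSlashes("https://example.com///")` returns `"https://example.com"`, and `isDuckDuckGoHost("evil-duckduckgo.com")` returns `false` because the leading dot in `".duckduckgo.com"` forces a label boundary.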

Test plan

  • npm run build — clean under strict: true
  • npm test — 116 pass, 0 fail (up from 96)
  • New: test/url-util.test.mjs — 10 assertions including a linear-time assertion on 100k-char pathological input (sub-100ms)
  • New: test/html-sanitize.test.mjs — 10 assertions; the <scrip partial and the &amp;#39; double-decode regressions are both pinned

Clean sweep of the security scan. Each fix is defensive: none of these
were exploitable in deepdive's context (search snippets aren't
HTML-rendered; URL lengths are not attacker-controlled), but silencing
CodeQL matters for the paste-able health signal, and the new
implementations are genuinely better.

Alerts fixed:
- #4, #5, #6, #7 js/polynomial-redos — `/\/+$/` and `/#.*$/`
  end-anchored patterns replaced with non-regex string walks in a new
  src/url-util.ts (trimTrailingSlashes, stripHashFragment, dedupeKey).
  Provably linear-time, with a pathological-input test that asserts
  sub-100ms runtime on 100k trailing slashes.

- #3 js/incomplete-url-substring-sanitization — in the DDG redirect
  unwrap, `u.hostname.endsWith("duckduckgo.com")` would also match
  `evil-duckduckgo.com`. Tightened to
  `hostname === "duckduckgo.com" || hostname.endsWith(".duckduckgo.com")`.

- #2 js/double-escaping — decodeHtmlEntities was chaining 8 sequential
  `.replace()` calls, so `&amp;#39;` would double-decode: first pass
  produces `&#39;`, second pass expands to `'`. Replaced with a
  single-pass tokenizer that resolves each `&...;` exactly once, so
  `&amp;#39;` now correctly stays as the literal `&#39;`. Also added
  support for `&apos;` and `&#xHEX;` forms while I was in there.

- #1 js/incomplete-multi-character-sanitization — stripTags matched
  `/<[^>]+>/g`, which leaves malformed partials like `<scrip` (no
  closing `>`) in the output. Now does a two-step strip: well-formed
  tags replaced with a space, then any remaining `<` is also replaced
  with a space, so no tag-opener character can ever leak downstream.
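
To make the #2 fix concrete, here is a rough sketch of a single-pass entity decoder. It is an illustration under assumptions, not the decodeHtmlEntities from this PR: the `NAMED` table, the `MAX_ENTITY_LEN` bound, and the `decodeOne` helper are all hypothetical. The point it shows is that decoded output is appended and never rescanned, so `&amp;#39;` cannot decode twice.

```typescript
// Hypothetical entity table; the PR does not list the exact set it supports.
const NAMED: Record<string, string> = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
};

// Assumed upper bound on the text between "&" and ";", so each "&" costs O(1).
const MAX_ENTITY_LEN = 10;

/** Decode one entity body (the text between "&" and ";"), or return undefined. */
function decodeOne(body: string): string | undefined {
  if (body[0] === "#") {
    const hex = body[1] === "x" || body[1] === "X";
    const digits = body.slice(hex ? 2 : 1);
    const valid = hex ? /^[0-9a-fA-F]+$/.test(digits) : /^[0-9]+$/.test(digits);
    if (!valid) return undefined;
    const cp = parseInt(digits, hex ? 16 : 10);
    return cp > 0 && cp <= 0x10ffff ? String.fromCodePoint(cp) : undefined;
  }
  return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : undefined;
}

function decodeHtmlEntities(input: string): string {
  let out = "";
  let i = 0;
  while (i < input.length) {
    const amp = input.indexOf("&", i);
    if (amp === -1) {
      out += input.slice(i); // no more entities: copy the rest verbatim
      break;
    }
    out += input.slice(i, amp);
    // Only look for ";" inside a bounded window after the "&".
    const tail = input.slice(amp + 1, amp + 2 + MAX_ENTITY_LEN);
    const semi = tail.indexOf(";");
    const decoded = semi > 0 ? decodeOne(tail.slice(0, semi)) : undefined;
    if (decoded === undefined) {
      out += "&"; // not a recognised entity: keep the "&" literally
      i = amp + 1;
    } else {
      out += decoded; // emitted text is never fed back into the scanner
      i = amp + 1 + semi + 1;
    }
  }
  return out;
}
```

With this sketch, `decodeHtmlEntities("&amp;#39;")` yields the literal `&#39;`, while `&#39;`, `&apos;`, and `&#x27;` each yield `'`.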

Test coverage added:
- test/url-util.test.mjs — 10 assertions including a linear-time assert
- test/html-sanitize.test.mjs — 10 assertions; the `<scrip` partial
  and the double-decode regression are both pinned.
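
The `<scrip` regression pinned above can be illustrated with a small sketch of the two-step strip described for #1. This is an assumption about the shape of the fix, not the stripTags from search/duckduckgo.ts; it reuses the `/<[^>]+>/g` pattern quoted in the commit message for the first step and makes no claim about linear-time behaviour.

```typescript
// Hypothetical two-step strip; the real stripTags may differ in detail.
function stripTags(html: string): string {
  // Step 1: replace well-formed tags with a space.
  const withoutTags = html.replace(/<[^>]+>/g, " ");
  // Step 2: replace any "<" that survived step 1 (for example an
  // unterminated "<scrip") with a space, so no tag opener remains.
  return withoutTags.replace(/</g, " ");
}
```

For example, `stripTags("<b>bold</b>")` returns `" bold "`, and the output of `stripTags("hi <scrip")` contains no `<` character.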

116 tests total, all pass (up from 96).
@askalf askalf enabled auto-merge (squash) April 23, 2026 00:54
@askalf askalf merged commit 89af9ac into master Apr 23, 2026
4 checks passed
@askalf askalf deleted the fix/codeql-alerts branch April 23, 2026 00:54
askalf added a commit that referenced this pull request Apr 23, 2026

CodeQL flagged `baseUrl.replace(/\/+$/, "")` in doctor.ts as polynomial-
ReDoS after PR #4 landed doctor.ts. Same class as the original 7 fixed
in #3; the regex is actually benign (no nested repetition) but using
the shared trimTrailingSlashes helper from url-util.ts keeps the
pattern consistent and CodeQL's query clean.

Co-authored-by: askalf <263217947+askalf@users.noreply.github.com>