Update search engine requests and result parsing by fccview · Pull Request #17 · degoog-org/official-extensions

fccview · 2026-06-27T09:29:34Z

Summary by CodeRabbit

New Features
- Wikipedia now builds result links using the selected language.
- Safer-search behavior is applied more consistently across Bing and DuckDuckGo (including Images and News) using cookie-based state.
Bug Fixes
- Improved Ecosia result parsing and URL validation.
- Startpage now detects CAPTCHA/anti-bot pages and stops early with a clearer error.
- Reddit time filters now reliably fall back to “all” for unsupported values.
- Improved resilience when warmed requests are blocked (more reliable fallback behavior).
Chores
- Updated bundled component versions.

…nge for the better, needs testing

coderabbitai · 2026-06-27T09:29:47Z

📝 Walkthrough

Walkthrough

The PR updates Bing, DuckDuckGo, and DuckDuckGo Images requests to send safe-search state through query parameters and cookies, adds a language-derived Wikipedia host, expands Ecosia result parsing, adds User-Agent headers to several engines, adds CAPTCHA detection to Startpage, tightens Reddit time mapping, updates lolcat-4play cache and warmup keying, changes warmed fetch fallback behavior, and bumps engine versions in package.json.
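The Reddit time mapping mentioned in the walkthrough can be pictured as an allowlist lookup. This is a minimal sketch, not the PR's code: the helper name `redditTime` is hypothetical, and it assumes the engine's time filter values are the same strings as Reddit's `t` parameter values.

```javascript
// Hypothetical helper (not in the PR). Values accepted by Reddit's `t`
// query parameter; anything else falls back to "all".
const REDDIT_TIME_VALUES = new Set(["hour", "day", "week", "month", "year", "all"]);

function redditTime(value) {
  return REDDIT_TIME_VALUES.has(value) ? value : "all";
}

console.log(redditTime("week"));   // supported value is passed through
console.log(redditTime("decade")); // unsupported value falls back to "all"
```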

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is concise and accurately reflects the main changes to search engine requests and result handling across the PR.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@engines/bing-videos/index.js`:
- Around line 60-61: Normalize safeSearch in the Bing video request builder
before using it, so invalid or persisted values do not diverge between the query
string and cookie. Update the logic in the request path that uses
this.safeSearch and adltCookie to first map safeSearch to the same
strict/moderate/off values used by the other Bing engines, then use that
normalized value consistently for both adlt and SRCHHPGUSR in the BingVideos
implementation.

In `@engines/duckduckgo-images/index.js`:
- Around line 103-114: The _region() helper in
engines/duckduckgo-images/index.js is generating invalid DuckDuckGo region
values like en-en when context.lang is English. Update _region() to special-case
English or map context.lang through a supported DDG country-language pair before
_headers() uses it for the l parameter and Cookie, while keeping the existing
fallback behavior for missing context.lang.

In `@engines/duckduckgo-news/index.js`:
- Around line 61-68: The safe-search mapping in the duckduckgo-news engine is
collapsing off and moderate to the same value, so users choosing off still send
the moderate setting. Update the mapping used in the request builder so the
equivalent of safeSearch handling in this engine distinguishes off from
moderate, and apply the same correction anywhere the `p` cookie/query parameter
is assembled in the affected block(s). Use the `safeMap`, `safeSearch`, and
request header/query construction logic to ensure `off` no longer resolves to
the moderate value.

In `@engines/duckduckgo/index.js`:
- Line 5: Add the explicit off safe-search state to SAFE_SEARCH_MAP so
DuckDuckGo requests can propagate the default setting correctly. Update the
mapping used by the safe-search resolution in the engine’s main request-building
flow so safeSearch = "off" resolves to a valid value instead of undefined,
ensuring the kp parameter and Cookie are set consistently in the logic around
the request builder and related safe-search handling paths.

In `@engines/ecosia/index.js`:
- Around line 75-76: The hostname check in the Ecosia URL handling is too loose
because `endsWith("ecosia.org")` also matches unrelated domains. Update the
guard in the `new URL(href)` / `parsed.hostname` logic to verify the registrable
domain boundary for Ecosia instead of a simple suffix match, so only true
`ecosia.org` hosts are skipped.

In `@engines/hacker-news/index.js`:
- Around line 53-56: The Hacker News request headers are using
context?.userAgent?.() with nullish coalescing, so an empty string still gets
sent as User-Agent instead of falling back. Update the header selection in the
request builder to treat blank results from context.userAgent() as missing, and
use the default UA when the returned value is empty; keep the change localized
to the headers block in the HN fetch path.

In `@engines/internet-archive/index.js`:
- Around line 38-42: The dynamic User-Agent fallback in the Internet Archive
fetch setup should use a truthy check instead of preserving empty strings.
Update the headers construction in the doFetch call inside the internet-archive
handler so the context?.userAgent?.() value only wins when it is non-empty, and
otherwise the default Mozilla-style string is used.

In `@engines/nasa-images/index.js`:
- Line 31: The NASA images request header currently uses `context?.userAgent?.()
?? "Mozilla/5.0 (compatible; degoog/1.0)"`, which still passes through an empty
string from `context.userAgent()`. Update the UA fallback logic in
`engines/nasa-images/index.js` so the default value is used when
`context.userAgent()` returns a blank string as well as when it is nullish,
keeping the fallback behavior in the same header construction site.

In `@engines/openverse/index.js`:
- Line 50: The Openverse header builder is allowing an empty User-Agent through
because `context.userAgent()` is only guarded with `??`, so `""` bypasses the
fallback. Update the logic in `engines/openverse/index.js` to normalize blank
values before constructing the headers, ensuring the `User-Agent` falls back to
the default when `context.userAgent()` returns an empty string.

In `@engines/the-guardian/index.js`:
- Around line 94-97: The User-Agent fallback in the request headers is not
applied when context.userAgent() returns an empty string because the current
nullish check treats it as valid. Update the header construction in the Guardian
request code to use a truthy fallback so empty UA strings also resolve to the
default Mozilla-compatible value, and keep the logic centered around the
existing context.userAgent() call in the headers block.

In `@engines/wikipedia/index.js`:
- Around line 6-12: The host selection in `_wikiHost` is too restrictive because
`LANG_RE` only accepts 2–3 letter codes, so valid Wikipedia identifiers like
`simple` or locale-tagged values such as `pt-BR` fall back to English. Update
the language normalization logic in `engines/wikipedia/index.js` to map
`context.lang` to a supported Wikipedia host code using an allowlist or explicit
normalization step, and make `_wikiHost` use that normalized value so these
inputs resolve to the correct Wikipedia domain instead of `en.wikipedia.org`.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b9c369de-a493-48a7-b18a-3716349bc72f

📥 Commits

Reviewing files that changed from the base of the PR and between 04bfd13 and 267d88d.

📒 Files selected for processing (16)

engines/bing-images/index.js
engines/bing-videos/index.js
engines/bing/index.js
engines/duckduckgo-images/index.js
engines/duckduckgo-news/index.js
engines/duckduckgo/index.js
engines/ecosia/index.js
engines/hacker-news/index.js
engines/internet-archive/index.js
engines/nasa-images/index.js
engines/openverse/index.js
engines/reddit/index.js
engines/startpage/index.js
engines/the-guardian/index.js
engines/wikipedia/index.js
package.json

coderabbitai · 2026-06-27T09:36:33Z

    if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
+    const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Normalize safeSearch before building the Bing request.

configure() accepts any string, but this branch sends this.safeSearch straight into adlt while the cookie path falls back unknown values to OFF. That makes the two signals diverge for invalid persisted/programmatic values (adlt=foo + SRCHHPGUSR=ADLT=OFF), unlike engines/bing/index.js and engines/bing-images/index.js, which normalize first.

Proposed fix

-    if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
-    const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";
+    const adlt =
+      this.safeSearch === "strict" || this.safeSearch === "moderate"
+        ? this.safeSearch
+        : "off";
+    if (adlt !== "off") url += `&adlt=${adlt}`;
+    const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[adlt] ?? "OFF";
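The same idea can be expressed as one helper that derives both signals from a single normalized level. This is a minimal sketch under assumptions: `bingSafeSearch` is a hypothetical name that does not appear in the PR, and the `adlt` parameter and `SRCHHPGUSR=ADLT=` cookie fragment are taken from the comment above rather than from Bing documentation.

```javascript
// Hypothetical helper (not in the PR): normalize once, derive both signals.
const ADLT_COOKIE = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" };

function bingSafeSearch(value) {
  // Anything other than the three supported levels collapses to "off".
  const level = Object.hasOwn(ADLT_COOKIE, value) ? value : "off";
  return {
    level,
    // Query-string fragment; empty when safe search is off.
    query: level === "off" ? "" : `&adlt=${level}`,
    // Cookie fragment in the SRCHHPGUSR=ADLT=... form cited in the review.
    cookie: `SRCHHPGUSR=ADLT=${ADLT_COOKIE[level]}`,
  };
}

console.log(bingSafeSearch("strict"));
console.log(bingSafeSearch("foo")); // invalid value: query and cookie both mean "off"
```

Because both fragments come from `level`, an invalid persisted value can no longer produce `adlt=foo` next to `ADLT=OFF`.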


coderabbitai · 2026-06-27T09:36:33Z

+  _region(context) {
+    return context?.lang ? `${context.lang}-${context.lang}` : "us-en";
+  }
+
+  _headers(context, safe, region) {
+    return {
+      "User-Agent": context?.userAgent?.() ?? FALLBACK_UA,
      Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
      "Accept-Language": context?.buildAcceptLanguage?.() ?? "en-US,en;q=0.9",
      "Accept-Encoding": "gzip, deflate, br",
+      Cookie: `p=${safe}; ah=${region}; l=${region}`,
    };


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how context.lang is populated and whether a region/locale helper already exists.
rg -n -C3 '\bcontext\?\.lang\b|\bcontext\.lang\b|buildAcceptLanguage|kl|region' .

Repository: degoog-org/official-extensions

Length of output: 28283

🏁 Script executed:

#!/bin/bash
set -euo pipefail
ast-grep outline engines/duckduckgo-images/index.js --view expanded
printf '\n--- duckduckgo-news ---\n'
ast-grep outline engines/duckduckgo-news/index.js --view expanded
printf '\n--- related lang/region helpers ---\n'
rg -n -C2 '\b_buildKl\b|\bcontext\.lang\b|\blang === "en"\b|\bus-en\b|\ben-en\b|\bkl\b|\bl=\b' engines/duckduckgo*.js autocomplete/*.js engines/*/*.js

Repository: degoog-org/official-extensions

Length of output: 4453

🌐 Web query:

DuckDuckGo images region parameter kl values en-en us-en documentation accepted region codes

💡 Result:

DuckDuckGo uses the kl (region) URL parameter to specify the region and language settings for search queries, including image searches [1][2][3]. For standard web searches, you append the kl parameter to the URL, for example: https://duckduckgo.com/?q=search&kl=us-en [1][2]. For image searches the parameter is still kl in the broader search system [3], although some third-party API implementations or wrappers refer to this setting as locale [4] or expose it through their own region argument [5].

The kl value follows a specific format: a two-letter country code followed by a hyphen and a two-letter language code (e.g., us-en for United States English) [1][6][7]. The setting wt-wt is commonly used to represent "no region" (worldwide) [3][8][7].

Accepted region codes

DuckDuckGo supports a wide range of regional codes. A selection of commonly used codes [7]:
- us-en: United States (English)
- uk-en: United Kingdom (English)
- ca-en: Canada (English)
- ca-fr: Canada (French)
- fr-fr: France (French)
- de-de: Germany (German)
- es-es: Spain (Spanish)
- jp-jp: Japan (Japanese)
- wt-wt: No region/Worldwide

A comprehensive list of supported region codes can be found in technical documentation repositories, such as those maintained by search API providers [6][8][7]. Note that while the kl parameter is the standard way to set regions in the URL, some automated scraping tools may require additional parameters like vqd (Validation Query Digest) to successfully query DuckDuckGo's internal endpoints (e.g., i.js for images) [3][9].

Citations:

1: https://duckduckgo.com/duckduckgo-help-pages/settings/params/

2: https://duck.com/params

3: https://docs.searxng.org/dev/engines/online/duckduckgo.html

4: https://www.searchapi.io/docs/duckduckgo-images-api

5: https://pypi.org/project/duckduckgo-search/8.1.0/

6: https://serpapi.com/duckduckgo-regions

7: https://developer.novada.com/novada/advanced-proxy-solutions/scraper-api-original/parametric/supported-duckduckgo-regions.md

8: https://docs.scrapingdog.com/duckduckgo-scraper-api/duckduckgo-supported-regions

9: https://github.com/ryanhlewis/DuckDuckGo-Image-Scraper

Avoid emitting en-en for DuckDuckGo Images.

DuckDuckGo region values are country-language pairs like us-en; _region() currently turns context.lang === "en" into en-en and sends it through both l and the cookie. Special-case English or map context.lang to a supported DDG region first.
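One way to map `context.lang` to a supported region is a small lookup table. This is a minimal sketch, not the PR's implementation: `ddgRegion` and `DDG_REGION_BY_LANG` are hypothetical names, the table only contains pairs listed in the web-query result above, and falling back to `wt-wt` for unknown languages is an assumption.

```javascript
// Hypothetical lookup (not in the PR): language code -> DuckDuckGo region.
// Only pairs listed in the web-query result above are included.
const DDG_REGION_BY_LANG = new Map([
  ["en", "us-en"],
  ["fr", "fr-fr"],
  ["de", "de-de"],
  ["es", "es-es"],
  ["ja", "jp-jp"],
]);

function ddgRegion(context) {
  const lang = context?.lang;
  // Existing behavior for a missing language is kept.
  if (!lang) return "us-en";
  // Assumption: unknown languages use "wt-wt" (no region) instead of
  // synthesizing an invalid pair such as "en-en".
  return DDG_REGION_BY_LANG.get(String(lang).toLowerCase()) ?? "wt-wt";
}

console.log(ddgRegion({ lang: "en" })); // "us-en" rather than "en-en"
console.log(ddgRegion({}));             // "us-en" (missing language)
```

A `Map` is used instead of a plain object so that inherited keys such as `constructor` never resolve to a value.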


coderabbitai · 2026-06-27T09:36:33Z

+    const safeMap = { off: "-1", moderate: "-1", strict: "1" };
+    const safe = safeMap[this.safeSearch] ?? "-1";
    const headers = {
      "User-Agent": ua,
      Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
      "Accept-Language": acceptLang,
      "Accept-Encoding": "gzip, deflate, br",
+      Cookie: `p=${safe}`,


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Don’t map off to the moderate safe-search value.

Line 61 makes off and moderate both resolve to "-1", so users selecting Safe Search off still send the moderate value in both Cookie: p=... and the p query parameter. This also diverges from the images engine’s SAFE_OFF = "-2" / SAFE_MODERATE = "-1" mapping.

Proposed fix

-    const safeMap = { off: "-1", moderate: "-1", strict: "1" };
+    const safeMap = { off: "-2", moderate: "-1", strict: "1" };

Also applies to: 81-88
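A collapsed mapping like this is easy to miss in review, so a small guard can catch it. This is a minimal sketch: `assertDistinctLevels` is a hypothetical helper that is not part of the PR, and the `q` parameter in the query string is only illustrative.

```javascript
// Hypothetical guard (not in the PR): fail fast when two safe-search levels
// share one wire value, which is the bug flagged in this comment.
function assertDistinctLevels(map) {
  const seen = new Map();
  for (const [level, wire] of Object.entries(map)) {
    if (seen.has(wire)) {
      throw new Error(`"${level}" and "${seen.get(wire)}" both map to ${wire}`);
    }
    seen.set(wire, level);
  }
  return true;
}

// Mapping from the proposed fix.
const safeMap = { off: "-2", moderate: "-1", strict: "1" };
assertDistinctLevels(safeMap);

// Cookie and query parameter are built from the same resolved value.
const safe = safeMap["off"] ?? "-1";
const cookie = `p=${safe}`;
const query = new URLSearchParams({ q: "example", p: safe }).toString();
console.log(cookie, query);
```

Running the guard against the mapping in the diff above (`off` and `moderate` both `"-1"`) throws, which is the behavior the comment asks to eliminate.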


coderabbitai · 2026-06-27T09:36:33Z

 const FALLBACK_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36";

-const SAFE_SEARCH_MAP = { moderate: "-2", strict: "1" };
+const SAFE_SEARCH_MAP = { moderate: "-1", strict: "1" };


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Send the explicit off safe-search state too.

safeSearch = "off" is a supported option, but it is missing from SAFE_SEARCH_MAP, so Line 46 resolves safe to undefined and Lines 50/69 omit both kp and Cookie. Add the off mapping so the default setting is actually propagated.

Proposed fix

-const SAFE_SEARCH_MAP = { moderate: "-1", strict: "1" };
+const SAFE_SEARCH_MAP = { off: "-2", moderate: "-1", strict: "1" };

Also applies to: 46-50, 69-69
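With `off` in the map, the resolution step always yields a defined value, so the query parameter and cookie can be emitted unconditionally. This is a minimal sketch: `ddgSafeSearch` is a hypothetical helper, the cookie name `kp` is an assumption (the review only says a Cookie is set), and treating unsupported values as `off` is also an assumption.

```javascript
// Mapping from the proposed fix.
const SAFE_SEARCH_MAP = { off: "-2", moderate: "-1", strict: "1" };

// Hypothetical helper (not in the PR): returns the kp value and a matching
// Cookie header value.
function ddgSafeSearch(safeSearch) {
  // Assumption: unsupported values are treated as "off".
  const kp = Object.hasOwn(SAFE_SEARCH_MAP, safeSearch)
    ? SAFE_SEARCH_MAP[safeSearch]
    : SAFE_SEARCH_MAP.off;
  // Assumption: the cookie uses the same name as the query parameter.
  return { kp, cookie: `kp=${kp}` };
}

console.log(ddgSafeSearch("off"));    // kp is "-2", no longer undefined
console.log(ddgSafeSearch("strict")); // kp is "1"
```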


coderabbitai · 2026-06-27T09:36:33Z

          const parsed = new URL(href);
-          if (parsed.hostname === "www.ecosia.org") return;
+          if (parsed.hostname.endsWith("ecosia.org")) return;


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Tighten the Ecosia hostname guard.

endsWith("ecosia.org") also matches unrelated domains like notecosia.org, so this can drop legitimate search results. Match the registrable domain boundary instead.

Suggested fix

-          if (parsed.hostname.endsWith("ecosia.org")) return;
+          if (parsed.hostname === "ecosia.org" || parsed.hostname.endsWith(".ecosia.org")) return;
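The difference between the two checks is easiest to see with a few hostnames. This is a minimal sketch: `isEcosiaHost` and `isExternalResult` are hypothetical names, not functions from the engine.

```javascript
// Hypothetical helpers (not in the PR) illustrating the domain-boundary check.
function isEcosiaHost(hostname) {
  const host = String(hostname).toLowerCase();
  // Exact apex match, or a subdomain separated by a dot.
  return host === "ecosia.org" || host.endsWith(".ecosia.org");
}

function isExternalResult(href) {
  try {
    return !isEcosiaHost(new URL(href).hostname);
  } catch {
    return false; // not an absolute URL, so not a usable result link
  }
}

console.log(isEcosiaHost("www.ecosia.org")); // Ecosia's own host, skipped
console.log(isEcosiaHost("notecosia.org"));  // unrelated domain, kept
console.log("notecosia.org".endsWith("ecosia.org")); // the loose check matches here
```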


coderabbitai · 2026-06-27T09:36:33Z

    const response = await doFetch(url, {
      headers: {
-        "User-Agent": "Mozilla/5.0 (compatible; degoog/1.0)",
+        "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
        Accept: "application/json",
      },


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Use a truthy fallback for the dynamic UA.

Line 40 preserves "" because it uses ??, so a blank upstream UA still results in an empty User-Agent header instead of the fallback string.

Suggested fix

-        "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+        "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",
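Since the same fallback appears in several engines, the difference between `??` and `||` is worth spelling out once. This is a minimal sketch: `resolveUserAgent` is a hypothetical shared helper that the PR does not define, and treating whitespace-only values as blank goes slightly beyond the suggested `||` fix.

```javascript
const DEFAULT_UA = "Mozilla/5.0 (compatible; degoog/1.0)";

// Hypothetical shared helper (not in the PR).
function resolveUserAgent(context) {
  const ua = context?.userAgent?.();
  // Assumption: whitespace-only values are treated like empty ones.
  return typeof ua === "string" && ua.trim() ? ua : DEFAULT_UA;
}

const blank = { userAgent: () => "" };

// `??` only falls back on null/undefined, so "" passes through.
console.log(JSON.stringify(blank.userAgent() ?? DEFAULT_UA));
// `||` falls back on any falsy value, including "".
console.log(JSON.stringify(blank.userAgent() || DEFAULT_UA));
// The helper also covers a missing context or a missing userAgent function.
console.log(resolveUserAgent(undefined));
```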


coderabbitai · 2026-06-27T09:36:34Z

        headers: {
          Accept: "application/json",
          "Accept-Language": context?.buildAcceptLanguage?.() ?? "en,en-US;q=0.9",
+          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Blank upstream UAs still bypass the fallback here.

Line 31 uses ??, so context.userAgent() returning "" sends an empty header value instead of the default UA.

Suggested fix

-          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+          "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",


coderabbitai · 2026-06-27T09:36:34Z

        headers: {
          Accept: "application/json",
          "Accept-Language": context?.buildAcceptLanguage?.() ?? "en,en-US;q=0.9",
+          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Normalize empty UA values before building headers.

Line 50 uses ??, which means "" from context.userAgent() skips the fallback and leaves the request with a blank User-Agent.

Suggested fix

-          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+          "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",


coderabbitai · 2026-06-27T09:36:34Z

+        headers: {
+          Accept: "application/json",
+          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+        },


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

The fallback does not cover empty UA strings.

Line 96 uses ??, so an empty string from context.userAgent() is treated as valid and the request still goes out without the intended fallback UA.

Suggested fix

-          "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+          "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",


coderabbitai · 2026-06-27T09:36:34Z

+const LANG_RE = /^[a-z]{2,3}$/;
+
+const _wikiHost = (lang) => {
+  if (lang && LANG_RE.test(lang)) return `${lang}.wikipedia.org`;
+  return "en.wikipedia.org";
+};
+


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect the repository contract for context.lang and look for locale-style values.
rg -n -C2 --type=js '\bcontext\??\.lang\b' .
rg -n -C2 --type=js '\blang\b' engines

Repository: degoog-org/official-extensions

Length of output: 22016

LANG_RE excludes valid Wikipedia language hosts.

simple and locale-tagged inputs like pt-BR fall back to en.wikipedia.org, so language-aware routing is lost for those cases. Normalize context.lang to a Wikipedia host code or use an allowlist instead of this regex.
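One possible normalization step tries the full tag first and then its primary subtag against an allowlist. This is a minimal sketch: `wikiHost` and `WIKI_HOST_CODES` are hypothetical names, and the allowlist is deliberately short and illustrative rather than the full set of Wikipedia editions.

```javascript
// Hypothetical allowlist (not in the PR); a real one would list every
// Wikipedia edition the project wants to support.
const WIKI_HOST_CODES = new Set(["en", "simple", "de", "fr", "es", "pt", "it", "ja"]);

function wikiHost(lang) {
  // Normalize case and separator: "pt_BR" and "pt-BR" both become "pt-br".
  const tag = String(lang ?? "").trim().toLowerCase().replace(/_/g, "-");
  // Try the full tag first ("simple"), then the primary subtag ("pt-br" -> "pt").
  for (const code of [tag, tag.split("-")[0]]) {
    if (WIKI_HOST_CODES.has(code)) return `${code}.wikipedia.org`;
  }
  return "en.wikipedia.org";
}

console.log(wikiHost("simple")); // resolves to the Simple English edition
console.log(wikiHost("pt-BR"));  // resolves to the Portuguese edition
console.log(wikiHost("xx"));     // unknown code falls back to English
```

Because only allowlisted codes are interpolated into the hostname, arbitrary input cannot change the domain being requested.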


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@transports/lolcat-4play/index.js`:
- Around line 394-420: The `_loadCookieJar` path is preferring the shared
`_cookieCache` over the fresher in-memory jar, which can let stale session state
override recent writes from `_persistCookieJar`. Update `_loadCookieJar` to
check `_cookieJarTextSessions` first for the `memKey` and return that value when
present, falling back to `_cookieCache` only if no in-memory jar exists. Keep
the existing error handling and use the same `cookieJarKeyFor`, `_memKey`, and
`_cacheKey` symbols so the live cookie state always wins within the process.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b20920d4-315d-44fb-a79c-9a076350aeef

📥 Commits

Reviewing files that changed from the base of the PR and between 267d88d and eb3f45c.

📒 Files selected for processing (2)

package.json
transports/lolcat-4play/index.js

🚧 Files skipped from review as they are similar to previous changes (1)

package.json

coderabbitai · 2026-06-28T15:18:02Z

  async _loadCookieJar(origin, containerId) {
-    const key = cookieJarKeyFor(origin, this._containerConfigKey || "default");
+    const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
+    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
    if (this._cookieCache) {
      try {
-        const cached = await this._cookieCache.get(key);
+        const cached = await this._cookieCache.get(cacheKey);
        if (cached) return cached;
      } catch (error) {
        console.warn(
          `[lolcat-4play] failed to read cookie jar for ${origin}: ${error?.message || error}`,
        );
      }
    }
-    return this._cookieJarTextSessions.get(key) || null;
+    return this._cookieJarTextSessions.get(memKey) || null;
  }

  async _loadSessionFromCache(origin, containerId) {
-    const key = cookieJarKeyFor(origin, this._containerConfigKey || "default");
-    const warmupKey = warmupKeyFor(origin, this._containerConfigKey || "default");
+    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+    const warmupKey = warmupKeyFor(origin, this._memKey(containerId));
    if (this._browserHeaderSessions.has(warmupKey)) {
      return;
    }

    if (this._cookieCache) {
      try {
-        const cachedCookies = await this._cookieCache.get(key);
-        const cachedHeadersJson = await this._cookieCache.get(key + ":headers");
+        const cachedCookies = await this._cookieCache.get(cacheKey);
+        const cachedHeadersJson = await this._cookieCache.get(cacheKey + ":headers");


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Prefer the live in-memory cookie jar before shared cache.

_persistCookieJar updates _cookieJarTextSessions immediately, but shared cache writes are async and can fail. Since _loadCookieJar reads _cookieCache first, a stale cached jar can override fresher in-process cookies and keep warmed curl requests using old session state.
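The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the transport's actual code: `FlakyCache`, `JarStore`, `persist`, `loadCacheFirst`, `loadMemoryFirst`, and `demo` are hypothetical names, and a `Map`-backed object whose writes can be made to fail stands in for the shared cache.

```javascript
// Hypothetical stand-in for a shared cache whose writes can fail.
class FlakyCache {
  constructor() {
    this.store = new Map();
    this.failWrites = false;
  }
  async get(key) {
    return this.store.get(key) ?? null;
  }
  async set(key, value) {
    if (this.failWrites) throw new Error("cache unavailable");
    this.store.set(key, value);
  }
}

// Hypothetical jar store: memory is updated synchronously, the cache best-effort.
class JarStore {
  constructor(cache) {
    this.cache = cache;
    this.memory = new Map();
  }
  async persist(key, jar) {
    this.memory.set(key, jar); // live state updates immediately
    try {
      await this.cache.set(key, jar);
    } catch {
      // a failed shared write leaves the older cached jar in place
    }
  }
  // Cache-first lookup: a stale cached jar can shadow the live one.
  async loadCacheFirst(key) {
    const cached = await this.cache.get(key);
    if (cached) return cached;
    return this.memory.get(key) || null;
  }
  // Memory-first lookup: the live jar wins; the cache is only a fallback.
  async loadMemoryFirst(key) {
    const inMemory = this.memory.get(key);
    if (typeof inMemory === "string") return inMemory;
    const cached = await this.cache.get(key);
    if (cached) {
      this.memory.set(key, cached); // hydrate memory from the cache
      return cached;
    }
    return null;
  }
}

// Persist once successfully, then once with cache writes failing.
async function demo() {
  const cache = new FlakyCache();
  const store = new JarStore(cache);
  await store.persist("example.test", "session=old");
  cache.failWrites = true;
  await store.persist("example.test", "session=new");
  return {
    cacheFirst: await store.loadCacheFirst("example.test"),
    memoryFirst: await store.loadMemoryFirst("example.test"),
  };
}
```

Under these assumptions, `demo()` resolves with `cacheFirst` set to `session=old` and `memoryFirst` set to `session=new`: the cache-first path keeps serving the jar from before the failed write.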

Proposed fix

   async _loadCookieJar(origin, containerId) {
     const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
     const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+    const inMemory = this._cookieJarTextSessions.get(memKey);
+    if (typeof inMemory === "string") return inMemory;
+
     if (this._cookieCache) {
       try {
         const cached = await this._cookieCache.get(cacheKey);
-        if (cached) return cached;
+        if (cached) {
+          this._cookieJarTextSessions.set(memKey, cached);
+          return cached;
+        }
       } catch (error) {
         console.warn(
           `[lolcat-4play] failed to read cookie jar for ${origin}: ${error?.message || error}`,
         );
       }
     }
-    return this._cookieJarTextSessions.get(memKey) || null;
+    return null;
   }

   async _loadSessionFromCache(origin, containerId) {
     const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+    const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
     const warmupKey = warmupKeyFor(origin, this._memKey(containerId));
     if (this._browserHeaderSessions.has(warmupKey)) {
       return;
@@
       };
       this._browserHeaderSessions.set(warmupKey, session);
+      this._cookieJarTextSessions.set(memKey, cachedCookies);
       this._setWarmupState(origin, containerId, { warmedAt: Date.now() });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

  async _loadCookieJar(origin, containerId) {
    const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
    const inMemory = this._cookieJarTextSessions.get(memKey);
    if (typeof inMemory === "string") return inMemory;

    if (this._cookieCache) {
      try {
        const cached = await this._cookieCache.get(cacheKey);
        if (cached) {
          this._cookieJarTextSessions.set(memKey, cached);
          return cached;
        }
      } catch (error) {
        console.warn(
          `[lolcat-4play] failed to read cookie jar for ${origin}: ${error?.message || error}`,
        );
      }
    }
    return null;
  }

  async _loadSessionFromCache(origin, containerId) {
    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
    const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
    const warmupKey = warmupKeyFor(origin, this._memKey(containerId));
    if (this._browserHeaderSessions.has(warmupKey)) {
      return;
    }

    if (this._cookieCache) {
      try {
        const cachedCookies = await this._cookieCache.get(cacheKey);
        const cachedHeadersJson = await this._cookieCache.get(cacheKey + ":headers");

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@transports/lolcat-4play/index.js` around lines 394 - 420, The `_loadCookieJar` path is preferring the shared `_cookieCache` over the fresher in-memory jar, which can let stale session state override recent writes from `_persistCookieJar`. Update `_loadCookieJar` to check `_cookieJarTextSessions` first for the `memKey` and return that value when present, falling back to `_cookieCache` only if no in-memory jar exists. Keep the existing error handling and use the same `cookieJarKeyFor`, `_memKey`, and `_cacheKey` symbols so the live cookie state always wins within the process.

change all engines to standardise with latest degoog updates, big change for the better, needs testing

267d88d

coderabbitai Bot changed the title ~~@coderabbitai~~ Update search engine requests and result parsing Jun 27, 2026

coderabbitai Bot reviewed Jun 27, 2026

fix startpage issue, container id issue and tainted session problem

eb3f45c

coderabbitai Bot reviewed Jun 28, 2026

		if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
		const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";

-    if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
-    const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";
+    const adlt =
+      this.safeSearch === "strict" || this.safeSearch === "moderate"
+        ? this.safeSearch
+        : "off";
+    if (adlt !== "off") url += `&adlt=${adlt}`;
+    const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[adlt] ?? "OFF";
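A minimal sketch of the normalization in the suggested change above, pulled out into standalone functions so it can be exercised directly. `normalizeSafeSearch` and `adltFor` are hypothetical names that do not appear in the PR; the cookie mapping is copied from the snippet.

```javascript
// Hypothetical helper mirroring the suggested normalization: anything other
// than "strict" or "moderate" collapses to "off".
function normalizeSafeSearch(value) {
  return value === "strict" || value === "moderate" ? value : "off";
}

// Builds the query-string suffix and the cookie value from one normalized
// setting, so the two cannot disagree.
function adltFor(value) {
  const adlt = normalizeSafeSearch(value);
  const query = adlt !== "off" ? `&adlt=${adlt}` : "";
  const cookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[adlt] ?? "OFF";
  return { query, cookie };
}
```

Without the normalization step, an unexpected setting such as `undefined` would pass the `!== "off"` check and be appended as `&adlt=undefined`, while the cookie lookup would fall back to `OFF`. With it, both the query parameter and the cookie are derived from the same value.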

	"User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
	"User-Agent": context?.userAgent?.() \|\| "Mozilla/5.0 (compatible; degoog/1.0)",
