Conversation
…nge for the better, needs testing
📝 Walkthrough
The PR updates Bing, DuckDuckGo, and DuckDuckGo Images requests to send safe-search state through query parameters and cookies, adds a language-derived Wikipedia host, expands Ecosia result parsing, adds User-Agent headers to several engines, adds CAPTCHA detection to Startpage, tightens Reddit time mapping, updates …
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@engines/bing-videos/index.js`:
- Around line 60-61: Normalize safeSearch in the Bing video request builder
before using it, so invalid or persisted values do not diverge between the query
string and cookie. Update the logic in the request path that uses
this.safeSearch and adltCookie to first map safeSearch to the same
strict/moderate/off values used by the other Bing engines, then use that
normalized value consistently for both adlt and SRCHHPGUSR in the BingVideos
implementation.
In `@engines/duckduckgo-images/index.js`:
- Around line 103-114: The _region() helper in
engines/duckduckgo-images/index.js is generating invalid DuckDuckGo region
values like en-en when context.lang is English. Update _region() to special-case
English or map context.lang through a supported DDG country-language pair before
_headers() uses it for the l parameter and Cookie, while keeping the existing
fallback behavior for missing context.lang.
In `@engines/duckduckgo-news/index.js`:
- Around line 61-68: The safe-search mapping in the duckduckgo-news engine is
collapsing off and moderate to the same value, so users choosing off still send
the moderate setting. Update the mapping used in the request builder so the
equivalent of safeSearch handling in this engine distinguishes off from
moderate, and apply the same correction anywhere the `p` cookie/query parameter
is assembled in the affected block(s). Use the `safeMap`, `safeSearch`, and
request header/query construction logic to ensure `off` no longer resolves to
the moderate value.
In `@engines/duckduckgo/index.js`:
- Line 5: Add the explicit off safe-search state to SAFE_SEARCH_MAP so
DuckDuckGo requests can propagate the default setting correctly. Update the
mapping used by the safe-search resolution in the engine’s main request-building
flow so safeSearch = "off" resolves to a valid value instead of undefined,
ensuring the kp parameter and Cookie are set consistently in the logic around
the request builder and related safe-search handling paths.
In `@engines/ecosia/index.js`:
- Around line 75-76: The hostname check in the Ecosia URL handling is too loose
because `endsWith("ecosia.org")` also matches unrelated domains. Update the
guard in the `new URL(href)` / `parsed.hostname` logic to verify the registrable
domain boundary for Ecosia instead of a simple suffix match, so only true
`ecosia.org` hosts are skipped.
In `@engines/hacker-news/index.js`:
- Around line 53-56: The Hacker News request headers are using
context?.userAgent?.() with nullish coalescing, so an empty string still gets
sent as User-Agent instead of falling back. Update the header selection in the
request builder to treat blank results from context.userAgent() as missing, and
use the default UA when the returned value is empty; keep the change localized
to the headers block in the HN fetch path.
In `@engines/internet-archive/index.js`:
- Around line 38-42: The dynamic User-Agent fallback in the Internet Archive
fetch setup should use a truthy check instead of preserving empty strings.
Update the headers construction in the doFetch call inside the internet-archive
handler so the context?.userAgent?.() value only wins when it is non-empty, and
otherwise the default Mozilla-style string is used.
In `@engines/nasa-images/index.js`:
- Line 31: The NASA images request header currently uses `context?.userAgent?.()
?? "Mozilla/5.0 (compatible; degoog/1.0)"`, which still passes through an empty
string from `context.userAgent()`. Update the UA fallback logic in
`engines/nasa-images/index.js` so the default value is used when
`context.userAgent()` returns a blank string as well as when it is nullish,
keeping the fallback behavior in the same header construction site.
In `@engines/openverse/index.js`:
- Line 50: The Openverse header builder is allowing an empty User-Agent through
because `context.userAgent()` is only guarded with `??`, so `""` bypasses the
fallback. Update the logic in `engines/openverse/index.js` to normalize blank
values before constructing the headers, ensuring the `User-Agent` falls back to
the default when `context.userAgent()` returns an empty string.
In `@engines/the-guardian/index.js`:
- Around line 94-97: The User-Agent fallback in the request headers is not
applied when context.userAgent() returns an empty string because the current
nullish check treats it as valid. Update the header construction in the Guardian
request code to use a truthy fallback so empty UA strings also resolve to the
default Mozilla-compatible value, and keep the logic centered around the
existing context.userAgent() call in the headers block.
In `@engines/wikipedia/index.js`:
- Around line 6-12: The host selection in `_wikiHost` is too restrictive because
`LANG_RE` only accepts 2–3 letter codes, so valid Wikipedia identifiers like
`simple` or locale-tagged values such as `pt-BR` fall back to English. Update
the language normalization logic in `engines/wikipedia/index.js` to map
`context.lang` to a supported Wikipedia host code using an allowlist or explicit
normalization step, and make `_wikiHost` use that normalized value so these
inputs resolve to the correct Wikipedia domain instead of `en.wikipedia.org`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b9c369de-a493-48a7-b18a-3716349bc72f
📒 Files selected for processing (16)
- engines/bing-images/index.js
- engines/bing-videos/index.js
- engines/bing/index.js
- engines/duckduckgo-images/index.js
- engines/duckduckgo-news/index.js
- engines/duckduckgo/index.js
- engines/ecosia/index.js
- engines/hacker-news/index.js
- engines/internet-archive/index.js
- engines/nasa-images/index.js
- engines/openverse/index.js
- engines/reddit/index.js
- engines/startpage/index.js
- engines/the-guardian/index.js
- engines/wikipedia/index.js
- package.json
if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Normalize safeSearch before building the Bing request.
configure() accepts any string, but this branch sends this.safeSearch straight into adlt while the cookie path falls back unknown values to OFF. That makes the two signals diverge for invalid persisted/programmatic values (adlt=foo + SRCHHPGUSR=ADLT=OFF), unlike engines/bing/index.js and engines/bing-images/index.js, which normalize first.
Proposed fix
-   if (this.safeSearch !== "off") url += `&adlt=${this.safeSearch}`;
-   const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[this.safeSearch] ?? "OFF";
+   const adlt =
+     this.safeSearch === "strict" || this.safeSearch === "moderate"
+       ? this.safeSearch
+       : "off";
+   if (adlt !== "off") url += `&adlt=${adlt}`;
+   const adltCookie = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" }[adlt] ?? "OFF";
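The normalization described in this comment can be sketched as a standalone helper. This is only an illustration: `normalizeSafeSearch` and `buildBingSafeSearch` are hypothetical names that do not appear in the repository, and the cookie format is assumed from the `SRCHHPGUSR=ADLT=OFF` example quoted in the comment.

```javascript
// Hypothetical helpers illustrating the review comment; not the repository's code.
const ADLT_COOKIE = { strict: "STRICT", moderate: "DEMOTE", off: "OFF" };

// Collapse any unknown or missing value to "off" so both signals agree.
function normalizeSafeSearch(value) {
  return value === "strict" || value === "moderate" ? value : "off";
}

// Derive the query-string suffix and the cookie from ONE normalized value.
function buildBingSafeSearch(rawSafeSearch) {
  const adlt = normalizeSafeSearch(rawSafeSearch);
  return {
    query: adlt !== "off" ? `&adlt=${adlt}` : "",
    cookie: `SRCHHPGUSR=ADLT=${ADLT_COOKIE[adlt]}`, // assumed cookie shape
  };
}

console.log(buildBingSafeSearch("foo"));
console.log(buildBingSafeSearch("strict"));
```

Because both outputs come from the same normalized value, an invalid persisted setting such as `foo` can no longer produce `adlt=foo` in the URL alongside `ADLT=OFF` in the cookie.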
_region(context) {
  return context?.lang ? `${context.lang}-${context.lang}` : "us-en";
}

_headers(context, safe, region) {
  return {
    "User-Agent": context?.userAgent?.() ?? FALLBACK_UA,
    Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": context?.buildAcceptLanguage?.() ?? "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    Cookie: `p=${safe}; ah=${region}; l=${region}`,
  };
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify how context.lang is populated and whether a region/locale helper already exists.
rg -n -C3 '\bcontext\?\.lang\b|\bcontext\.lang\b|buildAcceptLanguage|kl|region' .
Repository: degoog-org/official-extensions
Length of output: 28283
🏁 Script executed:
#!/bin/bash
set -euo pipefail
ast-grep outline engines/duckduckgo-images/index.js --view expanded
printf '\n--- duckduckgo-news ---\n'
ast-grep outline engines/duckduckgo-news/index.js --view expanded
printf '\n--- related lang/region helpers ---\n'
rg -n -C2 '\b_buildKl\b|\bcontext\.lang\b|\blang === "en"\b|\bus-en\b|\ben-en\b|\bkl\b|\bl=\b' engines/duckduckgo*.js autocomplete/*.js engines/*/*.js
Repository: degoog-org/official-extensions
Length of output: 4453
🌐 Web query:
DuckDuckGo images region parameter kl values en-en us-en documentation accepted region codes
💡 Result:
DuckDuckGo uses the kl (keyboard language/region) URL parameter to specify the region and language settings for search queries, including image searches [1][2][3]. For standard web searches, you append the kl parameter to the URL, for example: https://duckduckgo.com/?q=search&kl=us-en [1][2]. For image searches, while the parameter is technically kl in the broader search system [3], some third-party API implementations or wrappers may refer to this setting as locale [4] or include it within their own library's region argument [5].
The kl value follows a specific format: a two-letter country code followed by a hyphen and a two-letter language code (e.g., us-en for United States English) [1][6][7]. The setting wt-wt is commonly used to represent "no region" (worldwide) [3][8][7].
Accepted Region Codes
DuckDuckGo supports a wide range of regional codes. Below is a selection of commonly used codes [7]:
- us-en: United States (English)
- uk-en: United Kingdom (English)
- ca-en: Canada (English)
- ca-fr: Canada (French)
- fr-fr: France (French)
- de-de: Germany (German)
- es-es: Spain (Spanish)
- jp-jp: Japan (Japanese)
- wt-wt: No region/Worldwide
A comprehensive list of supported region codes can be found in technical documentation repositories, such as those maintained by search API providers [6][8][7]. Note that while the kl parameter is the standard way to set regions in the URL, some automated scraping tools may require additional parameters like vqd (Validation Query Digest) to successfully query DuckDuckGo's internal endpoints (e.g., i.js for images) [3][9].
Citations:
- 1: https://duckduckgo.com/duckduckgo-help-pages/settings/params/
- 2: https://duck.com/params
- 3: https://docs.searxng.org/dev/engines/online/duckduckgo.html
- 4: https://www.searchapi.io/docs/duckduckgo-images-api
- 5: https://pypi.org/project/duckduckgo-search/8.1.0/
- 6: https://serpapi.com/duckduckgo-regions
- 7: https://developer.novada.com/novada/advanced-proxy-solutions/scraper-api-original/parametric/supported-duckduckgo-regions.md
- 8: https://docs.scrapingdog.com/duckduckgo-scraper-api/duckduckgo-supported-regions
- 9: https://github.com/ryanhlewis/DuckDuckGo-Image-Scraper
Avoid emitting en-en for DuckDuckGo Images.
DuckDuckGo region values are country-language pairs like us-en; _region() currently turns context.lang === "en" into en-en and sends it through both l and the cookie. Special-case English or map context.lang to a supported DDG region first.
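One way to picture the mapping this comment asks for is a small lookup table. The sketch below is an assumption-laden illustration: `ddgRegion` and `DDG_REGION_BY_LANG` are hypothetical names, the region codes are the ones listed in the web-query result above, and both the choice of region per language and the `wt-wt` fallback for unmapped languages are assumptions rather than project decisions.

```javascript
// Hypothetical language -> DuckDuckGo region table (not from the repository).
// Region codes come from the list quoted above; the pairing is an assumption.
const DDG_REGION_BY_LANG = {
  en: "us-en",
  fr: "fr-fr",
  de: "de-de",
  es: "es-es",
  ja: "jp-jp",
};

const DEFAULT_REGION = "us-en"; // mirrors the existing fallback for a missing context.lang

function ddgRegion(context) {
  const lang = context?.lang;
  if (!lang) return DEFAULT_REGION;
  // Use only the primary subtag so "en-US" and "en" behave the same.
  const key = String(lang).toLowerCase().split("-")[0];
  // Assumption: unmapped languages fall back to "wt-wt" (no region).
  return Object.hasOwn(DDG_REGION_BY_LANG, key) ? DDG_REGION_BY_LANG[key] : "wt-wt";
}

console.log(ddgRegion({ lang: "en" }));
console.log(ddgRegion({ lang: "fr" }));
console.log(ddgRegion(undefined));
```

A table keeps the invalid `en-en` pair from ever being constructed, at the cost of having to maintain the list of supported languages.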
const safeMap = { off: "-1", moderate: "-1", strict: "1" };
const safe = safeMap[this.safeSearch] ?? "-1";
const headers = {
  "User-Agent": ua,
  Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
  "Accept-Language": acceptLang,
  "Accept-Encoding": "gzip, deflate, br",
  Cookie: `p=${safe}`,
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don’t map off to the moderate safe-search value.
Line 61 makes off and moderate both resolve to "-1", so users selecting Safe Search off still send the moderate value in both Cookie: p=... and the p query parameter. This also diverges from the images engine’s SAFE_OFF = "-2" / SAFE_MODERATE = "-1" mapping.
Proposed fix
- const safeMap = { off: "-1", moderate: "-1", strict: "1" };
+ const safeMap = { off: "-2", moderate: "-1", strict: "1" };
Also applies to: 81-88
  const FALLBACK_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36";

- const SAFE_SEARCH_MAP = { moderate: "-2", strict: "1" };
+ const SAFE_SEARCH_MAP = { moderate: "-1", strict: "1" };
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Send the explicit off safe-search state too.
safeSearch = "off" is a supported option, but it is missing from SAFE_SEARCH_MAP, so Line 46 resolves safe to undefined and Lines 50/69 omit both kp and Cookie. Add the off mapping so the default setting is actually propagated.
Proposed fix
-const SAFE_SEARCH_MAP = { moderate: "-1", strict: "1" };
+const SAFE_SEARCH_MAP = { off: "-2", moderate: "-1", strict: "1" };
Also applies to: 46-50, 69-69
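The effect of the missing key is easy to reproduce in isolation. In this sketch, `kpParam` is a hypothetical helper (the engine's real request builder is not shown in this review), and the map values are the ones quoted in the comment.

```javascript
// Maps as quoted in the review: the current one lacks "off", the proposed one adds it.
const CURRENT_MAP = { moderate: "-1", strict: "1" };
const PROPOSED_MAP = { off: "-2", moderate: "-1", strict: "1" };

// Hypothetical helper: build the kp query fragment, or nothing when the lookup misses.
function kpParam(map, safeSearch) {
  const safe = map[safeSearch];
  return safe === undefined ? "" : `&kp=${safe}`;
}

console.log(JSON.stringify(kpParam(CURRENT_MAP, "off")));  // lookup misses, kp is omitted
console.log(JSON.stringify(kpParam(PROPOSED_MAP, "off"))); // explicit off value is sent
```

With the current map, selecting `off` silently sends no safe-search signal at all; the proposed map makes the default setting explicit.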
  const parsed = new URL(href);
- if (parsed.hostname === "www.ecosia.org") return;
+ if (parsed.hostname.endsWith("ecosia.org")) return;
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Tighten the Ecosia hostname guard.
endsWith("ecosia.org") also matches unrelated domains like notecosia.org, so this can drop legitimate search results. Match the registrable domain boundary instead.
Suggested fix
- if (parsed.hostname.endsWith("ecosia.org")) return;
+ if (parsed.hostname === "ecosia.org" || parsed.hostname.endsWith(".ecosia.org")) return;
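The difference between the loose suffix match and the boundary-aware check can be seen with a few sample URLs. `isEcosiaHost` is a hypothetical helper name used only for this illustration, and the sample URLs are made up.

```javascript
// Hypothetical helper: true only for ecosia.org itself or a real subdomain of it.
function isEcosiaHost(hostname) {
  return hostname === "ecosia.org" || hostname.endsWith(".ecosia.org");
}

const samples = [
  "https://www.ecosia.org/search?q=test",
  "https://ecosia.org/",
  "https://notecosia.org/article", // unrelated domain that shares the suffix
];

for (const href of samples) {
  const { hostname } = new URL(href);
  // Show the loose check next to the strict one for comparison.
  console.log(hostname, hostname.endsWith("ecosia.org"), isEcosiaHost(hostname));
}
```

Requiring either an exact match or a leading dot before the suffix is what keeps `notecosia.org` from being treated as an Ecosia host.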
  const response = await doFetch(url, {
    headers: {
-     "User-Agent": "Mozilla/5.0 (compatible; degoog/1.0)",
+     "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
      Accept: "application/json",
    },
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Use a truthy fallback for the dynamic UA.
Line 40 preserves "" because it uses ??, so a missing upstream UA still results in an empty User-Agent header instead of the fallback string.
Suggested fix
- "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+ "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",
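The behaviour behind this comment, and the identical User-Agent comments on the other engines, comes down to how `??` and `||` treat an empty string. The sketch below shows both operators and one possible shared helper; `pickUserAgent` is a hypothetical name, and trimming whitespace-only values goes slightly beyond what the suggested `||` fix does.

```javascript
const DEFAULT_UA = "Mozilla/5.0 (compatible; degoog/1.0)";

// "??" only falls back on null/undefined, so "" is kept as-is.
const withNullish = "" ?? DEFAULT_UA;
// "||" falls back on any falsy value, including "".
const withOr = "" || DEFAULT_UA;

// Hypothetical helper: treat missing, empty, and whitespace-only values as absent.
function pickUserAgent(context) {
  const ua = context?.userAgent?.();
  return typeof ua === "string" && ua.trim() !== "" ? ua : DEFAULT_UA;
}

console.log(JSON.stringify(withNullish));
console.log(withOr);
console.log(pickUserAgent({ userAgent: () => "" }));
console.log(pickUserAgent({ userAgent: () => "custom/2.0" }));
```

A single helper like this would let every engine share one fallback rule instead of repeating the operator choice in each headers block.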
headers: {
  Accept: "application/json",
  "Accept-Language": context?.buildAcceptLanguage?.() ?? "en,en-US;q=0.9",
  "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Blank upstream UAs still bypass the fallback here.
Line 31 uses ??, so context.userAgent() returning "" sends an empty header value instead of the default UA.
Suggested fix
- "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+ "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",
headers: {
  Accept: "application/json",
  "Accept-Language": context?.buildAcceptLanguage?.() ?? "en,en-US;q=0.9",
  "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Normalize empty UA values before building headers.
Line 50 uses ??, which means "" from context.userAgent() skips the fallback and leaves the request with a blank User-Agent.
Suggested fix
- "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+ "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",
headers: {
  Accept: "application/json",
  "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
},
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
The fallback does not cover empty UA strings.
Line 96 uses ??, so an empty string from context.userAgent() is treated as valid and the request still goes out without the intended fallback UA.
Suggested fix
- "User-Agent": context?.userAgent?.() ?? "Mozilla/5.0 (compatible; degoog/1.0)",
+ "User-Agent": context?.userAgent?.() || "Mozilla/5.0 (compatible; degoog/1.0)",
const LANG_RE = /^[a-z]{2,3}$/;

const _wikiHost = (lang) => {
  if (lang && LANG_RE.test(lang)) return `${lang}.wikipedia.org`;
  return "en.wikipedia.org";
};
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect the repository contract for context.lang and look for locale-style values.
rg -n -C2 --type=js '\bcontext\??\.lang\b' .
rg -n -C2 --type=js '\blang\b' engines
Repository: degoog-org/official-extensions
Length of output: 22016
LANG_RE excludes valid Wikipedia language hosts.
simple and locale-tagged inputs like pt-BR fall back to en.wikipedia.org, so language-aware routing is lost for those cases. Normalize context.lang to a Wikipedia host code or use an allowlist instead of this regex.
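A rough sketch of the allowlist approach follows. `wikiHost` and `WIKI_HOST_CODES` are hypothetical names, and the allowlist contents are a small illustrative sample, not the set of editions the project would actually need to support.

```javascript
// Hypothetical, deliberately short allowlist of Wikipedia host codes.
const WIKI_HOST_CODES = new Set(["en", "de", "fr", "es", "pt", "ja", "simple"]);

function wikiHost(lang) {
  if (typeof lang !== "string") return "en.wikipedia.org";
  const lower = lang.trim().toLowerCase();
  // Try the full value first ("simple"), then the primary subtag ("pt-BR" -> "pt").
  const candidates = [lower, lower.split(/[-_]/)[0]];
  const code = candidates.find((candidate) => WIKI_HOST_CODES.has(candidate));
  // Anything outside the allowlist keeps the existing English fallback.
  return `${code ?? "en"}.wikipedia.org`;
}

console.log(wikiHost("pt-BR"));
console.log(wikiHost("simple"));
console.log(wikiHost("not a language"));
```

An allowlist also means arbitrary strings from `context.lang` can never be interpolated into the hostname, which the regex previously guaranteed only for two- and three-letter codes.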
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@transports/lolcat-4play/index.js`:
- Around line 394-420: The `_loadCookieJar` path is preferring the shared
`_cookieCache` over the fresher in-memory jar, which can let stale session state
override recent writes from `_persistCookieJar`. Update `_loadCookieJar` to
check `_cookieJarTextSessions` first for the `memKey` and return that value when
present, falling back to `_cookieCache` only if no in-memory jar exists. Keep
the existing error handling and use the same `cookieJarKeyFor`, `_memKey`, and
`_cacheKey` symbols so the live cookie state always wins within the process.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b20920d4-315d-44fb-a79c-9a076350aeef
📒 Files selected for processing (2)
- package.json
- transports/lolcat-4play/index.js
🚧 Files skipped from review as they are similar to previous changes (1)
- package.json
  async _loadCookieJar(origin, containerId) {
-   const key = cookieJarKeyFor(origin, this._containerConfigKey || "default");
+   const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
+   const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
    if (this._cookieCache) {
      try {
-       const cached = await this._cookieCache.get(key);
+       const cached = await this._cookieCache.get(cacheKey);
        if (cached) return cached;
      } catch (error) {
        console.warn(
          `[lolcat-4play] failed to read cookie jar for ${origin}: ${error?.message || error}`,
        );
      }
    }
-   return this._cookieJarTextSessions.get(key) || null;
+   return this._cookieJarTextSessions.get(memKey) || null;
  }

  async _loadSessionFromCache(origin, containerId) {
-   const key = cookieJarKeyFor(origin, this._containerConfigKey || "default");
-   const warmupKey = warmupKeyFor(origin, this._containerConfigKey || "default");
+   const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+   const warmupKey = warmupKeyFor(origin, this._memKey(containerId));
    if (this._browserHeaderSessions.has(warmupKey)) {
      return;
    }

    if (this._cookieCache) {
      try {
-       const cachedCookies = await this._cookieCache.get(key);
-       const cachedHeadersJson = await this._cookieCache.get(key + ":headers");
+       const cachedCookies = await this._cookieCache.get(cacheKey);
+       const cachedHeadersJson = await this._cookieCache.get(cacheKey + ":headers");
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Prefer the live in-memory cookie jar before shared cache.
_persistCookieJar updates _cookieJarTextSessions immediately, but shared cache writes are async and can fail. Since _loadCookieJar reads _cookieCache first, a stale cached jar can override fresher in-process cookies and keep warmed curl requests using old session state.
Proposed fix
  async _loadCookieJar(origin, containerId) {
    const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+   const inMemory = this._cookieJarTextSessions.get(memKey);
+   if (typeof inMemory === "string") return inMemory;
+
    if (this._cookieCache) {
      try {
        const cached = await this._cookieCache.get(cacheKey);
-       if (cached) return cached;
+       if (cached) {
+         this._cookieJarTextSessions.set(memKey, cached);
+         return cached;
+       }
      } catch (error) {
        console.warn(
          `[lolcat-4play] failed to read cookie jar for ${origin}: ${error?.message || error}`,
        );
      }
    }
-   return this._cookieJarTextSessions.get(memKey) || null;
+   return null;
  }
  async _loadSessionFromCache(origin, containerId) {
    const cacheKey = cookieJarKeyFor(origin, this._cacheKey());
+   const memKey = cookieJarKeyFor(origin, this._memKey(containerId));
    const warmupKey = warmupKeyFor(origin, this._memKey(containerId));
    if (this._browserHeaderSessions.has(warmupKey)) {
      return;
@@
    };
    this._browserHeaderSessions.set(warmupKey, session);
+   this._cookieJarTextSessions.set(memKey, cachedCookies);
    this._setWarmupState(origin, containerId, { warmedAt: Date.now() });
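The lookup order proposed in this comment (in-memory first, shared cache second, hydrate on a cache hit) can be exercised outside the transport. The class below is a stand-in written for illustration only: `CookieJarStore`, `memory`, and `sharedCache` are hypothetical names playing the roles of the transport, `_cookieJarTextSessions`, and `_cookieCache`, and the cache is assumed to expose an async `get(key)`.

```javascript
// Stand-in for the transport; only the lookup order matters here.
class CookieJarStore {
  constructor(sharedCache) {
    this.memory = new Map();        // plays the role of _cookieJarTextSessions
    this.sharedCache = sharedCache; // plays the role of _cookieCache (async get)
  }

  async load(memKey, cacheKey) {
    const inMemory = this.memory.get(memKey);
    if (typeof inMemory === "string") return inMemory; // live state wins

    if (this.sharedCache) {
      try {
        const cached = await this.sharedCache.get(cacheKey);
        if (cached) {
          this.memory.set(memKey, cached); // hydrate so later reads skip the cache
          return cached;
        }
      } catch (error) {
        console.warn(`failed to read cookie jar: ${error?.message || error}`);
      }
    }
    return null;
  }
}

// A stale shared cache must not override a fresher in-memory jar.
async function demo() {
  const staleCache = { get: async () => "session=old" };
  const store = new CookieJarStore(staleCache);
  store.memory.set("mem", "session=new");
  return store.load("mem", "cache");
}

demo().then((jar) => console.log(jar));
```

Checking `typeof inMemory === "string"` rather than truthiness means an intentionally empty jar held in memory still takes precedence over whatever the shared cache contains.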
Summary by CodeRabbit