feat: fetch chain config from remote URL and remove HTML scraping#2709
feat: fetch chain config from remote URL and remove HTML scraping#2709
Conversation
Removes bundled chain files (chains.json, extra-chains.json, sourcify-chains-default.json) and replaces them with a remote fetch from a URL configured via config.chains.remoteUrl. This decouples the server from chain data, enabling the separate sourcifyeth/sourcify-chains repo to own chain generation and trigger server restarts on updates. Loading priority: 1. src/sourcify-chains.json (self-hosted override, unchanged) 2. fetch from config.chains.remoteUrl at startup (3 retries, 3s delay) Changes: - Remove chains.json, extra-chains.json, sourcify-chains-default.json - Add config.chains.remoteUrl setting pointing to sourcifyeth/sourcify-chains - Refactor sourcify-chains.ts: export async initializeSourcifyChains() and a mutable sourcifyChainsMap populated at startup; pass etherscanApi to SourcifyChain constructor so /chains and Etherscan key lookup work - Wrap CLI startup in async IIFE; await initializeSourcifyChains() so getEtherscanApiKeyForEachChain() sees the populated chain map - ServerFixture: use LOCAL_CHAINS as default so tests don't need remote URL - chain-tests.spec.ts: replace static JSON import with async initializeSourcifyChains() + sourcifyChainsMap; use Mocha --delay - Add --delay to test:chains mocha invocation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After switching to async chain loading, sourcifyChainsMap is empty until initializeSourcifyChains() is called. Tests that access the map during registration (it() description strings) or during test execution now fail with "Cannot read properties of undefined". Fixes: - ServerFixture.before(): call initializeSourcifyChains() so all integration tests have the full chain map populated at runtime - etherscan.spec.ts: replace sourcifyChainsMap[id].name in it() description strings with testChainId (map is empty at registration time) - contract-creation-util.spec.ts: add before() calling initializeSourcifyChains() (test doesn't use ServerFixture) - verificationWorker.spec.ts: same — add initializeSourcifyChains() before the sourcifyChainsMap-based stub setup Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@claude review |
|
Claude finished @kuzdogan's task in 3m 4s —— View job PR Review
Overall this is a well-motivated change — removing 57k lines of bundled JSON and the fragile HTML-scraping infrastructure is a clear win. The implementation is mostly clean but there are a few items worth addressing. Issues1. sourcifyChainsMap[chainId] = new SourcifyChain({ ... }); // chainId is numbervs. line 282: sourcifyChainsMap[chain.chainId.toString()] = chain; // explicit string
2. Unusual TypeScript non-null assertion pattern ( if (!chainsExtensions!) {The let chainsExtensions: SourcifyChainsExtensionsObjectWithHeaderEnvName | undefined;
// ... fetch loop ...
if (!chainsExtensions) {
throw new Error(...);
}3. Lost chain metadata from The old code merged new SourcifyChain({
name: extension.sourcifyName,
chainId,
supported,
rpcs,
etherscanApi,
fetchContractCreationTxUsing,
});Fields like 4. PR description vs. actual The PR description says "ServerFixture uses chains: fixtureOptions_?.chains || sourcifyChainsMap,And 5. No startup resilience / offline fallback Previously the server could start offline (chains were bundled). Now if the remote URL is unreachable after 3 retries, the server exits with a fatal error. This changes the operational profile significantly. Worth documenting as a known trade-off, and potentially worth adding a config option like 6. No integrity check on remote config The fetched JSON is parsed and used directly. If the GitHub raw URL is ever served with unexpected content (CDN misconfiguration, compromised file, etc.), there's no schema validation before Minor notes
What's working well
|
Fe v26.0.0-alpha.12 is dynamically linked against libssl.so.3, which is absent from the node:bookworm-slim base image. Add libssl3 to the production stage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hains Chains in additional-chains.json had no rpc entries in the generated sourcify-chains-default.json (fixed in sourcifyeth/sourcify-chains#50), causing SourcifyChain constructor to throw "No RPC provider was given" and crash server startup. Add a defensive check: if a supported chain ends up with zero usable RPCs after buildCustomRpcs() (e.g. all entries are APIKeyRPC and keys are absent in CI), log a warning and skip it rather than throwing. This is consistent with buildCustomRpcs() itself already silently skipping RPCs with missing API keys in non-production environments. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chain 335 is in deprecated-chains.json and no longer has an active RPC or Blockscout explorer. The blockscoutApi creator tx lookup is already covered by the chain 40 (Telos) test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers the previously-untested code paths in the new async chain config loader, bringing patch coverage from 66.67% toward the 70% target: - Local sourcify-chains.json override: loads chains, skips remote fetch - Remote fetch: happy path, missing remoteUrl config, HTTP error (404), retry on failure (succeeds on 2nd attempt), all retries exhausted - Chain building: map cleared on re-init, supported chains with no RPCs skipped, deprecated (supported:false) chains with no RPCs included, re-initialization replaces stale entries Uses sinon fake timers to avoid real 3 s retry delays in retry tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test file added in the previous commit wasn't run through prettier because .husky/pre-commit was stored in git as 100644 (non-executable), causing git to silently skip it. Fix both: - Run prettier --write on sourcify-chains.spec.ts to pass CI lint check - chmod +x / git update-index --chmod=+x on .husky/pre-commit so the hook actually runs on future commits Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR refactors chain configuration in the server to be initialized at startup from a remote sourcify-chains JSON URL (with an optional local override file), and removes the legacy Blockscout/Etherscan HTML scraping paths for contract creation transaction lookup.
Changes:
- Introduces
initializeSourcifyChains()and updates server/test entrypoints to initializesourcifyChainsMapasynchronously instead of importing bundled JSON. - Removes Blockscout/Etherscan scraping-based creator-tx lookup and associated types/tests.
- Updates chain test harness to use Mocha
--delayto allow async chain config initialization before test registration.
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| services/server/src/sourcify-chains.ts | Replaces bundled chain JSON imports with async initialization from local override or remote URL. |
| services/server/src/config/default.js | Adds chains.remoteUrl default pointing at the published remote chain list. |
| services/server/src/server/cli.ts | Makes CLI startup async to initialize chains before constructing the server and dependent services. |
| services/server/src/server/services/utils/contract-creation-util.ts | Removes scraping-based creator-tx lookup and leaves API-based methods. |
| packages/lib-sourcify/src/SourcifyChain/SourcifyChainTypes.ts | Removes blockscoutScrape / etherscanScrape from the supported method types. |
| services/server/test/unit/sourcify-chains.spec.ts | Adds unit tests covering remote fetch, retries, and local override behavior for chain initialization. |
| services/server/test/unit/verificationWorker.spec.ts | Updates tests to initialize chains before using sourcifyChainsMap. |
| services/server/test/unit/utils/contract-creation-util.spec.ts | Updates tests to initialize chains before building repositories from sourcifyChainsMap. |
| services/server/test/helpers/ServerFixture.ts | Initializes chains during fixture setup so server tests can use populated sourcifyChainsMap. |
| services/server/test/integration/apiv1/verification-handlers/etherscan.spec.ts | Avoids interpolating chain name at test-definition time (pre-init), uses chainId in test titles. |
| services/server/test/chains/chain-tests.spec.ts | Switches to async initialization + Mocha --delay with deferred test registration and explicit run(). |
| services/server/package.json | Updates test:chains to use --delay and align with deferred registration. |
| services/server/Dockerfile | Adds an additional libssl3 install step (currently duplicated). |
| .husky/pre-commit | Adds a pre-commit hook to run lint-staged. |
| services/server/src/sourcify-chains-default.json | Removes bundled default chains JSON from the server package. |
| services/server/src/extra-chains.json | Removes bundled extra chains JSON from the server package. |
Comments suppressed due to low confidence (1)
services/server/src/sourcify-chains.ts:17
dotenv.config()is called after importingconfig. SinceconfigreadsNODE_CONFIG_*and other environment variables at import time, any values coming from.envwon’t affect configuration resolution here. Consider loading dotenv before importingconfig, or (preferably) remove dotenv usage from this module and ensure the entrypoint loads env vars before importing anything that usesconfig.
import config from "config";
import dotenv from "dotenv";
dotenv.config();
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| before(async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
This unit test’s before hook calls initializeSourcifyChains(), which will fetch the remote chains config by default. Since this suite uses LocalChainFixture, pulling the full remote chain list is unnecessary and can make unit tests depend on network availability; consider stubbing fetch/chains.remoteUrl here or constructing a minimal sourcifyChainsMap from LOCAL_CHAINS for the workerData setup.
There was a problem hiding this comment.
can we simply use LOCAL_CHAINS ?
| // Chains config is loaded async; use Mocha's --delay + run() to defer test registration | ||
| (async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
The async IIFE that calls initializeSourcifyChains() isn’t wrapped in a .catch(...). If chain initialization fails (e.g. remote fetch error), run() won’t be called and Mocha (with --delay) can hang without executing or reporting tests. Add explicit error handling that logs the failure and exits (or otherwise ensures Mocha terminates).
There was a problem hiding this comment.
Not 100 % sure, but this sounds right.
| before(async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
ServerFixture now always calls initializeSourcifyChains() in the before hook, which (by default) fetches the chain config from the remote GitHub URL at test startup unless a local override file exists. This introduces an external network dependency and can make integration tests flaky/offline-unfriendly; consider defaulting the fixture to a local in-memory chain map (e.g. derived from LOCAL_CHAINS) or stubbing fetch/chains.remoteUrl for tests.
There was a problem hiding this comment.
Agreed. For testing, we shouldn't call this function and instead have some hardcoded chain map solely for testing.
manuelwedler
left a comment
There was a problem hiding this comment.
Will review the new chain repo next.
| before(async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
Agreed. For testing, we shouldn't call this function and instead have some hardcoded chain map solely for testing.
| before(async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
can we simply use LOCAL_CHAINS ?
| // Chains config is loaded async; use Mocha's --delay + run() to defer test registration | ||
| (async () => { | ||
| await initializeSourcifyChains(); | ||
|
|
There was a problem hiding this comment.
Not 100 % sure, but this sounds right.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Instead of mutating a module-level `sourcifyChainsMap` singleton, `initializeSourcifyChains()` now returns a fresh `SourcifyChainMap`. Key changes: - `sourcify-chains.ts`: remove exported `sourcifyChainsMap` const; return a local map from the function; drop the map-clearing loop; clean up the non-null assertion on `chainsExtensions`; use consistent string keys for all chainId lookups - `cli.ts`: capture the returned map in the async IIFE; make `getEtherscanApiKeyForEachChain` accept the map as a parameter - `ServerFixture`: store the returned map on `this.sourcifyChainsMap`; integration tests access it via the fixture instead of a module import - `chain-tests.spec.ts` / `etherscan-instances.spec.ts`: use the returned value directly; add `.catch()` to async IIFEs so failures exit instead of hanging Mocha with --delay - Unit tests: use returned value for assertions; remove the beforeEach map-clearing loop; rewrite the re-initialization test as two independent-map assertion Addresses Manuel's review comment on #2709. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: return SourcifyChainMap from initializeSourcifyChains Instead of mutating a module-level `sourcifyChainsMap` singleton, `initializeSourcifyChains()` now returns a fresh `SourcifyChainMap`. Key changes: - `sourcify-chains.ts`: remove exported `sourcifyChainsMap` const; return a local map from the function; drop the map-clearing loop; clean up the non-null assertion on `chainsExtensions`; use consistent string keys for all chainId lookups - `cli.ts`: capture the returned map in the async IIFE; make `getEtherscanApiKeyForEachChain` accept the map as a parameter - `ServerFixture`: store the returned map on `this.sourcifyChainsMap`; integration tests access it via the fixture instead of a module import - `chain-tests.spec.ts` / `etherscan-instances.spec.ts`: use the returned value directly; add `.catch()` to async IIFEs so failures exit instead of hanging Mocha with --delay - Unit tests: use returned value for assertions; remove the beforeEach map-clearing loop; rewrite the re-initialization test as two independent-map assertion Addresses Manuel's review comment on #2709. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: use chainIdStr directly as map key, avoid parseInt roundtrip Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(etherscan-instances): drop async IIFE, use serverFixture for chain map The async IIFE + run() pattern relied on Mocha's --delay flag which was never set in the test scripts, making test registration order a race condition. Replace it with a plain synchronous describe loop that uses only chainId at registration time. Chain names are recovered inside before() once serverFixture has initialized its sourcifyChainsMap, then written back onto the suite title so reporters (spec, mochawesome) still show the human-readable name. The supported-chain guard also moves into before() for the same reason. The "Double check" section now builds sourcifyChainsArray inside the it() block from serverFixture.sourcifyChainsMap, eliminating the last describe-time dependency on the chain map. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(etherscan-instances): inject chain name into it() titles via beforeEach it() titles are registered synchronously like describe() titles, so the chain name cannot be embedded at registration time. Store chainName in a closure variable (set by before()), then patch this.currentTest.title in beforeEach() by replacing the chainId placeholder with the full name. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(etherscan-instances): capture loop chainId in const to fix closure bug `chainId` is declared with `let` outside the for..in loop, so all describe/before/beforeEach/it closures that run asynchronously share the same variable and see the last iteration's value. Only the final chain (11155420) got the correct title replacement; all others saw the wrong chainId in the replace() call. Fix: capture the current value in `const id = chainId` at the top of each describe callback. All closures inside that describe now close over their own immutable `id`. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add comment to sourcifyChainsMap field and reference assertion - ServerFixture: comment that sourcifyChainsMap is assigned in before() and safe to read inside it/beforeEach - sourcify-chains.spec.ts: assert the two returned maps are distinct object references, not just disjoint in keys Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(ServerFixture): replace sourcifyChainsMap field with getter + fix fixtureOptions priority - Replace the `sourcifyChainsMap!` instance field with a getter that delegates to `this.server.chainRepository.sourcifyChainMap` — the field was redundant since the server already owns the chain map. - Create a local `sourcifyChainsMap` const in before() that respects `fixtureOptions_?.chains` first, then falls back to `initializeSourcifyChains()`. Both `serverOptions.chains` and `sourcifyChainMap` in VerificationServiceOptions now use the same value, fixing the inconsistency where the override was only applied to one of them. - Replace all `serverFixture.server.chainRepository.sourcifyChainMap` call sites across integration tests with `serverFixture.sourcifyChainsMap` for consistency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: use SourcifyChainMap type directly instead of Awaited<ReturnType> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove `import config from "config"` and the top-level `dotenv.config()`
side-effects from sourcify-chains.ts. `initializeSourcifyChains` now
accepts `{ remoteUrl, retryAttempts?, retryDelayMs? }` so callers supply
the URL explicitly. cli.ts (the composition root) is the only place that
reads `chains.remoteUrl` from config; test call sites pass
`config.get("chains.remoteUrl")` or a stub URL directly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the GitHub raw URL with a GCP Storage bucket to avoid downtime during GitHub outages when scaling up new server instances. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…line stubs - ServerFixture defaults to LOCAL_CHAINS map instead of fetching remotely - verificationWorker.spec.ts iterates LOCAL_CHAINS directly - Etherscan v1/v2 integration tests use an inline Mainnet stub (chainId 1) - chain-tests.spec.ts passes its already-fetched map to ServerFixture (no double fetch) - etherscan-instances.spec.ts wraps in async IIFE + run() to fetch once explicitly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… constructor SourcifyChain throws when supported:true and rpcs is empty. The dummy URL is never reached since Etherscan API calls are nock-mocked. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
This PR makes two related changes to how Sourcify manages chain configuration:
1. Remote chain config
Removes the bundled
chains.json,sourcify-chains-default.json, andextra-chains.jsonfrom the server, replacing them with a remote fetch at startup from
sourcifyeth/sourcify-chains.
sourcify-chains-default.jsonfrom the remote URL on startupsourcify-chains.jsonoverride still supported for self-hosted instanceschains.remoteUrlindefault.jspoints to the published chain listnightly CI that pulls from QuickNode, dRPC, Etherscan, and Blockscout APIs
2. Remove blockscoutScrape and etherscanScrape
Removes both HTML-scraping methods for contract creation transaction lookup:
blockscoutApi, deprecated, or moved toadditional-chains.json(Bitkub Chain, Velas, Ozone Chain — low activity, no v2 API)With both removed, the entire scraping infrastructure is also deleted:
getScrapeContractCreationFetcher,getCreatorTxByScraping, regex constants.Test changes
chain-tests.spec.ts: replaced staticsourcify-chains-default.jsonimport with asyncinitializeSourcifyChains()+sourcifyChainsMap; uses Mocha--delayflagServerFixture: usesLOCAL_CHAINSas default chains so integration tests don't need a remote URLcontract-creation-util.spec.ts: removed two scrape tests (Blockscout old/new regex)sourcify-chains.spec.ts: new unit tests forinitializeSourcifyChainscovering thelocal override path, remote fetch (happy path, HTTP errors, retries, exhaustion), and
chain building edge cases (no RPCs, deprecated chains, re-initialization)
Other fixes
.husky/pre-commitexecutable bit: the hook was committed as100644(non-executable)so git was silently skipping
npx lint-stagedon every commit in this repo. Fixed bygit update-index --chmod=+x .husky/pre-commitso the hook now actually runs.🤖 Generated with Claude Code