fix(test-benchmark): nonce mismatch issue in SLOAD & SSTORE benchmark by LouisTsai-Csie · Pull Request #2617 · ethereum/execution-specs

LouisTsai-Csie · 2026-04-03T14:22:02Z

🗒️ Description

This is a summary of the payload verification for test_sload_bloated and test_sstore_bloated.

Payload Information

Related Link

Network Information

Network: perf-devnet-3
Snapshot block height:
Account info (address, nonce)
- 1GB: (0x3F8074692982594c1936bd27433A8B6e5d77e0f0, 6)
- 10GB: (0x87A6314da5Ac8832F6e7A176C8FB133B19f5be04, 2)
- 20GB: (0x772604ee92EBc9AfA5B6CE561F6f6A4C4Cdd214a, 2)

Please check the _STORAGE_BLOATED_EOA_KEYS variable in tests/benchmark/stateful/helpers.py and convert each private key into its account address.

I verified correctness by checking (1) the nonce being used and (2) the opcode counts. The verification script consists of the following files:

block.js
package.json

{
  "name": "block-parser",
  "version": "1.0.0",
  "type": "module",
  "description": "Script to parse and process blockchain blocks",
  "main": "block.js",
  "scripts": {
    "start": "node block.js"
  },
  "dependencies": {
    "@ethereumjs/tx": "^5.4.0",
    "@ethereumjs/util": "^9.1.0",
    "@ethereumjs/rlp": "^4.0.0"
  }
}

Please place both files in the same folder and run npm install. Once that completes, update the filepath in the script and run npm start or node block.js.

test_sstore_bloated

Configuration

fork: Osaka
existing_slots: False
write_new_value: False
token_name: 1GB
gas limit: 90M

Please check generated-stateful-tests-stateful-perf-devnet-3-23949562466 for the file test_single_opcode.py__test_sstore_bloated[fork_Osaka-benchmark_test-existing_slots_False-write_new_value_False-token_name_1GB-benchmark_90M].txt (under the setup folder); I use it as an example.

In block.js, update the filepath so that it points to the file mentioned above.

The result shows that the nonce in the authorization list starts at 6 for init_tx and 7 for runtime_tx, which matches the on-chain nonce value.

"authorizedList": [
  {
    "chainId": "0x",
    "address": "0x55e5b385b218a8a94d5766e423fb25e6ad9c9ffa",
    "nonce": "0x06",
    "yParity": "0x",
    "r": "0xc1d8ef10fb2fb305ff83be4d5474c6661f1d0e256279ab952b9c6a814bf90bd3",
    "s": "0x1bb7ca0248d101ba8f1ecf7d793ece69cd5d5a8fd6e0e24d5fe218b9d72256b1"
  }
],
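
The nonce check described above can be sketched in Python. This is a minimal illustration, not the block.js script itself: the helper names `hex_quantity_to_int` and `authorization_nonces` are hypothetical, and the sketch assumes that a bare `0x` in the decoded output stands for an empty byte string, i.e. zero.

```python
def hex_quantity_to_int(value: str) -> int:
    """Decode a hex quantity, treating a bare "0x" as zero (assumption)."""
    digits = value[2:] if value.startswith("0x") else value
    return int(digits, 16) if digits else 0


def authorization_nonces(tx: dict) -> list[int]:
    """Return the nonce of every entry in a transaction's authorization list."""
    return [hex_quantity_to_int(auth["nonce"]) for auth in tx.get("authorizedList", [])]


# Entry taken from the decoded payload above (address and signature omitted).
init_tx = {"authorizedList": [{"chainId": "0x", "nonce": "0x06", "yParity": "0x"}]}

# On-chain nonce of the 1GB account, from the account info listed earlier.
onchain_nonce = 6

first_nonce = authorization_nonces(init_tx)[0]
assert first_nonce == onchain_nonce, f"expected {onchain_nonce}, got {first_nonce}"
print(f"init_tx authorization nonce {first_nonce} matches the on-chain nonce")
```

The same comparison with `onchain_nonce + 1` would apply to runtime_tx, which uses the next nonce.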

From the opcode count file for this payload generation, the (SSTORE, SLOAD) opcode counts are (4049, 6), while the (SSTORE, SLOAD) opcode counts for the local fill mode run are (4074, 23). IMO this difference is reasonable as the local fill mode run includes the system contract interaction.

I have manually reviewed the other parameter combinations for this test. The nonce values in the payloads match those on the live network.

test_sload_bloated

Configuration

fork: Osaka
existing_slots: False
write_new_value: False
token_name: 1GB
gas limit: 90M

Please check generated-stateful-tests-stateful-perf-devnet-3-23949903761 for the file test_single_opcode.py__test_sload_bloated[fork_Osaka-benchmark_test-existing_slots_False-token_name_1GB-benchmark_90M].txt (under the setup folder). I reviewed the nonce values using the same approach as for test_sstore_bloated and confirmed that they match those on the live network.

For the opcode counts, the (SSTORE, SLOAD) pair for the payload is (41855, 6), while for local fill mode it is (41873, 30). I consider this difference acceptable, as the latter includes the opcodes executed during system contract interaction.
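
To make the size of these differences explicit, the counts quoted in this description can be compared with a few lines of Python. This is only a sketch: the `COUNTS` table and the `opcode_delta` helper are hypothetical names, and the pairs are kept in the (SSTORE, SLOAD) order used above.

```python
# (SSTORE, SLOAD) opcode counts quoted in this description.
COUNTS = {
    "test_sstore_bloated": {"payload": (4049, 6), "local_fill": (4074, 23)},
    "test_sload_bloated": {"payload": (41855, 6), "local_fill": (41873, 30)},
}


def opcode_delta(payload: tuple[int, int], local_fill: tuple[int, int]) -> tuple[int, ...]:
    """Return local-fill counts minus payload counts, element by element."""
    return tuple(local - remote for remote, local in zip(payload, local_fill))


for test_name, runs in COUNTS.items():
    delta = opcode_delta(runs["payload"], runs["local_fill"])
    print(f"{test_name}: extra (SSTORE, SLOAD) in local fill = {delta}")
```

The differences are (25, 17) and (18, 24), a few dozen opcodes in each case, which fits the explanation that they come from system contract interaction rather than from the benchmarked workload.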

How to locally fill the test? Please follow the example command:

uv run fill \
  -v \
  --clean \
  --evm-bin <evm-bin-path> \
  --gas-benchmark-values 90 \
  --fork Osaka \
  tests/benchmark/stateful/bloatnet/test_single_opcode.py::test_sstore_bloated \
  --address-stubs tests/benchmark/stateful/stubs/stubs_bloatnet.json \
  --rpc-endpoint <eth-rpc-endpoint>

Postmortem

Why do we need this new live_eth_rpc fixture? And why does the previous test run not fetch the nonce value from the network?

EELS has two test modes: execute-remote, which provides an eth_rpc fixture to interact with a live network, and fill, which has no such fixture.

The previous implementation declared eth_rpc: EthRPC | None = None as a test parameter, expecting execute-remote to inject the live EthRPC instance and fill mode to fall back to None.

...
def test_sload_bloated(
    benchmark_test: BenchmarkTestFiller,
    ...
    existing_slots: bool,
    eth_rpc: EthRPC | None = None, # fixture passed here
) -> None:
...

However, because pytest treats a parameter with a default value as already satisfied, it never overrides the default with the actual fixture. As a result, even in execute-remote mode, eth_rpc was always None and the on-chain nonce was never fetched.

The workaround introduces a new live_eth_rpc bridge fixture that internally calls request.getfixturevalue("eth_rpc") to look up the real fixture dynamically when it exists (execute-remote) and returns None when it does not (fill mode). Tests now declare live_eth_rpc without a default, forcing pytest to always resolve it.

That said, this is not an ideal long-term approach for the remaining test cases. We need a mechanism that takes a private key as input, interacts with the corresponding on-chain account, and automatically manages nonce increments, since handling this manually is fragile.
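
The default-parameter behavior described above can be modeled without pytest. The sketch below is a simplified stand-in for fixture resolution, not pytest's actual implementation and not the code in this PR: `call_with_fixtures`, `resolve`, and the two fixture registries are hypothetical names chosen for illustration.

```python
import inspect
from typing import Any, Callable

# Hypothetical registries standing in for the fixtures each mode provides.
EXECUTE_REMOTE_FIXTURES: dict[str, Any] = {"eth_rpc": "<live EthRPC>"}
FILL_FIXTURES: dict[str, Any] = {}


def resolve(name: str, fixtures: dict[str, Any]) -> Any:
    """Look up a fixture; live_eth_rpc bridges to eth_rpc when it exists."""
    if name == "live_eth_rpc":
        return fixtures.get("eth_rpc")  # None when eth_rpc is absent (fill)
    return fixtures[name]


def call_with_fixtures(test: Callable[..., Any], fixtures: dict[str, Any]) -> Any:
    """Inject fixtures by name, skipping parameters that have a default."""
    kwargs = {}
    for name, param in inspect.signature(test).parameters.items():
        if param.default is not inspect.Parameter.empty:
            continue  # defaulted parameter: never requested as a fixture
        kwargs[name] = resolve(name, fixtures)
    return test(**kwargs)


def old_test(eth_rpc: Any = None) -> Any:
    return eth_rpc  # the default wins, so this is always None


def new_test(live_eth_rpc: Any) -> Any:
    return live_eth_rpc  # no default, so the bridge is always resolved


print(call_with_fixtures(old_test, EXECUTE_REMOTE_FIXTURES))  # None
print(call_with_fixtures(new_test, EXECUTE_REMOTE_FIXTURES))  # <live EthRPC>
print(call_with_fixtures(new_test, FILL_FIXTURES))            # None
```

In this model, only the parameter without a default ever reaches the lookup step, which is why dropping the default is what makes the live instance visible to the test.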

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
just static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

codecov · 2026-04-03T14:30:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.24%. Comparing base (4bf8bbe) to head (78a2094).

Additional details and impacted files

@@               Coverage Diff                @@
##           forks/amsterdam    #2617   +/-   ##
================================================
  Coverage            86.24%   86.24%           
================================================
  Files                  599      599           
  Lines                36984    36984           
  Branches              3795     3795           
================================================
  Hits                 31895    31895           
  Misses                4525     4525           
  Partials               564      564

| Flag | Coverage Δ |
|---|---|
| unittests | `86.24% <ø> (ø)` |

jochem-brouwer · 2026-04-03T15:40:04Z

I don't understand. Why does this seem to fix the issue?

LouisTsai-Csie · 2026-04-06T07:33:57Z

@jochem-brouwer I've updated the PR description; please take a look at the Postmortem section. The main reason is that EthRPC was not being injected into the test file, which breaks the framework.

jochem-brouwer · 2026-04-06T09:10:17Z

Cool, this makes sense! This is indeed an intermediate fix for now; we should look for a more robust option ASAP. Can we track this in an issue? The problem arises specifically where fill and execute modes diverge. We could also mark these types of tests to be filled with execute by default, throwing a warning when we try to fill them with "fill".

LouisTsai-Csie · 2026-04-06T09:45:17Z

@jochem-brouwer I have a PR for the refactor; please take a look if you are interested! With this approach, we could remove the EthRPC instance and rely only on the stub integration.

jochem-brouwer · 2026-04-09T05:19:48Z

If this run was generated from the changes in this PR: https://github.com/NethermindEth/gas-benchmarks/actions/runs/23949903761 then this benchmark now fills correctly. Or is this also fixed by #2624?

LouisTsai-Csie marked this pull request as ready for review April 6, 2026 07:27

fix: eth rpc connection issue

78a2094

LouisTsai-Csie force-pushed the fix-auth-nonce branch from 673369f to 78a2094 Compare April 6, 2026 07:32

LouisTsai-Csie mentioned this pull request Apr 6, 2026

feat(tests-execute): EOA pkey support for stub account #2624

Merged

7 tasks

LouisTsai-Csie marked this pull request as draft April 7, 2026 13:06

LouisTsai-Csie mentioned this pull request Apr 7, 2026

Generate compute benchmark payloads on mainnet & perfnet environment ethpandaops/gas-lighting-tracker#44

Open

LouisTsai-Csie wants to merge 1 commit into ethereum:forks/amsterdam from LouisTsai-Csie:fix-auth-nonce
