fix(gnovm): reset package state between test functions#5577

Open
notJoon wants to merge 8 commits into gnolang:master from
notJoon:fix/gnovm-test-isolation-1982

Conversation

@notJoon
Member

@notJoon notJoon commented Apr 23, 2026

Description

gno test previously shared a single PackageValue across every test function in a package, so mutations to package-level variables in one test leaked into the next. This PR runs each test against a freshly instantiated PackageValue created from the preprocessed PackageNode, inside its own nested transaction store. Package-level globals and init() side effects start from a clean slate for every test.

fixes #1982


Problem

The reproducer from the original issue:

package counter

import "testing"

var count int

func Increment() { count++ }

func TestIncrement1(t *testing.T) {
    Increment()
    if count != 1 { t.Fail() }
}

func TestIncrement2(t *testing.T) {
    Increment()
    if count != 1 { t.Fail() }
}

Both tests assume count starts at 0, so a single Increment() call should leave it at 1. On the master branch, TestIncrement1 passes (count goes from 0 to 1), but TestIncrement2 fails: count carries over as 1 from the first test, and Increment() takes it to 2.

Any test relying on a zero-initialized package global is fragile under the old behavior, with outcomes depending on which tests ran before it.

Why it happens on master

runTestFiles loaded the package once and reused the resulting PackageValue across every tf in the tests loop:

gno/gnovm/pkg/test/test.go

Lines 391 to 405 in 066f15a

// Load the test files into package and save.
// m.RunFiles(files.Files...)
for _, tf := range tests {
    // TODO(morgan): we could theoretically use wrapping on the baseStore
    // and gno store to achieve per-test isolation. However, that requires
    // some deeper changes, as ideally we'd:
    // - Run the MemPackage independently (so it can also be run as a
    //   consequence of an import)
    // - Run the test files before this for loop (but persist it to store;
    //   RunFiles doesn't do that currently)
    // - Wrap here.
    m = Machine(tgs, opts.WriterForStore(), mpkg.Path, opts.Debug, store.NewInfiniteGasMeter())
    m.Alloc = alloc.Reset()
    m.SetActivePackage(pv)

pv.Block.Values holds the mutable package-level variables. Even though each iteration creates a new machine, SetActivePackage(pv) attaches the same pv, so mutations from test N remain visible in test N+1. The pre-existing TODO at that site already flagged this limitation, noting that a proper fix required a way to create a fresh PackageValue without re-parsing the source.
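The aliasing described above can be reproduced in plain Go. This is a minimal sketch, not the GnoVM code: packageValue and machine are hypothetical stand-ins that only model the count variable from the reproducer and the fact that each test gets a new machine.

```go
package main

import "fmt"

// packageValue is a hypothetical stand-in for gno's PackageValue.
// It only holds the package-level variable from the reproducer.
type packageValue struct{ count int }

// machine is a hypothetical stand-in for the per-test Machine.
type machine struct{ pkg *packageValue }

func (m *machine) increment() { m.pkg.count++ }

// runShared mimics the old loop: a new machine per test, but the same pv.
func runShared(tests int) []int {
	pv := &packageValue{}
	seen := make([]int, 0, tests)
	for i := 0; i < tests; i++ {
		m := &machine{pkg: pv} // fresh machine, aliased package state
		m.increment()
		seen = append(seen, m.pkg.count)
	}
	return seen
}

// runFresh mimics the fixed loop: every test gets its own package state.
func runFresh(tests int) []int {
	seen := make([]int, 0, tests)
	for i := 0; i < tests; i++ {
		m := &machine{pkg: &packageValue{}}
		m.increment()
		seen = append(seen, m.pkg.count)
	}
	return seen
}

func main() {
	fmt.Println(runShared(2)) // [1 2]
	fmt.Println(runFresh(2))  // [1 1]
}
```

Creating a new machine is not enough on its own; what matters is whether the state it points at is new.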

Fix requirements

The fix needs three properties:

  1. A fresh PackageValue for each test so package-level variables are re-initialized from their declarations and init() runs again.
  2. A clean baseline to instantiate against. *_test.gno init() functions that mutate imported-realm state (e.g. a test file calling dao.UpdateImpl to set allow-lists) must not leak into the shared seed, or every test would inherit that pollution before it even starts.
  3. Isolation at the store level so object mutations (both same-package globals and imported-realm state reached through crossing calls) do not bleed across tests either. This has to reach the baseStore, since realm finalization during a test writes through SetObject → baseStore.Set and would otherwise persist into sibling tests via the shared cache.

Naively re-running RunMemPackage(mpkg, false) per test satisfies (1) but re-parses and preprocesses every file on every test, which regresses performance significantly (see Performance section below). The fix instead separates preprocessing from runtime instantiation: preprocess once, then hand the preprocessed PackageNode to each test to cheaply instantiate its own PackageValue inside a nested transaction.
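To make the trade-off concrete, the sketch below counts how often the expensive step runs under each strategy. It is an illustration only: node, instance, and pipeline are hypothetical names, and the real preprocessing work is replaced by a counter.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a preprocessed PackageNode.
type node struct{ decls []string }

// instance is a hypothetical stand-in for a per-test PackageValue.
type instance struct{ globals map[string]int }

// pipeline counts how often the expensive preprocessing step runs.
type pipeline struct{ preprocessCalls int }

func (p *pipeline) preprocess(decls []string) *node {
	p.preprocessCalls++
	return &node{decls: decls}
}

// instantiate builds zero-valued globals from the preprocessed node.
func instantiate(n *node) *instance {
	g := make(map[string]int, len(n.decls))
	for _, name := range n.decls {
		g[name] = 0
	}
	return &instance{globals: g}
}

// naive preprocesses and instantiates for every test.
func naive(decls []string, tests int) int {
	p := &pipeline{}
	for i := 0; i < tests; i++ {
		_ = instantiate(p.preprocess(decls))
	}
	return p.preprocessCalls
}

// split preprocesses once and only instantiates per test.
func split(decls []string, tests int) int {
	p := &pipeline{}
	n := p.preprocess(decls)
	for i := 0; i < tests; i++ {
		_ = instantiate(n)
	}
	return p.preprocessCalls
}

func main() {
	decls := []string{"count"}
	fmt.Println(naive(decls, 91), split(decls, 91)) // 91 1
}
```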

Solution

Two changes.

1. gnovm/pkg/gnolang/machine.go

Split runFileDecls into a preprocessing phase and a runtime instantiation phase. The instantiation half becomes instantiatePackageFiles, which:

  • Allocates fresh file *Blocks on m.Package with values copied from each preprocessed FileNode.StaticBlock.
  • Calls PrepareNewValues to populate pv.Block.Values.
  • Runs top-level non-function declarations via the existing topological sort.

Meanwhile, runFileDecls keeps its existing behavior (preprocess + instantiate in one call) by delegating its second half to instantiatePackageFiles, so nothing outside the test runner sees a behavior change. A new public method NewPackageInstance produces a fresh PackageValue from an already preprocessed PackageNode:

// PATH: gnovm/pkg/gnolang/machine.go:L784
func (m *Machine) NewPackageInstance(pn *PackageNode) *PackageValue {
    pv := pn.NewPackage(m.Alloc)
    m.Store.SetCachePackage(pv)
    m.SetActivePackage(pv)
    m.instantiatePackageFiles(pn.FileSet.Files...)
    m.runInitFromUpdates(pv, pv.GetBlock(m.Store).Values)
    return pv
}

The explicit pv.GetBlock(m.Store).Values argument is load-bearing: pn.NewPackage already calls PrepareNewValues internally, so the second call inside instantiatePackageFiles sees ppl == ppl and returns nil. Feeding those nil updates to runInitFromUpdates would silently skip every init() function, so we pass the values that pn.NewPackage actually populated instead (related test: init_and_isolation.txtar).
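The pitfall can be modeled in a few lines of Go. This is a simplified sketch under stated assumptions: block, prepareNewValues, and countInits are hypothetical, and the model only captures the "second call returns nil" behavior described above, not the real PrepareNewValues.

```go
package main

import (
	"fmt"
	"strings"
)

// block is a hypothetical, simplified model of a package block: names are
// the declared identifiers, prepared counts how many already have slots.
type block struct {
	names    []string
	prepared int
}

// prepareNewValues returns the names that gained a slot in this call, or
// nil when nothing is new.
func (b *block) prepareNewValues() []string {
	if b.prepared == len(b.names) {
		return nil
	}
	fresh := b.names[b.prepared:]
	b.prepared = len(b.names)
	return fresh
}

// countInits counts the init functions that would be scheduled from updates.
func countInits(updates []string) int {
	n := 0
	for _, name := range updates {
		if strings.HasPrefix(name, "init.") {
			n++
		}
	}
	return n
}

func main() {
	b := &block{names: []string{"count", "init.0", "init.1"}}
	_ = b.prepareNewValues()       // first call, as inside pn.NewPackage
	second := b.prepareNewValues() // second call, as inside instantiatePackageFiles

	fmt.Println(countInits(second))  // 0: every init() would be skipped
	fmt.Println(countInits(b.names)) // 2: the populated values still list them
}
```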

2. gnovm/pkg/test/test.go

runTestFiles still preprocesses the package once, but now on the outer transaction store. The seed uses a new RunMemPackageSkipTestFileInits entry point so that init() functions in *_test.gno files are not executed against the shared tgs; file parsing, preprocessing, and top-level var declarations still run so that xxx_test integration imports continue to see test-file symbols. Each test then runs in its own nested transaction with a freshly instantiated PackageValue and a CacheWrap'd base store, so realm-finalization writes stay local to that test:

tmpkg := mpkg
if mptype, ok := mpkg.Type.(gno.MemPackageType); ok && mptype.IsAll() {
    tmpkg = gno.MPFTest.FilterMemPackage(mpkg)
}

m2 = Machine(tgs, ...)
m2.RunMemPackageSkipTestFileInits(tmpkg, true) // preprocess once, skip test-file init()
pn := tgs.GetBlockNode(gno.PackageNodeLocation(mpkg.Path)).(*gno.PackageNode)

for _, tf := range tests {
    innerBase := baseStore.CacheWrap()
    innerTxn := tgs.BeginTransaction(innerBase, innerBase, nil, nil)
    m = Machine(innerTxn, ..., store.NewInfiniteGasMeter())
    m.Alloc = alloc.Reset()
    pv := m.NewPackageInstance(pn)
    // run tf...
}

RunMemPackageSkipTestFileInits threads a skipTestFileInits bool through to runInitFromUpdates, where init.* FuncValues whose FileName matches IsTestFile are skipped inline alongside the existing init-detection check — no extra pass over updates, no duplicated predicate. NewPackageInstance and RunFiles call runInitFromUpdates with skipTestFileInits=false, so the per-test fresh pv still runs every init(), including test-file ones.
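A rough sketch of that inline check follows. The types here are hypothetical: funcValue is a trimmed-down stand-in for FuncValue, and isTestFile is an assumed simplification of IsTestFile that only looks at the *_test.gno suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// funcValue is a hypothetical, trimmed-down stand-in for a FuncValue.
type funcValue struct {
	Name     string
	FileName string
}

// isTestFile is an assumed simplification of IsTestFile.
func isTestFile(name string) bool {
	return strings.HasSuffix(name, "_test.gno")
}

// initsToRun returns the init functions that would run, in order.
func initsToRun(updates []funcValue, skipTestFileInits bool) []string {
	var out []string
	for _, fv := range updates {
		if !strings.HasPrefix(fv.Name, "init.") {
			continue // not an init function
		}
		if skipTestFileInits && isTestFile(fv.FileName) {
			continue // keep test-file side effects out of the shared seed
		}
		out = append(out, fv.FileName+":"+fv.Name)
	}
	return out
}

func main() {
	updates := []funcValue{
		{Name: "Increment", FileName: "counter.gno"},
		{Name: "init.0", FileName: "counter.gno"},
		{Name: "init.1", FileName: "counter_test.gno"},
	}
	fmt.Println(initsToRun(updates, true))  // seed: [counter.gno:init.0]
	fmt.Println(initsToRun(updates, false)) // per test: both inits
}
```

Both conditions sit in the same loop, which is the "no extra pass, no duplicated predicate" property the paragraph above describes.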

innerBase := baseStore.CacheWrap() gives each test its own overlay on top of tgs's base store, so realm finalization writes performed during the test (e.g. an imported realm's block being persisted at a crossing-call boundary) stay on innerBase. Dropping innerBase between iterations — we never call Write() on it — discards those writes, which is what keeps mutations to imported realms from bleeding between tests.
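The discard-by-dropping behavior can be illustrated with a toy overlay store. This is a sketch only: kv, memStore, overlay, and cacheWrap are hypothetical names that mimic the shape of a cache-wrapped store, not the actual store API used by the test runner.

```go
package main

import "fmt"

// kv is a hypothetical minimal key-value interface.
type kv interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// memStore plays the role of the shared base store.
type memStore map[string]string

func (m memStore) Get(k string) (string, bool) { v, ok := m[k]; return v, ok }
func (m memStore) Set(k, v string)             { m[k] = v }

// overlay buffers writes locally and reads through to its parent.
type overlay struct {
	parent kv
	dirty  map[string]string
}

func cacheWrap(parent kv) *overlay {
	return &overlay{parent: parent, dirty: map[string]string{}}
}

func (o *overlay) Get(k string) (string, bool) {
	if v, ok := o.dirty[k]; ok {
		return v, true
	}
	return o.parent.Get(k)
}

func (o *overlay) Set(k, v string) { o.dirty[k] = v }

// Write flushes buffered writes to the parent. The per-test loop never calls it.
func (o *overlay) Write() {
	for k, v := range o.dirty {
		o.parent.Set(k, v)
	}
}

func main() {
	base := memStore{"dao/allowed": "none"}
	for _, name := range []string{"TestA", "TestB"} {
		inner := cacheWrap(base)
		before, _ := inner.Get("dao/allowed")
		inner.Set("dao/allowed", name) // models a realm-finalization write
		after, _ := inner.Get("dao/allowed")
		fmt.Println(name, before, after)
		// inner is dropped here without Write(), so base stays untouched.
	}
	fmt.Println(base["dao/allowed"]) // none
}
```

Each iteration reads "none" before its own write, which is the isolation property the overlay is there to provide.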

The tmpkg filter handles a type system characteristic: AsRunnable() demotes MPUser/StdlibAll to Prod, which strips *_test.gno files unless the package is pre-filtered with MPFTest. The IsAll() guard avoids passing integration types through MPFTest, which panics on them since they already carry their test files.

How it works

Flow under the fix:

  1. Outer transaction tgs runs RunMemPackageSkipTestFileInits once, parsing and preprocessing the package. Non-test init() functions run on tgs as usual; test-file init() functions are deliberately skipped so they cannot mutate imported-realm state in the shared seed. The preprocessed PackageNode lives in tgs.cacheNodes.
  2. For each test, tgs.BeginTransaction(innerBase, innerBase, ...) returns innerTxn, a nested transaction with its own cacheObjects map for object writes and innerBase = baseStore.CacheWrap() layered on top of the outer base store so realm-finalization writes stay local. cacheNodes is shared via txlog.Wrap, so innerTxn sees the preprocessed pn without re-processing.
  3. m.NewPackageInstance(pn) creates a fresh PackageValue:
    • pn.NewPackage(m.Alloc) allocates a new pv.Block and seeds pv.Block.Values from pn.Values, so the init function values are already in place.
    • instantiatePackageFiles creates fresh file blocks and runs top-level var declarations in dependency order, re-initializing every package global from its source expression.
    • runInitFromUpdates(pv, pv.Block.Values) schedules those init() functions — including test-file ones — for execution against the clean per-test pv.
  4. The test runs against this fresh pv under innerTxn. Any mutations to package globals live in the new pv.Block.Values; any object writes go to innerTxn.cacheObjects, and realm finalizations that write through to the base store land on innerBase instead of tgs's base.
  5. When the test ends, innerTxn and innerBase are dropped without calling Write(), so both the cacheObjects overlay and the base-store overlay are discarded. The next iteration starts from scratch against tgs.

Under this flow, TestIncrement1 starts with count == 0, increments it to 1, and its mutation dies with innerTxn. TestIncrement2 then starts from a fresh pv where count == 0 again, and observes count == 1 as expected.
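The flow can be condensed into a small model where one immutable description of the package is shared and every test instantiates from it. This is a sketch under assumptions: packageNode, newInstance, and runTests are hypothetical, and globals are reduced to a map of ints.

```go
package main

import "fmt"

// packageNode is a hypothetical stand-in for the preprocessed PackageNode:
// initial values for globals plus the init functions. It is never mutated.
type packageNode struct {
	initial map[string]int
	inits   []func(globals map[string]int)
}

// newInstance copies the declared initial values into a fresh map and
// re-runs every init function against it.
func newInstance(pn *packageNode) map[string]int {
	globals := make(map[string]int, len(pn.initial))
	for name, v := range pn.initial {
		globals[name] = v
	}
	for _, fn := range pn.inits {
		fn(globals)
	}
	return globals
}

// runTests runs each test against its own instance and records pass/fail.
func runTests(pn *packageNode, tests []func(map[string]int) bool) []bool {
	results := make([]bool, 0, len(tests))
	for _, test := range tests {
		results = append(results, test(newInstance(pn)))
	}
	return results
}

func main() {
	pn := &packageNode{
		initial: map[string]int{"count": 0},
		inits:   []func(map[string]int){func(g map[string]int) { g["ready"] = 1 }},
	}
	// Same body as TestIncrement1 and TestIncrement2 in the reproducer.
	increment := func(g map[string]int) bool {
		g["count"]++
		return g["count"] == 1 && g["ready"] == 1
	}
	fmt.Println(runTests(pn, []func(map[string]int) bool{increment, increment})) // [true true]
	fmt.Println(pn.initial["count"])                                             // 0
}
```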

Performance

Re-running RunMemPackage per test regresses heavy packages by about 9x, because parsing and preprocessing repeat for every test function. Splitting preprocess-once from instantiate-per-test brings the overhead down to a modest constant per test function.

Wall-clock measurement: gno test . on each package, median of 5 runs after a warm-up run. master is 066f15a, branch is this PR. Both binaries are built with identical flags and run against the same sources, and subtest counts were verified equal between them.

package            test funcs   master   branch   ratio   absolute delta
p/onbloc/uint256   39           0.53s    0.60s    1.13x   +70ms
p/onbloc/int256    50           0.55s    0.64s    1.16x   +90ms
p/onbloc/json      91           0.58s    0.74s    1.28x   +160ms

The overhead is the cost of per-test PackageValue creation and re-running init(). Preprocessing happens once per package and is amortized across all tests. The measured overhead is consistent across the three packages -- roughly 1.7 to 1.8ms per test function -- so the cost scales with test function count rather than file size. Packages with expensive init() will see proportionally more.
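The per-test figure can be rechecked from the table with a few lines of Go. The row type and helper names are introduced here for illustration; the numbers are the ones reported above, converted to milliseconds.

```go
package main

import "fmt"

// row holds one line of the measurement table; times are in milliseconds.
type row struct {
	pkg      string
	funcs    int
	masterMs int
	branchMs int
}

// perTestMs returns the added wall-clock time per test function.
func perTestMs(r row) float64 {
	return float64(r.branchMs-r.masterMs) / float64(r.funcs)
}

// ratio returns branch time divided by master time.
func ratio(r row) float64 {
	return float64(r.branchMs) / float64(r.masterMs)
}

func main() {
	rows := []row{
		{"p/onbloc/uint256", 39, 530, 600},
		{"p/onbloc/int256", 50, 550, 640},
		{"p/onbloc/json", 91, 580, 740},
	}
	for _, r := range rows {
		fmt.Printf("%-18s %.2fx  %.2f ms/test\n", r.pkg, ratio(r), perTestMs(r))
	}
}
```

Dividing each delta by its test count gives about 1.79, 1.80, and 1.76 ms per test function, which supports the claim that the cost tracks the number of tests rather than package size.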

Known limitations

  • Per-test PackageValue creation and init() re-execution is strictly more work than sharing a single PackageValue across tests. See the Performance section for measurements.
  • This isolation is stricter than Go's go test, which shares package-level globals across tests in the same binary. The stricter semantics are what issue #1982 ([GnoVM] Reset machine context & realm state after each Test function) asks for, but anyone porting tests from Go should know the expectations differ.
  • The design assumes PackageNode is immutable at runtime, which lets tgs.cacheNodes be shared across tests via txlog.Wrap. If a future change makes PackageNode mutable, this assumption needs to be revisited.
  • The fix targets test isolation inside runTestFiles. Filetests and other entrypoints still use runFileDecls directly and keep their prior semantics; those paths already run each test against its own package state, so no change was needed.
  • A handful of existing tests in examples/gno.land/r/ (sys/params, tests/vm/map_delete, demo/todolist, leon/hor, archive/nir1218_evaluation_proposal, gov/dao/v3/impl) were written against the old leaky behavior — e.g. expecting an item created in one test to persist into the next, or hard-coding proposal IDs that assumed state carried over from prior tests. They are updated in this PR to set up their own preconditions; new tests added after this change should follow the same pattern.
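The last point, tests setting up their own preconditions, might look like the following in miniature. This is a hypothetical model, not code from any of the listed realms: items, addItem, and resetPackage are invented for illustration, and resetPackage stands in for the per-test reset the runner now performs.

```go
package main

import "fmt"

// items models a package-level global in a hypothetical todo-list realm.
var items []string

// addItem appends an item and returns its index.
func addItem(title string) int {
	items = append(items, title)
	return len(items) - 1
}

// resetPackage models what the runner now does before every test.
func resetPackage() { items = nil }

// testGetLeaky assumes an earlier test already created item 0.
func testGetLeaky() bool {
	return len(items) > 0 && items[0] == "write docs"
}

// testGetSelfContained creates the item it needs before asserting on it.
func testGetSelfContained() bool {
	id := addItem("write docs")
	return items[id] == "write docs"
}

func main() {
	resetPackage()
	fmt.Println(testGetLeaky()) // false: nothing carried over

	resetPackage()
	fmt.Println(testGetSelfContained()) // true: the test built its own state
}
```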

@Gno2D2
Collaborator

Gno2D2 commented Apr 23, 2026

🛠 PR Checks Summary

🔴 Pending initial approval by a review team member, or review from tech-staff

Manual Checks (for Reviewers):
  • IGNORE the bot requirements for this PR (force green CI check)

🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.

✅ Automated Checks (for Contributors):

🟢 Maintainers must be able to edit this pull request (more info)
🔴 Pending initial approval by a review team member, or review from tech-staff

☑️ Contributor Actions:
  1. Fix any issues flagged by automated checks.
  2. Follow the Contributor Checklist to ensure your PR is ready for review.
    • Add new tests, or document why they are unnecessary.
    • Provide clear examples/screenshots, if necessary.
    • Update documentation, if required.
    • Ensure no breaking changes, or include BREAKING CHANGE notes.
    • Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
  1. Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Automated Checks
Maintainers must be able to edit this pull request (more info)

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 The pull request was created from a fork (head branch repo: notJoon/gno-core)

Then

🟢 Requirement satisfied
└── 🟢 Maintainer can modify this pull request

Pending initial approval by a review team member, or review from tech-staff

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 Not (🔴 Pull request author is a member of the team: tech-staff)

Then

🔴 Requirement not satisfied
└── 🔴 If
    ├── 🔴 Condition
    │   └── 🔴 Or
    │       ├── 🔴 At least one of these user(s) reviewed the pull request: [aronpark1007 davd-gzl jefft0 notJoon omarsy MikaelVallenet] (with state "APPROVED")
    │       ├── 🔴 At least 1 user(s) of the team tech-staff reviewed pull request
    │       └── 🔴 This pull request is a draft
    └── 🔴 Else
        └── 🔴 And
            ├── 🟢 This label is applied to pull request: review/triage-pending
            └── 🔴 On no pull request

Manual Checks
**IGNORE** the bot requirements for this PR (force green CI check)

If

🟢 Condition met
└── 🟢 On every pull request

Can be checked by

  • Any user with comment edit permission

@codecov

codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@notJoon notJoon changed the title from "Fix/gnovm test isolation 1982" to "fix(gnovm): reset package state between test functions" Apr 24, 2026
@notJoon notJoon marked this pull request as ready for review April 24, 2026 03:40
@Gno2D2 Gno2D2 added the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Apr 24, 2026
notJoon and others added 2 commits April 24, 2026 14:03
Per-test `NewPackageInstance` gives each test a fresh `PackageValue` for
the package under test, but mutations to *imported* realms were leaking
through two paths:

1. The initial tgs seed ran test-file `init()` functions, persisting
   their cross-package effects (e.g. `gov/dao.allowedDAOs` set by a
   test's `InitWithUsers` call) into shared state before any test ran.
2. Per-test inner transactions shared their `baseStore` with `tgs`, so
   realm finalization writes (`SetObject` → `baseStore.Set`) propagated
   to tgs and to every sibling test.

Fix both:
- Add `Machine.RunMemPackageSkipTestFileInits`; `runMemPackage` drops
  init.* FuncValues from test files (via `IsTestFile`) before calling
  `runInitFromUpdates`. File parsing, preprocessing, and top-level var
  decls still run, so xxx_test integration imports see test-file symbols.
  Per-test machines re-run all inits on their fresh pv as before.
- `runTestFiles` CacheWraps the base/iavl stores per test iteration so
  realm finalization writes stay local to the inner txn; dropping
  innerBase without `Write()` discards the mutations.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot added the 🧾 package/realm Tag used for new Realms or Packages. label Apr 24, 2026
@notJoon notJoon marked this pull request as draft April 24, 2026 05:48
@Gno2D2 Gno2D2 removed the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Apr 24, 2026
@notJoon notJoon marked this pull request as ready for review April 24, 2026 06:43
@Gno2D2 Gno2D2 added the review/triage-pending PRs opened by external contributors that are waiting for the 1st review label Apr 24, 2026

Labels

📦 🤖 gnovm Issues or PRs gnovm related 🧾 package/realm Tag used for new Realms or Packages. review/triage-pending PRs opened by external contributors that are waiting for the 1st review

Projects

Status: No status
Status: In Progress

Development

Successfully merging this pull request may close these issues.

[GnoVM] Reset machine context & realm state after each Test function

2 participants