feat: add test_helpers module (error_utils, test_utils) behind test_utlis flag#2381

Open
naor-starkware wants to merge 13 commits into main from naor/feat/add_test_helpers

Conversation

@naor-starkware
Collaborator

@naor-starkware naor-starkware commented Apr 6, 2026

TITLE

Description

Description of the pull request changes and motivation.

Checklist

  • Linked to Github Issue
  • Unit tests added
  • Integration tests added.
  • This change requires new documentation.
    • Documentation has been added/updated.
    • CHANGELOG has been updated.

This change is Reviewable

naor-starkware and others added 9 commits April 6, 2026 20:39
…on_runner flag

- Create vm/src/test_helpers/ with error_utils.rs and test_utils.rs
- Move from cairo_test_suite/ (fix filename typo: utlis → utils)
- Fix crate:: import paths (were cairo_vm:: when outside the crate)
- Fix $crate in macro_export macro (clippy::crate_in_macro_def)
- Simplify load_cairo_program! path using with_file_name()
- Gate module behind function_runner feature in lib.rs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ram! and error_utils checkers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add AlwaysFailConversion helper + 2 tests for assert_mr_eq! unwrap_or_else
  panic branch (no-message and message variants)
- Allow clippy::result_large_err on hint_err test helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… noise

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…y function name error

#[macro_export] macros containing closures (|x| ...) cause llvm-cov to
emit a "function name is empty" error. Replaced unwrap_or_else(|e| panic!(...))
with match expressions to eliminate closures from macro expansions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
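
A minimal sketch of the closure-free pattern this commit message describes. The macro name `to_u8_or_panic!` and its `u8` target are hypothetical, chosen only to keep the example self-contained; they are not taken from the PR.

```rust
/// Hypothetical macro (not from this PR): converts `$value` to `u8`,
/// panicking with the conversion error if the value does not fit.
///
/// A closure-based body would read
/// `try_into($value).unwrap_or_else(|e| panic!(...))`; the `match` below
/// behaves the same way without placing a closure in the expansion.
#[macro_export]
macro_rules! to_u8_or_panic {
    ($value:expr) => {
        match ::std::convert::TryInto::<u8>::try_into($value) {
            Ok(v) => v,
            Err(e) => panic!("conversion to u8 failed: {:?}", e),
        }
    };
}
```

Calling `to_u8_or_panic!(200u32)` yields `200u8`, while `to_u8_or_panic!(300u32)` panics because 300 does not fit in a `u8`.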
Follow-up to dropping the function_runner feature flag.
Gate test_helpers module and function_runner module under test_utils,
and update the doc comment in function_runner.rs accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator Author

naor-starkware commented Apr 6, 2026

@naor-starkware naor-starkware changed the title from "feat: add test_helpers module (error_utils, test_utils) behind function_runner flag" to "feat: add test_helpers module (error_utils, test_utils) behind test_utlis flag" on Apr 6, 2026
@github-actions

github-actions bot commented Apr 6, 2026

**Hyper Threading Benchmark results**




hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     22.849 s ±  0.164 s    [User: 22.198 s, System: 0.649 s]
  Range (min … max):   22.733 s … 22.965 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     22.818 s ±  0.015 s    [User: 22.157 s, System: 0.658 s]
  Range (min … max):   22.807 s … 22.829 s    2 runs
 
Summary
  hyper_threading_pr threads: 1 ran
    1.00 ± 0.01 times faster than hyper_threading_main threads: 1




hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     12.376 s ±  0.028 s    [User: 22.389 s, System: 0.668 s]
  Range (min … max):   12.356 s … 12.396 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     12.331 s ±  0.067 s    [User: 22.442 s, System: 0.664 s]
  Range (min … max):   12.284 s … 12.378 s    2 runs
 
Summary
  hyper_threading_pr threads: 2 ran
    1.00 ± 0.01 times faster than hyper_threading_main threads: 2




hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):      9.576 s ±  0.142 s    [User: 35.698 s, System: 0.781 s]
  Range (min … max):    9.475 s …  9.677 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):      9.961 s ±  0.344 s    [User: 35.035 s, System: 0.738 s]
  Range (min … max):    9.718 s … 10.205 s    2 runs
 
Summary
  hyper_threading_main threads: 4 ran
    1.04 ± 0.04 times faster than hyper_threading_pr threads: 4




hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):      9.900 s ±  0.135 s    [User: 35.448 s, System: 0.818 s]
  Range (min … max):    9.804 s …  9.995 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):      9.641 s ±  0.013 s    [User: 35.812 s, System: 0.819 s]
  Range (min … max):    9.632 s …  9.651 s    2 runs
 
Summary
  hyper_threading_pr threads: 6 ran
    1.03 ± 0.01 times faster than hyper_threading_main threads: 6




hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):      9.487 s ±  0.072 s    [User: 36.030 s, System: 0.794 s]
  Range (min … max):    9.437 s …  9.538 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):      9.385 s ±  0.019 s    [User: 36.152 s, System: 0.758 s]
  Range (min … max):    9.372 s …  9.399 s    2 runs
 
Summary
  hyper_threading_pr threads: 8 ran
    1.01 ± 0.01 times faster than hyper_threading_main threads: 8




hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):      9.687 s ±  0.257 s    [User: 36.190 s, System: 0.844 s]
  Range (min … max):    9.505 s …  9.868 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):      9.526 s ±  0.124 s    [User: 35.971 s, System: 0.830 s]
  Range (min … max):    9.438 s …  9.614 s    2 runs
 
Summary
  hyper_threading_pr threads: 16 ran
    1.02 ± 0.03 times faster than hyper_threading_main threads: 16


Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov

codecov bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 95.72650% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.07%. Comparing base (f7ac327) to head (9b1b8f5).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vm/src/test_helpers/error_utils.rs | 95.18% | 4 Missing ⚠️ |
| vm/src/test_helpers/test_utils.rs | 96.87% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #2381    +/-   ##
========================================
  Coverage   96.07%   96.07%            
========================================
  Files         105      107     +2     
  Lines       37737    37852   +115     
========================================
+ Hits        36254    36366   +112     
- Misses       1483     1486     +3     


@github-actions

github-actions bot commented Apr 6, 2026

Benchmark Results for unmodified programs 🚀

| Command | Unit | Mean | Min | Max | Relative |
|---|---|---|---|---|---|
| base big_factorial | s | 2.133 ± 0.025 | 2.110 | 2.187 | 1.01 ± 0.01 |
| head big_factorial | s | 2.121 ± 0.008 | 2.105 | 2.129 | 1.00 |
| base big_fibonacci | s | 2.063 ± 0.015 | 2.049 | 2.100 | 1.00 |
| head big_fibonacci | s | 2.069 ± 0.019 | 2.051 | 2.116 | 1.00 ± 0.01 |
| base blake2s_integration_benchmark | s | 7.551 ± 0.197 | 7.402 | 8.054 | 1.01 ± 0.03 |
| head blake2s_integration_benchmark | s | 7.460 ± 0.064 | 7.400 | 7.621 | 1.00 |
| base compare_arrays_200000 | s | 2.202 ± 0.017 | 2.176 | 2.230 | 1.00 |
| head compare_arrays_200000 | s | 2.206 ± 0.012 | 2.190 | 2.228 | 1.00 ± 0.01 |
| base dict_integration_benchmark | s | 1.440 ± 0.005 | 1.434 | 1.446 | 1.00 ± 0.01 |
| head dict_integration_benchmark | s | 1.435 ± 0.008 | 1.426 | 1.450 | 1.00 |
| base field_arithmetic_get_square_benchmark | s | 1.234 ± 0.013 | 1.225 | 1.268 | 1.00 ± 0.01 |
| head field_arithmetic_get_square_benchmark | s | 1.233 ± 0.010 | 1.224 | 1.248 | 1.00 |
| base integration_builtins | s | 7.549 ± 0.044 | 7.505 | 7.646 | 1.00 |
| head integration_builtins | s | 7.578 ± 0.020 | 7.550 | 7.616 | 1.00 ± 0.01 |
| base keccak_integration_benchmark | s | 7.644 ± 0.030 | 7.603 | 7.714 | 1.00 |
| head keccak_integration_benchmark | s | 7.676 ± 0.030 | 7.613 | 7.706 | 1.00 ± 0.01 |
| base linear_search | s | 2.191 ± 0.009 | 2.179 | 2.205 | 1.00 |
| head linear_search | s | 2.235 ± 0.060 | 2.179 | 2.394 | 1.02 ± 0.03 |
| base math_cmp_and_pow_integration_benchmark | s | 1.520 ± 0.018 | 1.508 | 1.564 | 1.00 |
| head math_cmp_and_pow_integration_benchmark | s | 1.526 ± 0.008 | 1.518 | 1.548 | 1.00 ± 0.01 |
| base math_integration_benchmark | s | 1.483 ± 0.011 | 1.468 | 1.503 | 1.00 |
| head math_integration_benchmark | s | 1.484 ± 0.013 | 1.470 | 1.512 | 1.00 ± 0.01 |
| base memory_integration_benchmark | s | 1.234 ± 0.004 | 1.231 | 1.241 | 1.00 |
| head memory_integration_benchmark | s | 1.238 ± 0.027 | 1.224 | 1.313 | 1.00 ± 0.02 |
| base operations_with_data_structures_benchmarks | s | 1.552 ± 0.008 | 1.542 | 1.572 | 1.00 |
| head operations_with_data_structures_benchmarks | s | 1.572 ± 0.028 | 1.550 | 1.642 | 1.01 ± 0.02 |
| base pedersen | ms | 536.9 ± 2.4 | 533.3 | 541.5 | 1.00 |
| head pedersen | ms | 537.1 ± 4.4 | 533.8 | 548.9 | 1.00 ± 0.01 |
| base poseidon_integration_benchmark | ms | 622.8 ± 7.5 | 614.3 | 635.5 | 1.00 |
| head poseidon_integration_benchmark | ms | 629.2 ± 16.0 | 611.4 | 649.9 | 1.01 ± 0.03 |
| base secp_integration_benchmark | s | 1.854 ± 0.025 | 1.833 | 1.920 | 1.00 |
| head secp_integration_benchmark | s | 1.862 ± 0.024 | 1.823 | 1.912 | 1.00 ± 0.02 |
| base set_integration_benchmark | ms | 668.6 ± 3.0 | 664.6 | 674.8 | 1.00 ± 0.01 |
| head set_integration_benchmark | ms | 668.3 ± 3.2 | 664.4 | 676.4 | 1.00 |
| base uint256_integration_benchmark | s | 4.252 ± 0.019 | 4.226 | 4.285 | 1.00 |
| head uint256_integration_benchmark | s | 4.262 ± 0.019 | 4.237 | 4.287 | 1.00 ± 0.01 |

@naor-starkware naor-starkware marked this pull request as ready for review April 9, 2026 06:06
Collaborator

@YairVaknin-starkware YairVaknin-starkware left a comment


@YairVaknin-starkware reviewed 9 files and all commit messages, and made 4 comments.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on naor-starkware).


vm/src/test_helpers/error_utils.rs line 44 at r2 (raw file):

/// Type alias for check functions that validate test results.
pub type VmCheck<T> = fn(&std::result::Result<T, CairoRunError>);

can just be Result, right? pls look for similar instances where you can shorten.

Code quote:

&std::result::Result<T, CairoRunError>
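
A small sketch of the shortening suggested here. `CairoRunError` is a stand-in declared only so the snippet compiles on its own, and `VmCheckLong` and `expect_any_error` are hypothetical names used for comparison.

```rust
/// Stand-in for the crate's error type, declared so the sketch compiles alone.
#[derive(Debug)]
pub struct CairoRunError;

/// Long spelling, as quoted in the review.
pub type VmCheckLong<T> = fn(&std::result::Result<T, CairoRunError>);

/// Short spelling; `Result` resolves to `std::result::Result` via the prelude.
pub type VmCheck<T> = fn(&Result<T, CairoRunError>);

/// Hypothetical check function: accepts only `Err` results.
pub fn expect_any_error(res: &Result<u32, CairoRunError>) {
    assert!(res.is_err(), "expected an error");
}
```

Both aliases name the same type, so a value of one can be assigned to the other. The two spellings are only equivalent when the module does not import its own `Result` alias that shadows the prelude one.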

vm/src/test_helpers/error_utils.rs line 62 at r2 (raw file):

}

/// Asserts that the result is `HintError::AssertNotEqualFail`.

These funcs have very repetitive boilerplate. Pls extract it into a single func that takes the res and the predicate you check, and have these funcs invoke it.

Code quote:

/// Asserts that the result is `HintError::AssertNotEqualFail`.
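
One way the requested extraction could look, as a sketch rather than the code that landed in the PR. The enums are simplified stand-ins (the real variants carry payloads and may be nested differently), and `expect_error` is a hypothetical helper name.

```rust
use std::fmt::Debug;

/// Simplified stand-ins for the crate's error types.
#[derive(Debug)]
pub enum HintError {
    AssertNotEqualFail,
    AssertNotZero,
}

#[derive(Debug)]
pub enum CairoRunError {
    Hint(HintError),
    Other(String),
}

/// Shared helper: panics unless `res` is an `Err` accepted by `pred`.
/// `expected` is only used to build the panic message.
pub fn expect_error<T: Debug>(
    res: &Result<T, CairoRunError>,
    expected: &str,
    pred: impl Fn(&CairoRunError) -> bool,
) {
    match res {
        Ok(v) => panic!("expected {}, got Ok({:?})", expected, v),
        Err(e) => assert!(pred(e), "expected {}, got {:?}", expected, e),
    }
}

/// Asserts that the result is `HintError::AssertNotEqualFail`.
pub fn expect_hint_assert_not_equal<T: Debug>(res: &Result<T, CairoRunError>) {
    expect_error(res, "HintError::AssertNotEqualFail", |e| {
        matches!(e, CairoRunError::Hint(HintError::AssertNotEqualFail))
    });
}

/// Asserts that the result is `HintError::AssertNotZero`.
pub fn expect_hint_assert_not_zero<T: Debug>(res: &Result<T, CairoRunError>) {
    expect_error(res, "HintError::AssertNotZero", |e| {
        matches!(e, CairoRunError::Hint(HintError::AssertNotZero))
    });
}
```

Each public checker shrinks to a label plus a `matches!` predicate, and the `Ok` and mismatch branches live in one place.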

vm/src/test_helpers/error_utils.rs line 220 at r2 (raw file):

    }

    /// `expect_hint_assert_not_zero` does not panic on `HintError::AssertNotZero`.

these error tests could be parameterized using rstest to reduce a lot of repetitive boilerplate.
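
A dependency-free sketch of the same idea: the repeated tests collapse into a table of cases run by one loop. It deliberately uses only `std` so it compiles without extra crates; the types and the names `Case`, `cases`, and `failing_cases` are hypothetical stand-ins, not the PR's code.

```rust
/// Simplified stand-in for the crate's hint error type.
#[derive(Debug)]
pub enum HintError {
    AssertNotZero,
    AssertNotEqualFail,
}

pub type HintResult = Result<(), HintError>;

pub fn is_assert_not_zero(res: &HintResult) -> bool {
    matches!(res, Err(HintError::AssertNotZero))
}

pub fn is_assert_not_equal(res: &HintResult) -> bool {
    matches!(res, Err(HintError::AssertNotEqualFail))
}

/// One row of the table: an input, the checker under test, and whether the
/// checker is expected to accept that input.
pub struct Case {
    pub name: &'static str,
    pub input: HintResult,
    pub check: fn(&HintResult) -> bool,
    pub accepts: bool,
}

pub fn cases() -> Vec<Case> {
    vec![
        Case { name: "not_zero accepts AssertNotZero", input: Err(HintError::AssertNotZero), check: is_assert_not_zero, accepts: true },
        Case { name: "not_zero rejects AssertNotEqualFail", input: Err(HintError::AssertNotEqualFail), check: is_assert_not_zero, accepts: false },
        Case { name: "not_equal accepts AssertNotEqualFail", input: Err(HintError::AssertNotEqualFail), check: is_assert_not_equal, accepts: true },
        Case { name: "not_equal rejects Ok", input: Ok(()), check: is_assert_not_equal, accepts: false },
    ]
}

/// Runs every row and returns the names of rows whose outcome differs from
/// the `accepts` column.
pub fn failing_cases(cases: &[Case]) -> Vec<&'static str> {
    let mut failed = Vec::new();
    for case in cases {
        if (case.check)(&case.input) != case.accepts {
            failed.push(case.name);
        }
    }
    failed
}
```

With rstest, each row would instead become a `#[case(...)]` attribute on a single `#[rstest]` function, so the test runner reports every case separately.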


vm/src/test_helpers/test_utils.rs line 48 at r2 (raw file):

            Ok(v) => v,
            Err(e) => panic!("conversion to MaybeRelocatable failed: {e:?}"),
        };

pls factor out the conversion logic of both cases into a single func which also enforces coercion into MaybeRelocatable (currently it works for any right that can be try_into'd into left's type).

Code quote:

        let right_mr = match ($right).try_into() {
            Ok(v) => v,
            Err(e) => panic!("conversion to MaybeRelocatable failed: {e:?}"),
        };
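
A sketch of how a single conversion function can pin the target type, assuming a simplified stand-in for `MaybeRelocatable`. The helper name `to_mr` and the `TryFrom<i64>` impl are hypothetical and exist only to make the example self-contained.

```rust
use std::convert::{TryFrom, TryInto};
use std::fmt::Debug;

/// Simplified stand-in for cairo-vm's `MaybeRelocatable`.
#[derive(Debug, Clone, PartialEq)]
pub enum MaybeRelocatable {
    Int(u64),
    RelocatableValue(isize, usize),
}

impl From<u64> for MaybeRelocatable {
    fn from(value: u64) -> Self {
        MaybeRelocatable::Int(value)
    }
}

impl From<(isize, usize)> for MaybeRelocatable {
    fn from((segment, offset): (isize, usize)) -> Self {
        MaybeRelocatable::RelocatableValue(segment, offset)
    }
}

/// Fallible conversion, included so the panic branch can be exercised.
impl TryFrom<i64> for MaybeRelocatable {
    type Error = String;

    fn try_from(value: i64) -> Result<Self, Self::Error> {
        u64::try_from(value)
            .map(MaybeRelocatable::Int)
            .map_err(|_| format!("negative value: {}", value))
    }
}

/// Single conversion point. The return type is fixed, so the target can no
/// longer be inferred from whatever type the other operand happens to have.
pub fn to_mr<T>(value: T) -> MaybeRelocatable
where
    T: TryInto<MaybeRelocatable>,
    <T as TryInto<MaybeRelocatable>>::Error: Debug,
{
    match value.try_into() {
        Ok(v) => v,
        Err(e) => panic!("conversion to MaybeRelocatable failed: {:?}", e),
    }
}

/// Both operands go through `to_mr` before being compared.
macro_rules! assert_mr_eq {
    ($left:expr, $right:expr) => {
        assert_eq!(to_mr($left), to_mr($right))
    };
}
```

Because `to_mr` names `MaybeRelocatable` in its signature, an operand that cannot convert into it is rejected at compile time instead of silently converting into some other left-hand type.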

naor-starkware and others added 3 commits April 9, 2026 12:13
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nt::Internal

These errors arrive wrapped as Hint(Internal(...)) since they originate
inside hint execution, not as bare VirtualMachineError variants.
Remove now-unused expect_vm_error helper and vm_err test helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
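
A sketch of matching through the wrapping this commit message describes, assuming the nesting is `Hint(Internal(..))` as stated. The enums are simplified stand-ins, and `OutOfBounds` and `internal_vm_error` are hypothetical names used only for illustration.

```rust
/// Simplified stand-ins for the crate's error enums.
#[derive(Debug, PartialEq)]
pub enum VirtualMachineError {
    /// Hypothetical variant standing in for an error raised during a hint.
    OutOfBounds,
    Hint(Box<HintError>),
}

#[derive(Debug, PartialEq)]
pub enum HintError {
    Internal(VirtualMachineError),
    Other(String),
}

/// Returns the VM error carried inside `Hint(Internal(..))`, or `None` when
/// `err` is a bare variant or wraps some other hint error.
pub fn internal_vm_error(err: &VirtualMachineError) -> Option<&VirtualMachineError> {
    match err {
        VirtualMachineError::Hint(hint) => match &**hint {
            HintError::Internal(inner) => Some(inner),
            _ => None,
        },
        _ => None,
    }
}
```

A checker written against the bare variant would miss the wrapped form, which is why a helper that only matches bare `VirtualMachineError` values stops being useful once the errors arrive wrapped.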
Copy link
Copy Markdown
Collaborator Author

@naor-starkware naor-starkware left a comment


@naor-starkware made 4 comments.
Reviewable status: 7 of 10 files reviewed, 4 unresolved discussions (waiting on YairVaknin-starkware).


vm/src/test_helpers/error_utils.rs line 44 at r2 (raw file):

Previously, YairVaknin-starkware wrote…

can just be Result, right? pls look for similar instances where you can shorten.

Done.


vm/src/test_helpers/error_utils.rs line 62 at r2 (raw file):

Previously, YairVaknin-starkware wrote…

These funcs have very repetitive boilerplate. Pls extract it into a single func that takes the res and the predicate you check, and have these funcs invoke it.

Done.


vm/src/test_helpers/error_utils.rs line 220 at r2 (raw file):

Previously, YairVaknin-starkware wrote…

these error tests could be parameterized using rstest to reduce a lot of repetitive boilerplate.

Done.


vm/src/test_helpers/test_utils.rs line 48 at r2 (raw file):

Previously, YairVaknin-starkware wrote…

pls factor out the conversion logic of both cases into a single func which also enforces coercion into MaybeRelocatable (currently it works for any right that can be try_into'd into left's type).

Done.

2 participants