[AutoDiff] Autodiff 6: Adstack regression tests #491
base: duburcqa/split_adjoint_alloca_placement
@@ -320,3 +320,162 @@
    compute.grad()

    assert math.isnan(x.grad[None])

def _run_basic_gradient(qd_dtype, n_iter, rel_tol, approx=test_utils.approx):
    # Builds the kernel, runs forward + backward, and asserts a correct gradient. `approx` defaults to
    # `test_utils.approx`, which is correct for f32 (its backend-specific floor kicks in); f64 callers must
    # pass `pytest.approx` to honor a tight `rel_tol=1e-14` that `test_utils.approx` would otherwise floor
    # to 1e-6. The adstack is structurally required here so the backward compiler can reverse the dynamic
    # `range(n_iter)` at all; the companion `test_adstack_basic_gradient_negative` pins that disabling the
    # adstack raises `QuadrantsCompilationError("non static range")`. Value-correctness of the per-iteration
    # `v` spilled on the adstack is NOT exercised by the linear body `v = v * 0.95 + 0.01` - see
    # `test_adstack_basic_gradient`'s docstring for the details.
    n = 4
    x = qd.field(qd_dtype, shape=n, needs_grad=True)
    y = qd.field(qd_dtype, shape=(), needs_grad=True)

    @qd.kernel
    def compute():
        for i in x:
            v = x[i]
            for _ in range(n_iter):
                v = v * 0.95 + 0.01
            y[None] += v

    x_vals = [0.1, 0.3, 0.5, 0.8]
    for i, v in enumerate(x_vals):
        x[i] = v
    y[None] = 0.0
    compute()
    y.grad[None] = 1.0
    for i in range(n):
        x.grad[i] = 0.0

    compute.grad()

    # `v = v * 0.95 + 0.01` iterated n_iter times gives v_final = 0.95**n_iter * x[i] + const, so
    # dv_final/dx[i] == 0.95**n_iter independent of x[i], and dy/dx[i] equals the same quantity.
    expected = 0.95**n_iter
    for i in range(n):
        assert x.grad[i] == approx(expected, rel=rel_tol)


@pytest.mark.parametrize("n_iter", [1, 3, 10])
@test_utils.test(require=qd.extension.adstack)
def test_adstack_basic_gradient(n_iter):
    # Smallest possible "does reverse-mode AD through a for-loop work at all" check. The kernel runs `n_iter`
    # iterations of `v = v * 0.95 + 0.01` per element and asserts that `dy/dx[i]` matches the analytical
    # gradient `0.95 ** n_iter` for every element.
    #
    # Internal details: the adstack is structurally required here so the backward compiler can reverse the
    # dynamic `range(n_iter)` at all - the companion `test_adstack_basic_gradient_negative` pins that
    # disabling the adstack raises `QuadrantsCompilationError` in exactly this kernel shape.
    # Value-correctness of the stored v, on the other hand, is NOT exercised: the loop body
    # `v = v * 0.95 + 0.01` is linear, so the backward chain `adj(v_prev) = 0.95 * adj(v_next)` only uses
    # the compile-time constant 0.95 and never reads v from the adstack. A broken push/load/pop that
    # returned garbage for v would still produce the exact same gradient. For push/load/pop
    # value-correctness coverage, see `test_adstack_unary_loop_carried` (non-linear unary ops in the loop
    # body). `n_iter = 1` exercises the single-push adstack code path; `n_iter = 10` exercises repeated
    # push/pop under one forward invocation; multi-element coverage (n = 4) guards against per-element
    # accumulation bugs that a single-element variant would miss.
    _run_basic_gradient(qd.f32, n_iter=n_iter, rel_tol=1e-6)


@pytest.mark.parametrize("n_iter", [1, 3, 10])
@test_utils.test(require=[qd.extension.adstack, qd.extension.data64], default_fp=qd.f64)
def test_adstack_basic_gradient_f64(n_iter):
    # f64 counterpart of `test_adstack_basic_gradient`. Uses `pytest.approx` so the tight `rel_tol=1e-14` is
    # honored; `test_utils.approx` would floor it to `get_rel_eps()` (typically 1e-6) and silently pass an
    # f32-precision regression.
    _run_basic_gradient(qd.f64, n_iter=n_iter, rel_tol=1e-14, approx=pytest.approx)
Check warning on line 390 in tests/python/test_adstack.py
claude[bot] commented on lines +384 to +390:

🟡 `pytest.approx(expected, rel=1e-14)` uses a default `abs=1e-12` floor, so for all expected values in both f64 tests (0.95**n_iter in [~0.6, 0.95] in `test_adstack_basic_gradient_f64`, and the integers {1, 3, 6, 10, 55} in `test_adstack_sum_linear_f64`) the effective tolerance is 1e-12 absolute rather than 1e-14 relative, making the docstring claim "the tight rel_tol=1e-14 is honored" false by ~100x. Fix `test_adstack_basic_gradient_f64` by passing `abs=0` to `pytest.approx`; for `test_adstack_sum_linear_f64`, note that even with `abs=0` the integer-valued expected gradients are exactly representable in f32, so a type-narrowing regression in the f64 backward pass would produce identical results and go undetected regardless of tolerance.

Extended reasoning:

What the bugs are and how they manifest. `pytest.approx`'s effective tolerance formula is max(|expected| * rel, abs), where abs defaults to 1e-12. In `test_adstack_basic_gradient_f64`, expected = 0.95**n_iter ranges from ~0.599 (n_iter=10) to 0.95 (n_iter=1), giving a relative component of at most 0.95 * 1e-14 = 9.5e-15, far below the 1e-12 abs floor. The floor therefore dominates for every parametrized value, and the assertion is effectively `pytest.approx(expected, abs=1e-12)`, not rel=1e-14. The docstring at lines 387-389 states "the tight rel_tol=1e-14 is honored", which is off by roughly two orders of magnitude.

The specific code path that triggers it. `_run_basic_gradient(qd.f64, ..., rel_tol=1e-14, approx=pytest.approx)` calls `pytest.approx(expected, rel=1e-14)` at line 370. `pytest.approx` computes the relative component as expected * rel = 0.95 * 1e-14 ≈ 9.5e-15; the final tolerance is max(9.5e-15, 1e-12) = 1e-12. A backward pass that degrades to only 1e-13 relative precision (still 1e5x better than f32) would produce an error of ~6e-14 for expected ~0.6, below the 1e-12 floor, and would pass silently.

The integer-exactness compounding issue in `test_adstack_sum_linear_f64`. In `_run_sum_linear` the expected gradient is either float(n_iter) in {1.0, 3.0, 10.0} or sum(a + 1 for a in range(n_iter)) in {1, 6, 55}. All of these are small integers exactly representable in IEEE 754 f32 (which represents integers exactly up to 2**24). Even with the abs floor, the relative component for the largest case is 55 * 1e-14 = 5.5e-13 < 1e-12, so the floor still dominates there too. More critically, a type-narrowing regression where the f64 backward pass silently computes in f32 would produce byte-identical integer results, and no tolerance tightness can detect it: the f64 and f32 values are indistinguishable. This contrasts with `test_adstack_basic_gradient_f64`, where expected = 0.95**n_iter; Python's float64(0.95) ≈ 0.9500000000000000222 differs from float32(0.95) ≈ 0.949999988, so at n_iter=10 the f32-vs-f64 difference is ~1.26e-7, well above the 1e-12 floor, and a narrowing regression would be caught.

Why existing code does not prevent it. The PR author correctly identified that `test_utils.approx` floors rel to max(rel, get_rel_eps()) >= 1e-6 and switched to `pytest.approx` specifically to bypass that floor. The fix is sound in intent: `pytest.approx` does not apply the backend-specific floor. However, `pytest.approx` has its own unconditional abs=1e-12 default that was not accounted for, and pytest emits no warning when the abs floor silently dominates a caller-specified rel tolerance.

Impact. Both tests still usefully verify that (a) the f64 code path compiles and runs, and (b) f32-precision regressions (~1e-7 absolute for basic_gradient) are caught, since the 1e-12 abs floor is ~1e5x tighter than f32 error. What is lost is detection of sub-1e-12-absolute intermediate-precision regressions (e.g. 1e-13 relative accuracy, much better than f32 but not full f64) and, for sum_linear, any type-narrowing regression whatsoever, due to the integer-exact expected values. The docstring's stated guarantee is therefore overstated by ~100x for basic_gradient and entirely vacuous for sum_linear.

How to fix it. For `test_adstack_basic_gradient_f64`: pass abs=0, i.e. `pytest.approx(expected, rel=1e-14, abs=0)`, so the effective tolerance becomes |expected| * 1e-14 ≈ 6e-15 for expected ~0.6, genuinely tighter than f32's ~6e-8. For `test_adstack_sum_linear_f64`: the integer-exact expected gradients mean no tolerance can distinguish f32 from f64; the honest fix is either to document that this test provides dtype-correctness coverage (the f64 code path compiles and runs) but not numerical-precision coverage, or to introduce a non-integer expected value (e.g. a fractional coefficient in the kernel) that differs between f32 and f64.

Step-by-step proof for `test_adstack_basic_gradient_f64` with n_iter=10
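The reviewer's tolerance arithmetic is easy to confirm directly. The sketch below (assuming `pytest` is importable; the `narrow_to_f32` helper is illustrative, not from the test suite) reproduces both the `abs=1e-12` floor and the f32 integer-exactness point:

```python
import struct

import pytest


def narrow_to_f32(x: float) -> float:
    # Round-trip a Python float (f64) through IEEE 754 single precision.
    return struct.unpack("f", struct.pack("f", x))[0]


# pytest.approx's effective tolerance is max(|expected| * rel, abs), and abs
# defaults to 1e-12. For expected ~0.6 the requested rel=1e-14 contributes only
# ~6e-15, so the absolute floor dominates.
expected = 0.95**10  # ~0.5987, the tightest case in the parametrization

# An error of 1e-13, ~100x larger than the requested relative tolerance allows,
# still passes because of the abs=1e-12 floor...
assert expected + 1e-13 == pytest.approx(expected, rel=1e-14)

# ...and is only rejected once the floor is disabled with abs=0.
assert not (expected + 1e-13 == pytest.approx(expected, rel=1e-14, abs=0))

# The sum_linear expected gradients are small integers, exactly representable
# in f32, so an f64 -> f32 narrowing there is lossless and undetectable:
for g in (1.0, 3.0, 6.0, 10.0, 55.0):
    assert narrow_to_f32(g) == g

# By contrast, 0.95 is not f32-exact, so narrowing that test's gradient
# would produce a detectable difference:
assert narrow_to_f32(0.95) != 0.95
```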
@pytest.mark.parametrize("n_iter", [1, 3, 10])
@test_utils.test(ad_stack_experimental_enabled=False)
def test_adstack_basic_gradient_negative(n_iter):
    # Negative counterpart of `test_adstack_basic_gradient`: with the adstack disabled, the backward compiler
    # cannot reverse a dynamic `range(n_iter)`, so `compute.grad()` raises `QuadrantsCompilationError("Cannot
    # use non static range in Backwards mode")` deterministically for every `n_iter`. Inlined rather than
    # reusing `_run_basic_gradient` because the shall-not-pass path never reaches the gradient assertion, so
    # a shared helper would carry a dead `rel_tol` argument down this branch.
    n = 4
    x = qd.field(qd.f32, shape=n, needs_grad=True)
    y = qd.field(qd.f32, shape=(), needs_grad=True)

    @qd.kernel
    def compute():
        for i in x:
            v = x[i]
            for _ in range(n_iter):
                v = v * 0.95 + 0.01
            y[None] += v

    x_vals = [0.1, 0.3, 0.5, 0.8]
    for i, v in enumerate(x_vals):
        x[i] = v
    y[None] = 0.0
    compute()

    with pytest.raises(qd.QuadrantsCompilationError, match=r"non static range"):
        compute.grad()

def _run_sum_linear(qd_dtype, use_static_loop, use_varying_coeff, n_iter, rel_tol, approx=test_utils.approx):
    n = 4
    x = qd.field(qd_dtype, shape=n, needs_grad=True)
    y = qd.field(qd_dtype, shape=(), needs_grad=True)

    @qd.kernel
    def compute():
        for i in x:
            v = x[i]
            for a in qd.static(range(n_iter)) if qd.static(use_static_loop) else range(n_iter):
                if qd.static(use_varying_coeff):
                    y[None] += v * qd.cast(a + 1, qd_dtype)
                else:
                    y[None] += v

    x_vals = [0.1, 0.3, 0.5, 0.8]
    for i, v in enumerate(x_vals):
        x[i] = v
    y[None] = 0.0
    compute()
    y.grad[None] = 1.0
    for i in range(n):
        x.grad[i] = 0.0
    compute.grad()

    expected = sum(a + 1 for a in range(n_iter)) if use_varying_coeff else float(n_iter)
    for i in range(n):
        assert x.grad[i] == approx(expected, rel=rel_tol)


@pytest.mark.parametrize("n_iter", [1, 3, 10])
@pytest.mark.parametrize("use_static_loop", [True, False])
@pytest.mark.parametrize("use_varying_coeff", [True, False])
@test_utils.test(require=qd.extension.adstack)
def test_adstack_sum_linear(use_static_loop, use_varying_coeff, n_iter):
    # Linear accumulation `y = sum_j v * coeff_j` across all four combinations of (static-unrolled vs dynamic
    # loop) x (constant coefficient vs loop-index-varying coefficient), at three loop lengths. Replaces the
    # earlier three separate tests (`test_adstack_sum_fixed_coeff`, `test_adstack_sum_constant_coeffs`,
    # `test_adstack_sum_static_loop_correct`) with a single parametrized version so every branch of that
    # truth table is covered at each trip count.
    #
    # Internal details: this test deliberately does not mutate `v` inside the loop, so the reverse pass does
    # not require adstack replay of `v` to compute the right gradient - `v`'s per-iteration value is the
    # same `x[i]`. The point of this test is therefore not to stress the adstack (that is
    # `test_adstack_basic_gradient`'s job) but to prove that enabling the adstack extension does not
    # silently regress linear reverse-mode AD for either unrolled or dynamic loop shapes. No negative
    # counterpart is included: for `use_static_loop=True` the inner loop is unrolled and the backward kernel
    # contains no dynamic range, so the adstack option does not change the gradient; for
    # `use_static_loop=False` disabling the adstack would raise `QuadrantsCompilationError` (the same
    # compile-time rejection covered by `test_adstack_basic_gradient_negative`), which is out of scope here.
    _run_sum_linear(qd.f32, use_static_loop, use_varying_coeff, n_iter, rel_tol=1e-6)


@pytest.mark.parametrize("n_iter", [1, 3, 10])
@pytest.mark.parametrize("use_static_loop", [True, False])
@pytest.mark.parametrize("use_varying_coeff", [True, False])
@test_utils.test(require=[qd.extension.adstack, qd.extension.data64], default_fp=qd.f64)
def test_adstack_sum_linear_f64(use_static_loop, use_varying_coeff, n_iter):
    # f64 counterpart uses `pytest.approx` so the tight rel_tol is not floored by `test_utils.approx`.
    _run_sum_linear(qd.f64, use_static_loop, use_varying_coeff, n_iter, rel_tol=1e-14, approx=pytest.approx)
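Independent of the qd runtime, the closed-form gradient these tests assert can be sanity-checked with a central difference in plain Python: the loop body is affine in `x`, so its derivative telescopes to exactly 0.95**n_iter. A minimal sketch (helper names are illustrative, not from the test file):

```python
def forward(x: float, n_iter: int) -> float:
    # Mirrors the kernel's per-element loop body: n_iter affine updates.
    v = x
    for _ in range(n_iter):
        v = v * 0.95 + 0.01
    return v


def central_diff(f, x: float, h: float = 1e-6) -> float:
    return (f(x + h) - f(x - h)) / (2.0 * h)


for n_iter in (1, 3, 10):
    analytic = 0.95**n_iter
    for x in (0.1, 0.3, 0.5, 0.8):  # same inputs as the tests' x_vals
        numeric = central_diff(lambda t: forward(t, n_iter), x)
        # The body is affine in x, so the central difference matches the
        # analytic derivative up to floating-point rounding noise (~1e-10).
        assert abs(numeric - analytic) < 1e-8
```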