Commit ae4b0a4
[cuda] int4: stabilize two-layer decode test via CUDA-seeded init (#20196)
_make_int4_linear built the throwaway nn.Linear on CPU, so
reset_parameters() drew from the CPU RNG between the two layer
constructions and shifted the stream that seeds the quantized weights.
That pushed test_two_layer_mlp's genuine INT4 error from 0.1405 to
0.1556, crossing the 0.15 bound. Build the module with device=cuda so
init draws from the CUDA RNG, leaving the CPU stream (and the measured
error) deterministic. Test-only; dequant math is unchanged.
1 parent 4519036 commit ae4b0a4
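The stream-shift effect described in the message can be sketched with the standard library alone. This is an analogy, not the test's actual code: two `random.Random` instances stand in for the CPU and CUDA generators, `build_two_layers` is a hypothetical stand-in for two calls to `_make_int4_linear`, and the "throwaway init" list plays the role of `reset_parameters()`.

```python
import random


def build_two_layers(init_on_cpu, seed=0, n=4):
    """Build two 'layers' whose weights are drawn from the CPU stream.

    Hypothetical stand-in for calling _make_int4_linear twice. When
    init_on_cpu is True, the throwaway init also consumes CPU draws.
    """
    cpu_rng = random.Random(seed)        # stands in for the CPU RNG
    cuda_rng = random.Random(seed + 1)   # stands in for the CUDA RNG
    init_rng = cpu_rng if init_on_cpu else cuda_rng

    layers = []
    for _ in range(2):
        # Throwaway init, analogous to reset_parameters(); values are discarded.
        _discarded = [init_rng.random() for _ in range(n)]
        # The weights that the test actually measures come from the CPU stream.
        layers.append([cpu_rng.random() for _ in range(n)])
    return layers


def reference_layers(seed=0, n=4):
    """Weights as they would be if nothing else touched the CPU stream."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n)] for _ in range(2)]
```

With `init_on_cpu=False` the weights match `reference_layers()` exactly. With `init_on_cpu=True` every throwaway draw advances the shared stream, so both layers receive different weights. This is the kind of shift the message blames for the error moving from 0.1405 to 0.1556.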
1 file changed
Lines changed: 4 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 59 | 59 | |
| 60 | 60 | |
| 61 | 61 | |
| 62 | | - |
| | 62 | + |
| | 63 | + |
| | 64 | + |
| | 65 | + |
| 63 | 66 | |
| 64 | 67 | |
| 65 | 68 | |