Switch the Goldilocks field (p = 2^64 - 2^32 + 1) from Fp64<MontBackend<FConfig64, 1>> to SmallFp<FConfig64> using #[derive(SmallFpConfig)]. This leverages native u64 Montgomery arithmetic for 15-22% faster sumcheck and 4-12% faster NTT operations.

Key changes:
- fields.rs: SmallFp field definition with const fn goldilocks_mont() for compile-time Montgomery constant computation in Fp2/Fp3 extension configs
- Cargo.toml: patch ark-ff/ark-std/ark-serialize to git master for SmallFp; update spongefish to patched version fixing BigInt<2> encoding/deserialization
- cooley_tukey.rs: fix test using BigInt<2> (16 bytes) instead of BigInt<1> (8 bytes)

Spongefish patches (in /tmp/spongefish-patched):
- Fix impl_encoding! macro: drain leading zeros from BigInt<2>.to_bytes_be()
- Fix impl_deserialize! macro: use BigInt::from_bits_be + from_bigint instead of from_be_bytes_mod_order (broken for SmallFp due to from_raw Montgomery bug)
… fix
- spongefish: z-tech/spongefish rev 2613967 (encoding + NargDeserialize fixes for SmallFp extension fields)
- ark-ff: algebra a2d4d660 (includes #1082: fix SmallFp from_random_bytes Montgomery confusion)
Merging this PR will not alter performance
[patch.crates-io]
ark-ff = { git = "https://github.com/arkworks-rs/algebra" }
ark-std = { git = "https://github.com/arkworks-rs/std" }
ark-serialize = { git = "https://github.com/arkworks-rs/algebra" }
Let's wait until this is in a published ark-ff crate before merging?
I prefer not depending on main branches directly (even though we already do it for spongefish, but at least we pin a rev).
Depending on an unpegged main branch is pretty fragile. For example, everything would break if ark-std decides to finally upgrade their rand version.
We'd be happy to get a list of items you think could be cleaned up if you want to chat btw.
/// Since p ≈ 2^64, we have k = 64 and R = 2^64.
///
/// Montgomery form of `v` is `v · R mod p`, computed in u128 to avoid overflow.
const fn goldilocks_mont(v: u64) -> u64 {
Why is this not part of SmallFp? I would expect an
impl SmallFp {
    pub const fn from_u64(n: u64) -> Self;
}
or similar being generated by the macro that does exactly this.
It also leaks the abstraction, in that it assumes SmallFp uses the Montgomery representation, which is not an obvious choice for small fields (e.g. M31 works better without it, IIRC).
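For readers following the thread, the conversion under discussion can be sketched roughly as below. This is only an illustration under stated assumptions: `P`, `Gl`, and `ONE` are hypothetical names introduced here, `Gl` is a stand-in and not the ark-ff `SmallFp` type, and the body of `goldilocks_mont` is a guess based on the doc comment in the diff (`v · R mod p` with R = 2^64, computed in u128), not the PR's actual code.

```rust
/// Goldilocks modulus p = 2^64 - 2^32 + 1.
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Montgomery form of `v` with R = 2^64: (v * R) mod p.
/// The product is widened to u128 so the shift cannot overflow.
const fn goldilocks_mont(v: u64) -> u64 {
    (((v as u128) << 64) % (P as u128)) as u64
}

/// Hypothetical stand-in for a macro-generated field type (assumption:
/// it stores the Montgomery representation internally).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Gl(u64);

impl Gl {
    /// The kind of constructor the review asks for: callers pass a plain
    /// integer and never need to know how the element is represented.
    pub const fn from_u64(n: u64) -> Self {
        Gl(goldilocks_mont(n))
    }
}

/// Usable in const context, e.g. for extension-field config constants.
const ONE: Gl = Gl::from_u64(1);
```

With a constructor like this, a backend that does not use Montgomery form (as suggested for M31) could change the body of `from_u64` without touching any call site.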
Fair point. I wrote a little helper for this here: arkworks-rs/algebra#1084
Also, we're happy to further optimize any Mersenne primes when/if you have a specific need: https://andrewzitek.xyz/smallfp-site/
I'm confident this would be a reasonable change in the framework we've created.
> Also, we're happy to further optimize any Mersenne primes when/if you have a specific need: https://andrewzitek.xyz/smallfp-site/
The main performance benefits from small fields come from using SIMD instructions (and their bigger cousins, GPUs). Unfortunately supporting this requires a very different field Trait than ark-ff::Field (I think Plonky3 got this part figured out well). In fact, we also found SIMD to be beneficial for large fields: doing 4 bn254 multiplications in parallel can be done 2x faster than 4 sequentially. In the Whir crate we are prepared to use such an API: all the large batch operations have dedicated parallel fns in the algebra module. This is where SIMD would apply. What's missing is ark-ff support.
One easy way to help us is if ark-ff::Field implements the zerocopy traits. Right now certain optimizations are blocked by not being able to cast from &[F] to &[[u64;LIMBS]]. Hashing a large vector of field elements, for example. Mutable casting is a bit trickier as the values are no longer guaranteed reduced, but would be very useful to have as well (you can do this cleanly using a MaybeReduced field type, similar to MaybeUninit). That would allow us to implement our own SIMD methods, for example. It gives us an efficient and safe backdoor into the field internals. (Also vec![F::ZERO; large_size] is very slow right now; implementing zerocopy::FromZeros would give us a workaround).
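To make the requested cast concrete, here is a minimal sketch of the immutable direction. It uses only the standard library and a hypothetical `Elem` type; `Elem`, `LIMBS = 4`, and `as_limbs` are assumptions for illustration and are not ark-ff or zerocopy APIs. The zerocopy traits would let a function like this be written without the `unsafe` block.

```rust
use std::mem::{align_of, size_of};

const LIMBS: usize = 4;

/// Hypothetical field element (NOT the ark-ff type). `repr(transparent)`
/// guarantees it has exactly the layout of its single field.
#[repr(transparent)]
#[derive(Clone, Copy)]
struct Elem([u64; LIMBS]);

/// View a slice of elements as a slice of raw limb arrays without copying.
fn as_limbs(elems: &[Elem]) -> &[[u64; LIMBS]] {
    // Compile-time check that the two element types share size and alignment.
    const _: () = assert!(
        size_of::<Elem>() == size_of::<[u64; LIMBS]>()
            && align_of::<Elem>() == align_of::<[u64; LIMBS]>()
    );
    // SAFETY: `Elem` is `repr(transparent)` over `[u64; LIMBS]`, every bit
    // pattern is a valid `[u64; LIMBS]`, and the shared borrow keeps the
    // data alive and unmodified for the returned lifetime.
    unsafe { std::slice::from_raw_parts(elems.as_ptr().cast::<[u64; LIMBS]>(), elems.len()) }
}
```

The mutable direction is deliberately left out: writing arbitrary limbs through such a view could produce unreduced values, which is the problem the comment proposes to solve with a separate MaybeReduced type.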
This SIMD support is critical to us (ProveKit). If we can't find a way to do this cleanly in ark-ff we will be forced to write our own field impls.
Hi, thanks.
I've been able to squeeze good performance out of my efficient sumcheck repo (maybe of independent interest btw), but I am transmuting the memory block there and I agree that zerocopy would be better. The idea there is that if the user calls the sumcheck lib with specific primes, they get auto-dispatched into the vectorized path without ever knowing what that is or how it works.
That's kind of the vision for how vectorization would ideally be supported in Arkworks, with the user unaware of it, as summarized on slide 20 here: https://andrewzitek.xyz/images/small_fp_slides.pdf#page=20 (the slides are a bit outdated otherwise btw). We have students working on this and it's of high interest to me.
That being said, there are things to work out before arriving there and if you're able to point me toward what functionality is most important for your efforts I can do my best to prioritize these. Ideally from my end, we could select some discrete pieces and collab.
Related but different:
I am also doing a rather large effort on Merkle trees with both security and performance enhancements. I think it's relevant to your projects; if you want a sneak peek, lmk.
Generally speaking, the hope for all of these components (vectorization, sumcheck, vector commitments) is that they are easy to integrate and support what you're doing well. Appreciate the feedback and would like to have a closer loop in the future.
What does this PR do?
Results of running cargo bench
SmallFp vs Fp64<MontBackend> for the Goldilocks field (--release).
Interleaved RS Encode (median)
[table columns: Fp64 | SmallFp]
Sumcheck First Round (median)
[table columns: Fp64 | SmallFp]