Conversation
Could you merge
Found a few issues while testing with pluto, so converted to draft |
Code Review Summary

This PR adds the parsigex protocol port.

Findings

[Critical] `Cargo.toml` workspace path typo — build error
Impact: The workspace will fail to build because the path points to a non-existent directory.

[High] Max message size mismatch: parsigex uses 16 MB, Go uses 128 MB
Impact: Any cluster where a message exceeds the 16 MB cap will fail to interoperate with Go peers, which accept up to 128 MB.

[High] Handler only processes one inbound stream at a time — silent drop under concurrency
Impact: If a second inbound stream is fully negotiated while the first is still being read/verified, it is silently dropped.

[Medium]
| Component | Go | Rust | Match | Notes |
|---|---|---|---|---|
| Protocol ID | /charon/parsigex/2.0.0 | /charon/parsigex/2.0.0 | yes | |
| Wire format | protobuf length-delimited | protobuf length-delimited | yes | |
| Max message size | 128 MB | 16 MB | no | See High finding above |
| DutyType integer mapping | 0–13 | 0–13 | yes | |
| Empty data set rejection | yes | yes | yes | InvalidParSignedDataSetFields |
| Duty validity check on recv | yes | yes | yes | duty_gater |
| Per-entry signature verification | yes | yes | yes | verifier callback |
| Concurrent inbound streams | yes (goroutine per stream) | no (single slot) | no | See High finding above |
| BuilderProposer rejected | yes (deprecated) | yes (DeprecatedBuilderProposer) | yes | |
Tests
Tests were not run locally. The example (crates/parsigex/examples/parsigex.rs) provides an integration-level smoke test for the broadcast path but requires a live multi-node setup. Unit tests for encode_message/decode_message round-trips and codec edge cases (empty set, unknown duty type, oversized message) appear absent — these would be worth adding.
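A round-trip unit test for the framing layer does not need a live cluster. The sketch below is hypothetical: `encode_frame`/`decode_frame` and the fixed 4-byte length prefix stand in for the real `encode_message`/`decode_message` (protobuf framing actually uses a varint prefix), but it exercises the empty-payload and oversized-message edge cases called out above.

```rust
// Hypothetical stand-ins for encode_message/decode_message: a minimal
// length-delimited frame (4-byte big-endian length prefix + payload)
// with the 16 MB cap the Rust side currently enforces.
const MAX_MESSAGE_SIZE: usize = 16 * 1024 * 1024;

fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, String> {
    if payload.len() > MAX_MESSAGE_SIZE {
        return Err(format!("message too large: {} bytes", payload.len()));
    }
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

fn decode_frame(buf: &[u8]) -> Result<Vec<u8>, String> {
    if buf.len() < 4 {
        return Err("truncated length prefix".into());
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_MESSAGE_SIZE {
        return Err(format!("message too large: {len} bytes"));
    }
    if buf.len() < 4 + len {
        return Err("truncated payload".into());
    }
    Ok(buf[4..4 + len].to_vec())
}

fn main() {
    // Round-trip of a normal payload.
    let frame = encode_frame(b"hello").unwrap();
    assert_eq!(decode_frame(&frame).unwrap(), b"hello");
    // Empty payload is still a valid frame at this layer.
    assert_eq!(decode_frame(&encode_frame(b"").unwrap()).unwrap(), b"");
    // Oversized message is rejected on encode.
    assert!(encode_frame(&vec![0u8; MAX_MESSAGE_SIZE + 1]).is_err());
    println!("ok");
}
```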
Open Questions
- Is the 16 MB cap intentional (memory-safety limit) or an oversight vs. Go's 128 MB?
- Was `.with_quic_enabled(true)` on the relay node builder intentional? The other three `Node::new_*` builders in the same file were not changed.
- `SignedDataError::Custom` is only added in `signeddata.rs` but never constructed in this PR — is it intended for follow-up work, or can it be deferred?
varex83 (agent) left a comment
Review: feat: add parsigex
This PR ports the Charon parsigex libp2p protocol to Rust. The architecture is sound — the Behaviour/Handler split, the per-stream one-shot send/receive design, and the duty codec fallback order (Attestation before VersionedAttestation, SignedAggregateAndProof before versioned) all match Go correctly.
There are 3 bugs that must be fixed before merge:
- The codec is JSON-only — Go writes SSZ-first since v0.17, causing a hard interoperability failure for all major duty types (`crates/core/src/parsigex_codec.rs`)
- `DutySentinel` encodes as `14` but `TryFrom<i32>` has no arm for `14`, making the round-trip fail with `InvalidDuty` (`crates/core/src/types.rs`)
- Broadcast with no connected peers silently drops the `request_id` without emitting a terminal event, leaving callers blocked forever (`crates/parsigex/src/behaviour.rs:304`)
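To illustrate the `DutySentinel` round-trip, here is a minimal hypothetical sketch of the integer mapping: only a few of the 0–13 variants are spelled out, and the `TryFrom<i32>` arm for `14` is the piece whose absence triggers `InvalidDuty`.

```rust
// Hypothetical subset of the DutyType mapping; the real enum covers the
// full 0–13 range, plus the sentinel at 14.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DutyType {
    Attester = 0,
    Proposer = 1,
    Randao = 2,
    DutySentinel = 14,
}

#[derive(Debug, PartialEq)]
struct InvalidDuty(i32);

impl TryFrom<i32> for DutyType {
    type Error = InvalidDuty;
    fn try_from(v: i32) -> Result<Self, Self::Error> {
        match v {
            0 => Ok(DutyType::Attester),
            1 => Ok(DutyType::Proposer),
            2 => Ok(DutyType::Randao),
            // Without this arm the sentinel round-trip fails with InvalidDuty.
            14 => Ok(DutyType::DutySentinel),
            other => Err(InvalidDuty(other)),
        }
    }
}

fn main() {
    // Round-trip: enum -> i32 -> enum, including the sentinel.
    for duty in [DutyType::Attester, DutyType::Proposer, DutyType::DutySentinel] {
        assert_eq!(DutyType::try_from(duty as i32), Ok(duty));
    }
    assert_eq!(DutyType::try_from(99), Err(InvalidDuty(99)));
    println!("ok");
}
```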
Additionally there are 8 major issues: the peerinfo max-message-size regression (64 KiB → 128 MiB), pending_broadcasts leaking on connection close, unbounded tokio::spawn in notify_subscribers, FuturesUnordered with no concurrency cap, VerifyError being silently discarded, SignedDataError::Custom not being Send + Sync, a roundabout error conversion through serde_json::Error::io, and a panic!-eligible .expect() in non-test code. Inline comments are filed for each issue individually.
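On the `Send + Sync` point: a `Box<dyn Error>` without those bounds cannot move across threads or tasks. A minimal sketch follows; only the `SignedDataError::Custom` shape is from the PR, the rest is illustrative.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
enum SignedDataError {
    // The Send + Sync bounds let the boxed error cross thread/task boundaries.
    Custom(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for SignedDataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SignedDataError::Custom(e) => write!(f, "custom: {e}"),
        }
    }
}

impl Error for SignedDataError {}

fn main() {
    let err = SignedDataError::Custom("bad signature".into());
    // Without Send + Sync on the boxed error, this handoff would not compile.
    let handle = std::thread::spawn(move || format!("{err}"));
    assert_eq!(handle.join().unwrap(), "custom: bad signature");
    println!("ok");
}
```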
```rust
/// Each subscriber is invoked in a spawned task since `poll()` is
/// synchronous. This matches Go's intended behaviour (see Go TODO to call
/// subscribers async).
fn notify_subscribers(&self, duty: Duty, data_set: ParSignedDataSet) {
```
Major: notify_subscribers spawns an unbounded number of Tokio tasks — DoS amplification and head-of-line blocking
Every verified inbound message calls tokio::spawn with no bound. Under sustained inbound traffic, the number of live tasks grows unboundedly, consuming memory proportional to message rate.
Additionally, all subscriber callbacks are awaited sequentially inside a single task. If any subscriber is slow (e.g., blocked on a back-pressured channel), all subsequent subscribers in the list are stalled for that entire duration.
Fix: feed messages through a bounded tokio::sync::mpsc channel to a dedicated delivery task. Back-pressure is then applied at the receive side rather than accumulating as spawned tasks:
```rust
// In Behaviour:
dispatch_tx: mpsc::Sender<(Duty, ParSignedDataSet)>, // bounded capacity

// In a dedicated spawned task:
while let Some((duty, set)) = rx.recv().await {
    for sub in subs.read().await.iter() {
        sub(duty.clone(), set.clone()).await;
    }
}
```
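A stdlib-only analogue of that pattern, using `std::sync::mpsc::sync_channel` in place of `tokio::sync::mpsc` and plain closures in place of async subscriber callbacks, shows where the back-pressure lands: `send` blocks once the channel is full instead of spawning more work. The `Duty`/`DataSet` aliases are stand-ins for the real types.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

type Duty = u64; // stand-in for the real Duty type
type DataSet = String; // stand-in for ParSignedDataSet

fn main() {
    // Bounded capacity: senders block once 8 messages are in flight.
    let (dispatch_tx, dispatch_rx) = sync_channel::<(Duty, DataSet)>(8);

    // Dedicated delivery thread: subscribers run here, one message at a time,
    // so a slow subscriber back-pressures the sender instead of piling up tasks.
    let subscribers: Vec<Box<dyn Fn(Duty, &DataSet) + Send>> = vec![
        Box::new(|duty, set| println!("sub1 got duty {duty}: {set}")),
        Box::new(|duty, set| println!("sub2 got duty {duty}: {set}")),
    ];
    let delivery = thread::spawn(move || {
        let mut delivered = 0;
        while let Ok((duty, set)) = dispatch_rx.recv() {
            for sub in &subscribers {
                sub(duty, &set);
                delivered += 1;
            }
        }
        delivered
    });

    for slot in 0..4 {
        dispatch_tx.send((slot, format!("set-{slot}"))).unwrap();
    }
    drop(dispatch_tx); // close the channel so the delivery thread exits

    assert_eq!(delivery.join().unwrap(), 8); // 4 messages x 2 subscribers
    println!("ok");
}
```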
Agreed that sequential execution of subscriber callbacks is not good. Still, I believe we can drop the entire subscribe mechanism, given that Rust's libp2p already uses events.
I'd revisit it a little later. I've explored other implementations, but concluded that the current one is the best way to go. The main issue arises when we use it in the exchanger, which relies heavily on the subscriber pattern.

/claude-review
Claude finished @varex83's task in 12m 36s. PR #291 — parsigex Review
Verdict: REQUEST_CHANGES — 3 bugs + 7 major findings

The architecture is sound: the Behaviour/Handler split mirrors Charon, the duty-codec fallback order (Attestation before VersionedAttestation, SignedAggregateAndProof before versioned) matches Go, and the protocol ID matches.

Bugs (must fix)

1. A broadcast that targets no connected peers drops the `request_id` without a terminal event. Fix:

```rust
if targeted == 0 {
    self.pending_events
        .push_back(ToSwarm::GenerateEvent(Event::BroadcastFailed { request_id }));
    return;
}
```

2.
Fix:
3. `SignedDataError::Custom` boxes its error without `Send + Sync` bounds.

```rust
// Fix:
Custom(Box<dyn std::error::Error + Send + Sync>),
```

Major Findings

4.
Fix: handle
5. Every inbound stream pushes a
Fix: check
6. Every verified inbound message unconditionally calls `tokio::spawn`.
Fix: use a bounded `tokio::sync::mpsc` channel.
7. AGENTS.md prohibits

```rust
pub fn encode_protobuf<M: Message>(message: &M) -> Vec<u8> {
    message.encode_to_vec()
}
```

8.
9. Converting errors through `serde_json::Error::io` is roundabout.
10. A `.expect()` in non-test code can panic.

Minor Findings
Nits
Parity Matrix
iamquang95 left a comment
Code LGTM, the only missing part is #322
Summary
Implements the partial signature exchange (parsigex) libp2p protocol — the mechanism by which distributed validator nodes share the partial BLS signatures they produce for each duty before threshold-combining them into a full aggregate.
New crate: `pluto-parsigex`

- `Behaviour` — a `NetworkBehaviour` that manages per-peer connections and routes inbound/outbound events
- `Handler` — a `ConnectionHandler`; opens an outbound substream per broadcast, accepts inbound substreams for receives
- `protocol.rs` — protocol ID (`/charon/parsigex/2.0.0`)
- `error.rs` — `Failure` (handler ↔ behaviour), `VerifyError` (caller-supplied verifier), `Error` (public)
- `Handle` — `broadcast(duty, data_set)` fans a signed set out to all connected peers; `subscribe(cb)` registers a callback for verified inbound sets

Receive pipeline (handler side):
`(Duty, ParSignedDataSet)` → `DutyGater` → `Verifier` → `Event::Received` to the behaviour on success, `Event::Error` on any failure

Core domain types (`crates/core`)

New types in `types.rs` with full protobuf round-trip support:

- `DutyType` — enum covering all validator duties (attester, proposer, builder, randao, exit, sync, …)
- `SlotNumber` / `Duty` — slot + duty type pair that identifies a signing round
- `PubKey` — 48-byte BLS public key newtype
- `ParSignedData` — a `Box<dyn SignedData>` paired with a share index
- `ParSignedDataSet` — `HashMap<PubKey, ParSignedData>` (one partial sig per validator key)
New `parsigex_codec.rs` — JSON serialise/deserialise for every `SignedData` variant, with type-erased dispatch via `Any::downcast_ref`. Fallback order for `Attester` and `Aggregator` duties matches Go's Charon for wire compatibility.

Other changes
- `crates/p2p` — wires the parsigex `Behaviour` into the combined pluto swarm behaviour
- `crates/peerinfo` — removes the local `MAX_MESSAGE_SIZE` constant in favour of the shared one from `pluto-p2p`

Test plan
- `cargo test --workspace --all-features`
- `cargo clippy --workspace --all-targets --all-features -- -D warnings`
- `cargo +nightly fmt --all --check`
- `parsigex` example end-to-end against a local cluster
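As an aside on the codec's type-erased dispatch: the `Any::downcast_ref` pattern can be sketched with the stdlib alone. The `SignedData` trait below is a hypothetical stand-in with an `as_any` hook, which is the usual way to make a trait object downcastable; the concrete types and the fallback order mirror the ones named above.

```rust
use std::any::Any;

// Hypothetical stand-in for the PR's SignedData trait: an as_any hook is
// the standard way to recover a concrete type from a Box<dyn Trait>.
trait SignedData: Any {
    fn as_any(&self) -> &dyn Any;
}

struct Attestation {
    slot: u64,
}
struct VersionedAttestation {
    slot: u64,
    version: u8,
}

impl SignedData for Attestation {
    fn as_any(&self) -> &dyn Any {
        self
    }
}
impl SignedData for VersionedAttestation {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Dispatch mirrors the codec's fallback order: try Attestation first,
// then VersionedAttestation.
fn describe(data: &dyn SignedData) -> String {
    if let Some(a) = data.as_any().downcast_ref::<Attestation>() {
        format!("attestation@{}", a.slot)
    } else if let Some(v) = data.as_any().downcast_ref::<VersionedAttestation>() {
        format!("versioned-attestation@{} v{}", v.slot, v.version)
    } else {
        "unknown".to_string()
    }
}

fn main() {
    let items: Vec<Box<dyn SignedData>> = vec![
        Box::new(Attestation { slot: 7 }),
        Box::new(VersionedAttestation { slot: 8, version: 2 }),
    ];
    assert_eq!(describe(items[0].as_ref()), "attestation@7");
    assert_eq!(describe(items[1].as_ref()), "versioned-attestation@8 v2");
    println!("ok");
}
```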