feat(l1-sender): Add ability to resubmit transactions #1138
Artemka374 wants to merge 37 commits into main
Each SendToL1 command now spawns an independent submit_and_confirm task that owns the full submit → poll → resubmit lifecycle via a oneshot channel. The main loop awaits completion in submission order, keeping nonce sequencing correct while allowing the poll/resubmit loop to run concurrently with fee-cap estimation for the next transaction.

Key changes:
- New `types.rs`: GasParams (EIP-1559, replacement bump logic) + Backoff
- New `error.rs`: is_transient / is_nonce_too_low predicates
- `config.rs`: transaction_timeout field (default 300 s)
- `lib.rs`: per-tx task design; manual nonce tracking; blocking estimate_gas_within_caps; resubmission_action pure decision fn
- Derive Clone on BatchEnvelope, CommitCommand, ProofCommand, ExecuteCommand
- Derive Clone on LatencyDistributionTracker (fresh instance on clone)
- Integration test: l1_sender_resubmits_after_timeout

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
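The ordered-completion idea in this commit message can be sketched with the standard library alone. This is only an illustration, not the PR's code: OS threads and `std::sync::mpsc` channels stand in for the async tasks and oneshot channels, and `Command`, its fields, the signature of `submit_and_confirm`, and `run_in_submission_order` are all hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a SendToL1 command.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Command {
    pub nonce: u64,
    /// Simulated time the transaction needs to confirm.
    pub confirm_after: Duration,
}

/// Stand-in for the submit -> poll -> resubmit lifecycle of one transaction.
fn submit_and_confirm(cmd: Command) -> u64 {
    thread::sleep(cmd.confirm_after);
    cmd.nonce
}

/// Spawns one worker per command, then collects results in submission order,
/// regardless of which worker finishes first.
pub fn run_in_submission_order(commands: Vec<Command>) -> Vec<u64> {
    let mut pending: VecDeque<Receiver<u64>> = VecDeque::new();
    for cmd in commands {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            // A send error only means the receiver was dropped; ignore it.
            let _ = tx.send(submit_and_confirm(cmd));
        });
        pending.push_back(rx);
    }

    let mut confirmed = Vec::new();
    while let Some(rx) = pending.pop_front() {
        // Blocks on the oldest in-flight command; later ones keep running.
        confirmed.push(rx.recv().expect("worker dropped its sender"));
    }
    confirmed
}
```

The sketch covers only the "await in submission order" part; nonce assignment and fee-cap estimation are left out.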
Anvil 1.5.1 reports a higher base priority fee (~2 gwei) than the default 1 gwei cap, causing estimate_gas_within_caps to spin forever. Override max_priority_fee_per_gas (10 gwei) and max_fee_per_gas (500 gwei) in the test setup so the cap never blocks submission. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
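Why a fee estimate above the cap never clears can be seen in a small polling sketch. It assumes a `Backoff` that doubles its delay up to a maximum (the PR names the type but its fields and methods are not shown here), and `poll_until_within_cap` is a hypothetical helper that, unlike the blocking loop described above, gives up after `max_polls` so the example terminates.

```rust
use std::time::Duration;

/// Doubling backoff with an upper bound (assumed shape, not the PR's type).
pub struct Backoff {
    next: Duration,
    max: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, max: Duration) -> Self {
        Backoff { next: initial.min(max), max }
    }

    /// Returns the delay to wait now and doubles the following one, up to `max`.
    pub fn next_delay(&mut self) -> Duration {
        let delay = self.next;
        self.next = self.next.saturating_mul(2).min(self.max);
        delay
    }
}

/// Polls `estimate` until it returns a fee at or below `cap`.
/// Returns `None` after `max_polls` estimates that all exceeded the cap.
pub fn poll_until_within_cap(
    mut estimate: impl FnMut() -> u128,
    cap: u128,
    backoff: &mut Backoff,
    max_polls: usize,
    mut sleep: impl FnMut(Duration),
) -> Option<u128> {
    for _ in 0..max_polls {
        let fee = estimate();
        if fee <= cap {
            return Some(fee);
        }
        // Fee is above the cap: wait and try again.
        sleep(backoff.next_delay());
    }
    None
}
```

With a priority fee stuck at 2 gwei and a 1 gwei cap, every poll fails; raising the cap to 10 gwei lets the first poll succeed.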
…ka374-claude/zksync-os-server into afo/l1-sender-enhancements
Test results: 278 tests, 258 ✅, 2h 27m 22s ⏱️. For more details on these failures, see this check. Results for commit 17fb2e5. ♻️ This comment has been updated with latest results.
Artemka374-claude left a comment
@Artemka374 requested me to review this PR.
I found a few issues in the new L1 sender resubmission path:
- Blob commit transactions are re-sent without increasing `maxFeePerBlobGas`, so timeout-driven replacements can still be rejected by txpools that enforce blob-fee bumps.
- Submission is now fully concurrent, which lets higher nonces get broadcast even if a lower nonce fails to submit; that is a behavioral regression from the old sequential submit path and interacts badly with the existing lack of in-flight recovery.
- The new integration test does not prove that a replacement happened; the original pending transaction would also finalize once auto-mining is turned back on.
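To illustrate the blob-fee point, a replacement for a blob-carrying transaction could bump the blob fee together with the EIP-1559 fields. This is a minimal sketch under assumptions: `BlobFees`, its field names, and `replacement` are hypothetical, and the bump percentage is left as a parameter because the required minimum depends on the txpool.

```rust
/// Fee fields of a blob-carrying (EIP-4844) transaction, in wei.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlobFees {
    pub max_fee_per_gas: u128,
    pub max_priority_fee_per_gas: u128,
    pub max_fee_per_blob_gas: u128,
}

/// Returns `value` raised by `percent`, rounded up so the result never
/// falls short of the requested bump.
fn bump(value: u128, percent: u128) -> u128 {
    let scaled = value.saturating_mul(100u128.saturating_add(percent));
    scaled / 100 + u128::from(scaled % 100 != 0)
}

impl BlobFees {
    /// Fees for a replacement transaction: all three fields are bumped,
    /// including the blob fee.
    pub fn replacement(&self, percent: u128) -> BlobFees {
        BlobFees {
            max_fee_per_gas: bump(self.max_fee_per_gas, percent),
            max_priority_fee_per_gas: bump(self.max_priority_fee_per_gas, percent),
            max_fee_per_blob_gas: bump(self.max_fee_per_blob_gas, percent),
        }
    }
}
```

Rounding up matters for small values: a blob fee of 1 wei bumped by 10% becomes 2 wei rather than staying at 1.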
Artemka374-claude left a comment
@Artemka374 requested me to review this PR.
I found one regression in the updated sender loop: the first submission now bypasses the configured L1 fee caps and can broadcast a transaction above the operator's stated max-fee settings. I left an inline comment on the relevant lines.
The earlier concurrency concern looks addressed, and the updated integration test is much stronger about proving that a replacement happened.
Artemka374-claude left a comment
@Artemka374 requested me to review this PR.
I still see two regressions in the new resubmission path:
- Blob commit transactions are retried with the same `maxFeePerBlobGas`, so replacement commits can still be rejected as underpriced on txpools that enforce blob-fee bumps.
- When the 10% replacement bump would exceed the configured EIP-1559 caps, the sender now re-watches the original transaction forever instead of surfacing a failure. A capped transaction that is already unmineable can therefore stall the pipeline indefinitely.
The earlier sequential-submit concern looks addressed, and the updated integration test is much stronger about proving that a replacement happened. I wasn't able to resolve the older threads because this bot account lacks `ResolveReviewThread` permission.
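One way to avoid the indefinite stall is to bound how often a cap-blocked timeout may fall back to re-watching. The sketch below is written in the style of a pure decision function; it is not the PR's `resubmission_action`, and `Fees`, `Action`, `decide_resubmission`, `prior_blocked`, and `max_blocked` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fees {
    pub max_fee_per_gas: u128,
    pub max_priority_fee_per_gas: u128,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Rebroadcast with these fees.
    Resubmit(Fees),
    /// Caps block the bump; keep watching the original transaction.
    RewatchOriginal,
    /// Caps have blocked the bump too many times; surface an error.
    Fail,
}

const REPLACEMENT_BUMP_PERCENT: u128 = 10;

/// Minimum fee a replacement must carry: +10%, rounded up.
fn bump(value: u128) -> u128 {
    let scaled = value.saturating_mul(100 + REPLACEMENT_BUMP_PERCENT);
    scaled / 100 + u128::from(scaled % 100 != 0)
}

/// Decides what to do after a confirmation timeout.
/// `prior_blocked` counts earlier timeouts that were already blocked by caps.
pub fn decide_resubmission(
    current: Fees,
    market: Fees,
    caps: Fees,
    prior_blocked: u32,
    max_blocked: u32,
) -> Action {
    let min_fee = bump(current.max_fee_per_gas);
    let min_tip = bump(current.max_priority_fee_per_gas);

    if min_fee > caps.max_fee_per_gas || min_tip > caps.max_priority_fee_per_gas {
        return if prior_blocked >= max_blocked {
            Action::Fail
        } else {
            Action::RewatchOriginal
        };
    }

    // Use the market estimate when it is higher than the minimum bump,
    // but never exceed the operator's caps.
    Action::Resubmit(Fees {
        max_fee_per_gas: market.max_fee_per_gas.max(min_fee).min(caps.max_fee_per_gas),
        max_priority_fee_per_gas: market
            .max_priority_fee_per_gas
            .max(min_tip)
            .min(caps.max_priority_fee_per_gas),
    })
}
```

Keeping the decision free of I/O makes each branch easy to unit-test; the caller would track `prior_blocked` and reset it after a successful resubmission.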
Resolved conflict in lib/l1_sender/src/lib.rs: kept HEAD's generic type bounds and build_and_send refactor, incorporated main's commit_submitted_tx parameter and CommitWatcher notification logic. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Core Problem Fixed
Previously, a single missed/timed-out L1 transaction would crash the entire pipeline and require a manual restart.
Improvements
- Automatic transaction resubmission: Each `SendToL1` command now runs a dedicated submit-and-watch loop. On timeout, it re-estimates fees, applies a ≥10% EIP-1559 replacement bump, and rebroadcasts, repeating until a receipt is confirmed.
- Resilience to network conditions: The pipeline stays alive through network congestion and gas price spikes instead of crashing on the first confirmation timeout.
- Fee capping on every attempt: All fee fields (`max_fee_per_gas`, `max_priority_fee_per_gas`) are capped at the operator's configured maximums on each resubmission attempt. The blob fee cap is applied only for commit transactions that carry a sidecar.
- Cap-exceeded handling: If bumped fees would exceed configured caps, the sender re-watches the original transaction (instead of crashing) and increments a `tx_confirmation_timeouts` counter to alert operators that cap limits may need adjustment.
- Blob fee estimation: Blob fees are now explicitly estimated (with a fallback on estimation error) and reported via metrics.
- Required confirmations: Added support for configurable required confirmations before a transaction is considered finalized.
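The required-confirmations item can be read as a simple depth check on the receipt's block. The helper below is a hypothetical sketch; it assumes the inclusion block itself counts as the first confirmation, which may differ from the convention the PR uses.

```rust
/// Returns true once the receipt's block has at least `required`
/// confirmations, counting the inclusion block as the first one
/// (an assumed convention).
pub fn has_required_confirmations(receipt_block: u64, latest_block: u64, required: u64) -> bool {
    match latest_block.checked_sub(receipt_block) {
        Some(depth) => depth.saturating_add(1) >= required,
        // The receipt's block is ahead of our view of the chain,
        // for example because the queried node is lagging.
        None => false,
    }
}
```

For a receipt in block 100 and `required = 2`, the check passes once the chain head reaches block 101.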
New Observability (Grafana metrics)
The `l1_sender_configured_*` metrics are purely for observability of whether we are exceeding the caps.
- `l1_sender_configured_max_fee_per_gas`
- `l1_sender_configured_max_priority_fee_per_gas`
- `l1_sender_configured_max_fee_per_blob_gas`
- `l1_sender_tx_confirmation_timeouts_total`