enable optional header pruning #12293
Open
Hugo-Trentesaux wants to merge 1 commit into
In issue #7962, we suggested adding an option to prune headers. I do not know what the policy is on AI-assisted contributions (it is not mentioned in contributing.md), but since there is a claude.md file, I think I can give it a try. So here is an AI-assisted merge request; below is the PR template as completed by Claude:
Description
This PR adds optional header pruning to the database backend (sc-client-db), exposed through a new
--enable-header-pruning node CLI flag.

When a node runs with --blocks-pruning <N>, block bodies and most justifications of finalized
blocks outside the pruning window are removed, but their headers are kept forever. On an
otherwise fully-pruned validator/authoring node this is the only source of unbounded, monotonic
database growth: it is linear in the number of blocks regardless of activity (e.g. ~2 GB/year at a
6 s block time), which makes the disk footprint impossible to bound and eventually fills the disk of
unattended nodes.
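
As a rough sanity check of the growth figure quoted above, the following sketch works out how many blocks a chain produces per year at a 6 s block time and what average per-block footprint a ~2 GB/year figure would imply. The function names are illustrative only, and the calculation assumes no missed slots, a 365-day year, and 2 GB taken as 2,000,000,000 bytes; it does not measure actual header sizes.

```rust
/// Seconds in a 365-day year (assumption: leap years ignored).
const SECONDS_PER_YEAR: u64 = 365 * 24 * 60 * 60;

/// Blocks produced per year at a fixed block time, assuming no missed slots.
fn blocks_per_year(block_time_secs: u64) -> u64 {
    SECONDS_PER_YEAR / block_time_secs
}

/// Average per-block footprint implied by a yearly growth figure (integer division).
fn implied_bytes_per_block(growth_bytes_per_year: u64, block_time_secs: u64) -> u64 {
    growth_bytes_per_year / blocks_per_year(block_time_secs)
}

fn main() {
    let blocks = blocks_per_year(6);
    let per_block = implied_bytes_per_block(2_000_000_000, 6);
    // prints "5256000 blocks/year, ~380 bytes/block"
    println!("{blocks} blocks/year, ~{per_block} bytes/block");
}
```

Because the block count depends only on elapsed time, this growth accrues even on an idle chain, which is why it cannot be bounded by pruning bodies and state alone.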
This is the header part of the growth discussed in
#7962 ("Disk space usage is difficult to predict"), as suggested
in this comment.
Related to #7962.
When enabled, the headers (and their key-lookup entries) of pruned finalized blocks are removed as
well, so a pruned node's database size stays bounded for a constant-size state. Headers required for
warp sync are preserved automatically (see Review Notes).
Note
A node started with
--enable-header-pruning can no longer answer queries for pruned historical
headers and therefore must not be used as an archive or full-sync source. It is intended for
unattended validator/authoring nodes with limited disk space. The flag only has an effect together
with a numeric --blocks-pruning N (it is a no-op in archive modes).

Integration
The change is backwards compatible: the new behavior is opt-in and defaults to
false everywhere, preserving the current behavior (headers kept).
Downstream code that constructs the affected structs directly must add the new field:
- sc_client_db::DatabaseSettings gains a header_pruning: bool field.
- sc_service::Configuration gains a header_pruning: bool field.
- sc_cli::CliConfiguration gains a header_pruning() method (with a default impl that reads it
  from PruningParams, so most CLIs need no change).
- sc_cli::PruningParams gains an enable_header_pruning: bool field (--enable-header-pruning).

     let settings = DatabaseSettings {
         trie_cache_maximum_size: ...,
         state_pruning: ...,
         source: ...,
         blocks_pruning: ...,
         pruning_filters: ...,
    +    header_pruning: false, // keep previous behavior (headers are not pruned)
         metrics_registry: ...,
     };

Test helper change: the lowest-level helper Backend::new_test_with_tx_storage_source gained a
trailing header_pruning: bool argument, and a new convenience helper
Backend::new_test_with_tx_storage_filters_and_header_pruning was added.

Review Notes
Core change in substrate/client/db/src/lib.rs:

A new Backend::prune_header() is called at the end of prune_block() when header_pruning is
enabled (after all body/justification lookups, which still need to resolve the block's lookup
key). It removes:
- the columns::HEADER entry,
- the hash -> lookup_key mapping,
- the number -> lookup_key mapping, but only if it still points to this block, so that
  pruning a stale fork at some height does not drop the canonical block's number mapping,
- the block's entries in header_cache and the header_metadata cache.

Warp sync is preserved for free. Blocks retained by a PruningFilter (introduced in "Do not prune
blocks with Grandpa justifications", #10893; e.g. GrandpaPruningFilter, which keeps blocks
carrying GRANDPA justifications at authority-set changes) short-circuit prune_blocks before
prune_block is reached, so their headers are never removed and warp-sync proof construction
keeps working.
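
To make the two points above concrete (the guarded removal of the number mapping, and the filter short-circuit that protects warp-sync headers), here is a minimal in-memory sketch. It is not the sc-client-db implementation: ToyDb, its fields, and the integer hash type are hypothetical stand-ins, and the real code works on database columns and transactions rather than hash maps.

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // hypothetical stand-in for a block hash
type Number = u32;
type LookupKey = (Number, Hash);

/// Toy model of the header-related storage (all names are illustrative).
#[derive(Default)]
struct ToyDb {
    headers: HashMap<LookupKey, String>, // stands in for columns::HEADER
    hash_to_key: HashMap<Hash, LookupKey>,
    number_to_key: HashMap<Number, LookupKey>, // canonical block per height
    header_cache: HashSet<Hash>,
    retained: HashSet<Hash>, // blocks a pruning filter wants to keep
}

impl ToyDb {
    fn insert(&mut self, number: Number, hash: Hash, canonical: bool) {
        let key = (number, hash);
        self.headers.insert(key, format!("header-{number}-{hash}"));
        self.hash_to_key.insert(hash, key);
        if canonical {
            self.number_to_key.insert(number, key);
        }
        self.header_cache.insert(hash);
    }

    /// Remove the header, its hash mapping, its cache entry, and the number
    /// mapping only when that mapping still points at this block.
    fn prune_header(&mut self, hash: Hash) {
        let Some(key) = self.hash_to_key.remove(&hash) else { return };
        self.headers.remove(&key);
        if self.number_to_key.get(&key.0) == Some(&key) {
            self.number_to_key.remove(&key.0);
        }
        self.header_cache.remove(&hash);
    }

    fn prune_block(&mut self, hash: Hash) {
        // Body and justification removal would happen here, before the header goes.
        self.prune_header(hash);
    }

    /// Filter-retained blocks are skipped before prune_block is ever reached.
    fn prune_blocks(&mut self, hashes: &[Hash]) {
        for &hash in hashes {
            if self.retained.contains(&hash) {
                continue;
            }
            self.prune_block(hash);
        }
    }
}

fn main() {
    let mut db = ToyDb::default();
    db.insert(5, 0xA, true); // canonical block at height 5
    db.insert(5, 0xB, false); // stale fork at height 5
    db.insert(6, 0xC, true);
    db.retained.insert(0xC); // e.g. carries a justification the filter keeps

    db.prune_blocks(&[0xB]);
    assert_eq!(db.number_to_key.get(&5), Some(&(5, 0xA))); // canonical mapping survives

    db.prune_blocks(&[0xA, 0xC]);
    assert!(!db.number_to_key.contains_key(&5));
    assert!(db.headers.contains_key(&(6, 0xC))); // retained header untouched
}
```

The comparison against the stored lookup key is what keeps a fork prune from orphaning the canonical block at the same height.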
Pinned blocks are skipped: prune_header keeps the header of a pinned block. In practice the
pruning window is far below the tip so pinned blocks are never reached, but this guards against it.

The flag is only effective with BlocksPruning::Some(n); in archive modes prune_block is never
called, so the option is inert.
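
The two conditions above can be summarised as a small decision function. This is only a sketch: the enum below is a local copy mirroring the variant names of sc_client_db::BlocksPruning, and both helper functions are hypothetical. In the PR itself the archive-mode case is inert simply because prune_block is never called, not because of an explicit check like this one.

```rust
/// Local stand-in mirroring the variants of sc_client_db::BlocksPruning.
#[derive(Clone, Copy, Debug, PartialEq)]
enum BlocksPruning {
    KeepAll,
    KeepFinalized,
    Some(u32),
}

/// Header pruning can only take effect when a numeric pruning window is set.
fn header_pruning_effective(blocks_pruning: BlocksPruning, header_pruning: bool) -> bool {
    header_pruning && matches!(blocks_pruning, BlocksPruning::Some(_))
}

/// Even when effective, a pinned block keeps its header.
fn should_prune_header(effective: bool, is_pinned: bool) -> bool {
    effective && !is_pinned
}

fn main() {
    // Archive modes: the flag is inert.
    assert!(!header_pruning_effective(BlocksPruning::KeepAll, true));
    assert!(!header_pruning_effective(BlocksPruning::KeepFinalized, true));

    // Numeric window plus flag: headers of unpinned pruned blocks are removed.
    let effective = header_pruning_effective(BlocksPruning::Some(256), true);
    assert!(should_prune_header(effective, false));
    assert!(!should_prune_header(effective, true));
}
```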
Plumbing: --enable-header-pruning → PruningParams::header_pruning() →
CliConfiguration::header_pruning() → Configuration.header_pruning → Configuration::db_config() →
DatabaseSettings.header_pruning → Backend.header_pruning.

Tests added in sc-client-db:
- header_pruning_on_finalize: verifies pruned blocks lose their header and number/hash lookups,
  while a filter-retained block and in-window blocks keep theirs.
- header_pruning_disabled_keeps_headers: verifies the default behavior is unchanged (bodies pruned,
  headers kept).
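
For reviewers who want to see the plumbing chain from the Review Notes end to end, the sketch below models it with heavily simplified stand-in types. Only the names taken from the description (PruningParams, CliConfiguration, Configuration, DatabaseSettings, Backend, header_pruning, enable_header_pruning, db_config) come from the PR; the struct layouts, RunCmd, BareCmd, build_configuration, and Backend::new are assumptions made for illustration and omit every unrelated field.

```rust
/// Simplified stand-in: the real PruningParams has more fields.
struct PruningParams {
    enable_header_pruning: bool, // set by --enable-header-pruning
}

impl PruningParams {
    fn header_pruning(&self) -> bool {
        self.enable_header_pruning
    }
}

/// Simplified stand-in for the CLI trait, showing the default impl idea:
/// CLIs that expose PruningParams get the flag without extra code.
trait CliConfiguration {
    fn pruning_params(&self) -> Option<&PruningParams>;

    fn header_pruning(&self) -> bool {
        self.pruning_params().map(|p| p.header_pruning()).unwrap_or(false)
    }
}

struct Configuration {
    header_pruning: bool,
}

struct DatabaseSettings {
    header_pruning: bool,
}

impl Configuration {
    fn db_config(&self) -> DatabaseSettings {
        DatabaseSettings { header_pruning: self.header_pruning }
    }
}

struct Backend {
    header_pruning: bool,
}

impl Backend {
    // Hypothetical constructor; the real backend takes many more inputs.
    fn new(settings: DatabaseSettings) -> Self {
        Backend { header_pruning: settings.header_pruning }
    }
}

/// Hypothetical command that carries pruning parameters.
struct RunCmd {
    pruning: PruningParams,
}

impl CliConfiguration for RunCmd {
    fn pruning_params(&self) -> Option<&PruningParams> {
        Some(&self.pruning)
    }
}

/// Hypothetical command without pruning parameters: falls back to `false`.
struct BareCmd;

impl CliConfiguration for BareCmd {
    fn pruning_params(&self) -> Option<&PruningParams> {
        None
    }
}

/// Hypothetical helper tying the CLI layer to the service configuration.
fn build_configuration(cli: &impl CliConfiguration) -> Configuration {
    Configuration { header_pruning: cli.header_pruning() }
}

fn main() {
    let cmd = RunCmd { pruning: PruningParams { enable_header_pruning: true } };
    let backend = Backend::new(build_configuration(&cmd).db_config());
    assert!(backend.header_pruning);

    let backend = Backend::new(build_configuration(&BareCmd).db_config());
    assert!(!backend.header_pruning); // opt-in: default stays off
}
```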
TODO before merge:
- prdoc entry (/cmd prdoc).
- /cmd fmt.

Checklist
T required) /cmd label T0-node I5-enhancement D2-substantial
R0-no-crate-publish-required: sc-client-db/sc-cli/sc-service are published crates.)