
enable optional header pruning#12293

Open: Hugo-Trentesaux wants to merge 1 commit into paritytech:master from duniter:feat-header-pruning

@Hugo-Trentesaux

Contributor

Issue #7962 suggested adding an option to prune headers. I do not know the policy on AI-assisted contributions (contributing.md does not mention it), but since the repository has a claude.md file, I thought I could give it a try. So here is an AI-assisted pull request; below is the PR template as completed by Claude:


Description

This PR adds optional header pruning to the database backend (sc-client-db), exposed
through a new --enable-header-pruning node CLI flag.

When a node runs with --blocks-pruning <N>, block bodies and most justifications of finalized
blocks outside the pruning window are removed, but their headers are kept forever. On an
otherwise fully-pruned validator/authoring node this is the only source of unbounded, monotonic
database growth: it is linear in the number of blocks regardless of activity (e.g. ~2 GB/year at a
6 s block time), which makes the disk footprint impossible to bound and eventually fills the disk of
unattended nodes.
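
As a rough sanity check on that figure, the arithmetic can be sketched as follows. The per-block byte count is only what the ~2 GB/year estimate implies (taking 1 GB as 10^9 bytes and a 365-day year); it is not a measured header size, and the function names are illustrative.

```rust
/// Seconds in a non-leap (365-day) year.
const SECONDS_PER_YEAR: u64 = 365 * 24 * 60 * 60;

/// Number of blocks produced per year at a fixed block time.
fn blocks_per_year(block_time_secs: u64) -> u64 {
    SECONDS_PER_YEAR / block_time_secs
}

/// Per-block storage cost implied by a yearly growth figure
/// (integer division, so the result is rounded down).
fn implied_bytes_per_block(growth_bytes_per_year: u64, block_time_secs: u64) -> u64 {
    growth_bytes_per_year / blocks_per_year(block_time_secs)
}
```

With a 6 s block time, `blocks_per_year(6)` gives 5,256,000 blocks, and `implied_bytes_per_block(2_000_000_000, 6)` gives about 380 bytes per block for the header and its lookup entries combined.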

This is the header part of the growth discussed in
#7962 ("Disk space usage is difficult to predict"), as suggested
in this comment.

Related to #7962.

When enabled, the headers (and their key-lookup entries) of pruned finalized blocks are removed as
well, so a pruned node's database size stays bounded for a constant-size state. Headers required for
warp sync are preserved automatically (see Review Notes).

Note

A node started with --enable-header-pruning can no longer answer queries for pruned historical
headers and therefore must not be used as an archive or full-sync source. It is intended for
unattended validator/authoring nodes with limited disk space. The flag only has an effect together
with a numeric --blocks-pruning N (it is a no-op in archive modes).
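
The interaction between the two flags can be sketched as a small predicate. The enum below is a local stand-in that mirrors what I understand to be the shape of `sc_client_db::BlocksPruning`; the function name is hypothetical and does not exist in the PR.

```rust
/// Local stand-in mirroring the assumed shape of `sc_client_db::BlocksPruning`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlocksPruning {
    /// Archive mode: keep every block.
    KeepAll,
    /// Archive-canonical mode: keep all finalized blocks.
    KeepFinalized,
    /// Keep only the last N finalized blocks.
    Some(u32),
}

/// Hypothetical helper: header pruning only does anything when the flag is
/// set AND block pruning uses a numeric window.
fn header_pruning_is_effective(blocks_pruning: BlocksPruning, enable_header_pruning: bool) -> bool {
    enable_header_pruning && matches!(blocks_pruning, BlocksPruning::Some(_))
}
```

For example, `header_pruning_is_effective(BlocksPruning::Some(256), true)` is `true`, while the same flag combined with `BlocksPruning::KeepAll` or `BlocksPruning::KeepFinalized` yields `false`.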

Integration

The change is backwards compatible: the new behavior is opt-in and defaults to false
everywhere, preserving the current behavior (headers kept).

Downstream code that constructs the affected structs directly must add the new field:

  • sc_client_db::DatabaseSettings gains a header_pruning: bool field.
  • sc_service::Configuration gains a header_pruning: bool field.
  • sc_cli::CliConfiguration gains a header_pruning() method (with a default impl that reads it
    from PruningParams, so most CLIs need no change).
  • sc_cli::PruningParams gains an enable_header_pruning: bool field (--enable-header-pruning).

 let settings = DatabaseSettings {
     trie_cache_maximum_size: ...,
     state_pruning: ...,
     source: ...,
     blocks_pruning: ...,
     pruning_filters: ...,
+    header_pruning: false, // keep previous behavior (headers are not pruned)
     metrics_registry: ...,
 };

Test helper change: the lowest-level helper Backend::new_test_with_tx_storage_source gained a
trailing header_pruning: bool argument, and a new convenience helper
Backend::new_test_with_tx_storage_filters_and_header_pruning was added.

Review Notes

Core change in substrate/client/db/src/lib.rs:

  • A new Backend::prune_header() is called at the end of prune_block() when
    header_pruning is enabled (after all body/justification lookups, which still need to resolve the
    block's lookup key). It:

    • removes the columns::HEADER entry,
    • removes the hash -> lookup_key mapping,
    • removes the number -> lookup_key mapping only if it still points to this block, so that
      pruning a stale fork at some height does not drop the canonical block's number mapping,
    • invalidates the in-memory header_cache and header_metadata cache.

  • Warp sync is preserved for free. Blocks retained by a PruningFilter (introduced in #10893,
    e.g. GrandpaPruningFilter, which keeps blocks carrying GRANDPA justifications at authority-set
    changes) short-circuit prune_blocks before prune_block is reached, so their headers are
    never removed and warp-sync proof construction keeps working.

  • Pinned blocks are skipped: prune_header keeps the header of a pinned block. In practice the
    pruning window is far below the tip so pinned blocks are never reached, but this guards against it.

  • The flag is only effective with BlocksPruning::Some(n); in archive modes prune_block is never
    called, so the option is inert.
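
The removal rules described above can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not the code in lib.rs: plain hash maps stand in for the database columns, block numbers stand in for lookup keys, cache invalidation is omitted, and all type and method names are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // stand-in for a block hash
type Number = u32; // block number, used here in place of a lookup key

/// Toy model of the header-related stores touched by header pruning.
#[derive(Default)]
struct HeaderStore {
    headers: HashMap<Hash, Vec<u8>>,       // models columns::HEADER
    hash_to_number: HashMap<Hash, Number>, // models hash -> lookup_key
    number_to_hash: HashMap<Number, Hash>, // models number -> lookup_key (canonical only)
    pinned: HashSet<Hash>,                 // blocks currently pinned
}

impl HeaderStore {
    /// Insert a block; only canonical blocks own the number mapping.
    fn insert(&mut self, number: Number, hash: Hash, canonical: bool) {
        self.headers.insert(hash, Vec::new());
        self.hash_to_number.insert(hash, number);
        if canonical {
            self.number_to_hash.insert(number, hash);
        }
    }

    /// Remove a block's header data. Returns `true` if the header was removed.
    fn prune_header(&mut self, number: Number, hash: Hash) -> bool {
        // Pinned blocks keep their header.
        if self.pinned.contains(&hash) {
            return false;
        }
        self.headers.remove(&hash);
        self.hash_to_number.remove(&hash);
        // Drop the number mapping only if it still points at this block, so
        // pruning a stale fork leaves the canonical block's mapping intact.
        if self.number_to_hash.get(&number) == Some(&hash) {
            self.number_to_hash.remove(&number);
        }
        true
    }
}
```

In this model, pruning a non-canonical block at height 5 removes its header and hash mapping but leaves `number_to_hash[5]` pointing at the canonical block. That matches the fork behavior described in the bullet list.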

Plumbing: --enable-header-pruning → PruningParams::header_pruning() →
CliConfiguration::header_pruning() → Configuration.header_pruning → Configuration::db_config() →
DatabaseSettings.header_pruning → Backend.header_pruning.
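
A minimal sketch of this plumbing, using simplified local stand-ins for the real types. The actual signatures in sc-cli and sc-service may differ (for instance, they may return a `Result`), and `RunCmd`, `NoPruningCmd`, and `create_configuration` as written here are purely illustrative.

```rust
/// Simplified stand-in for `sc_cli::PruningParams`.
#[derive(Default)]
struct PruningParams {
    enable_header_pruning: bool, // set by --enable-header-pruning
}

impl PruningParams {
    fn header_pruning(&self) -> bool {
        self.enable_header_pruning
    }
}

/// Simplified stand-in for `sc_cli::CliConfiguration`.
trait CliConfiguration {
    fn pruning_params(&self) -> Option<&PruningParams>;

    /// Default impl: read the flag from `PruningParams`, or `false` if absent,
    /// so most CLIs need no change.
    fn header_pruning(&self) -> bool {
        self.pruning_params().map(|p| p.header_pruning()).unwrap_or(false)
    }
}

/// Simplified stand-ins for the service configuration and database settings.
struct Configuration {
    header_pruning: bool,
}

struct DatabaseSettings {
    header_pruning: bool,
}

impl Configuration {
    fn db_config(&self) -> DatabaseSettings {
        DatabaseSettings { header_pruning: self.header_pruning }
    }
}

/// Hypothetical CLI command that exposes pruning parameters.
struct RunCmd {
    pruning: PruningParams,
}

impl CliConfiguration for RunCmd {
    fn pruning_params(&self) -> Option<&PruningParams> {
        Some(&self.pruning)
    }
}

/// Hypothetical CLI command without pruning parameters.
struct NoPruningCmd;

impl CliConfiguration for NoPruningCmd {
    fn pruning_params(&self) -> Option<&PruningParams> {
        None
    }
}

/// Illustrative glue: copy the CLI value into the service configuration.
fn create_configuration<C: CliConfiguration>(cli: &C) -> Configuration {
    Configuration { header_pruning: cli.header_pruning() }
}
```

A command that does not expose `PruningParams` falls back to `false` through the default method, which is how the opt-in default is preserved for existing CLIs.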

Tests added in sc-client-db:

  • header_pruning_on_finalize: verifies pruned blocks lose their header and number/hash lookups,
    while a filter-retained block and in-window blocks keep theirs.
  • header_pruning_disabled_keeps_headers: verifies the default behavior is unchanged (bodies pruned,
    headers kept).

TODO before merge:

  • Add a prdoc entry (/cmd prdoc).
  • Run /cmd fmt.

Checklist

  • My PR includes a detailed description as outlined in the "Description" and its two subsections above.
  • My PR follows the labeling requirements of this project (at minimum one label for T required)
    • Suggested: /cmd label T0-node I5-enhancement D2-substantial
    • (Do not add R0-no-crate-publish-required: sc-client-db/sc-cli/sc-service are published crates.)
  • I have made corresponding changes to the documentation (if applicable) — CLI flag is self-documented; add prdoc.
  • I have added tests that prove my fix is effective or that my feature works (if applicable)
