Speed up secondary DB open by avoiding point-in-time MANIFEST recovery #14815

Open
dungeon-master-666 wants to merge 1 commit into facebook:main from dungeon-master-666:fast-secondary-open

@dungeon-master-666

Summary

Speed up DB::OpenAsSecondary() by avoiding point-in-time MANIFEST recovery on the initial secondary open when best_efforts_recovery is disabled.

The initial secondary recovery now replays the MANIFEST with a regular VersionEditHandler to build the final Version directly, verifies that all live SST files referenced by the recovered Version are present and have the expected size, and then initializes ManifestTailer from that recovered state for subsequent catch-up reads.
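The live-file verification step described above can be pictured with a small standalone sketch. It is not code from this PR: `LiveFile`, `VerifyStatus`, `VerifyLiveFiles`, and `ExampleVerify` are hypothetical names, the check uses `std::filesystem` rather than RocksDB's file system abstraction, and mapping a size mismatch to a retryable result is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <vector>

// Hypothetical stand-ins for illustration; these are not RocksDB types.
enum class VerifyStatus { kOk, kTryAgain };

struct LiveFile {
  std::filesystem::path path;    // location of the SST file on disk
  std::uintmax_t expected_size;  // size recorded for the file in the MANIFEST
};

// Returns kTryAgain if any live file is missing or has an unexpected size,
// on the assumption that the primary may still be writing or deleting files.
VerifyStatus VerifyLiveFiles(const std::vector<LiveFile>& files) {
  for (const LiveFile& f : files) {
    std::error_code ec;
    const std::uintmax_t actual = std::filesystem::file_size(f.path, ec);
    if (ec || actual != f.expected_size) {
      return VerifyStatus::kTryAgain;
    }
  }
  return VerifyStatus::kOk;
}

// Example use: write a 5-byte file, then check it against several expectations.
bool ExampleVerify() {
  namespace fs = std::filesystem;
  const fs::path sst = fs::temp_directory_path() / "sketch_000001.sst";
  {
    std::ofstream out(sst, std::ios::binary);
    out << "12345";
  }
  const bool ok = VerifyLiveFiles({{sst, 5}}) == VerifyStatus::kOk;
  const bool wrong_size = VerifyLiveFiles({{sst, 6}}) == VerifyStatus::kTryAgain;
  fs::remove(sst);
  const bool missing = VerifyLiveFiles({{sst, 5}}) == VerifyStatus::kTryAgain;
  return ok && wrong_size && missing;
}
```

The point of the sketch is that the check runs once against the final set of live files, instead of tracking found and missing files for every intermediate state while the MANIFEST is replayed.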

Motivation

Since #12928, secondary recovery goes through ManifestTailer, whose point-in-time recovery path tracks found/missing/intermediate files while replaying the MANIFEST. On large or long-lived MANIFEST files this can make opening a secondary DB take minutes.

This matches the regression reported in #14117, where opening as secondary is much slower than opening the same DB read-only.

Changes

  • Add a fast secondary MANIFEST recovery handler for initial ReactiveVersionSet::Recover().
  • Preserve existing ManifestTailer behavior for best_efforts_recovery.
  • Convert transient missing/corrupt referenced SST files during secondary open to Status::TryAgain().
  • Verify live SST file existence and size after fast recovery, including cases where table loading was skipped.
  • Initialize ManifestTailer after fast recovery so later ReadAndApply() calls continue tailing from the recovered state.
  • Preserve incomplete trailing atomic-group replay state across the handoff to ManifestTailer.
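To illustrate the last item, here is a minimal standalone model of carrying an incomplete trailing atomic group across a handoff. It is a sketch under assumptions, not the PR's implementation: `Edit`, `ReplayState`, and `Replay` are hypothetical names, and the model assumes that each edit in an atomic group records how many edits of the group are still to come, with the group applied only once that count reaches zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one MANIFEST record.
struct Edit {
  int id;
  bool in_atomic_group;
  std::uint32_t remaining_entries;  // edits of this group still to come
};

// State carried from the initial replay over to the tailing reader.
struct ReplayState {
  std::vector<int> applied;         // ids of edits already applied
  std::vector<Edit> pending_group;  // incomplete trailing atomic group
};

// Applies standalone edits immediately and buffers atomic-group edits until
// the group is complete. Anything left in pending_group is NOT applied.
void Replay(const std::vector<Edit>& edits, ReplayState& state) {
  for (const Edit& e : edits) {
    if (!e.in_atomic_group) {
      state.applied.push_back(e.id);
      continue;
    }
    state.pending_group.push_back(e);
    if (e.remaining_entries == 0) {
      for (const Edit& g : state.pending_group) {
        state.applied.push_back(g.id);
      }
      state.pending_group.clear();
    }
  }
}
```

Calling `Replay` a second time with the same `ReplayState` models the later catch-up read: if the first call ended in the middle of a group, the buffered edits are completed and applied by the second call rather than being lost or applied early.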

Testing

Added ReactiveVersionSetRecoverTest coverage for:

  • missing final SST returning TryAgain;
  • live-file verification when table loading is skipped;
  • preserving best-efforts point-in-time recovery behavior.
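For the first case, one way a caller might react to a retryable open failure is a bounded retry loop. The sketch below is an assumption about caller-side handling, not something this PR adds: `OpenStatus` and `OpenWithRetry` are hypothetical names, and `open_once` stands in for whatever wraps the actual secondary open and maps its result.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result type; a real caller would inspect the returned status.
enum class OpenStatus { kOk, kTryAgain, kError };

// Calls open_once up to max_attempts times, sleeping for backoff between
// attempts, and stops at the first result other than kTryAgain.
// Returns kTryAgain if max_attempts is not positive.
OpenStatus OpenWithRetry(const std::function<OpenStatus()>& open_once,
                         int max_attempts,
                         std::chrono::milliseconds backoff) {
  OpenStatus s = OpenStatus::kTryAgain;
  for (int attempt = 0;
       attempt < max_attempts && s == OpenStatus::kTryAgain; ++attempt) {
    if (attempt > 0) {
      std::this_thread::sleep_for(backoff);
    }
    s = open_once();
  }
  return s;
}
```

A bounded loop keeps a persistently missing file from blocking the caller forever, while still absorbing the short window in which the primary has updated the MANIFEST but the files on disk have not yet caught up.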

@meta-cla meta-cla Bot added the CLA Signed label Jun 3, 2026
@dungeon-master-666

Author

@anand1976 @jowlyzhang Could you take a look, please? Currently, secondary mode is hardly usable on long-running RocksDB instances: it takes minutes to open.
