Skip to content

Split ingest external files into prepare and commit APIs (#14849)#14849

Closed
joshkang97 wants to merge 1 commit into
facebook:mainfrom
joshkang97:export-D108225105
Closed

Split ingest external files into prepare and commit APIs (#14849)#14849
joshkang97 wants to merge 1 commit into
facebook:mainfrom
joshkang97:export-D108225105

Conversation

@joshkang97

@joshkang97 joshkang97 commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary:

This change splits ingestion into two public calls: PrepareFileIngestion() performs all of the off-mutex work and returns an opaque FileIngestionHandle, and CommitFileIngestionHandle() makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of IngestExternalFile.

IngestExternalFiles(args) is also just PrepareFileIngestion(args) followed by CommitFileIngestionHandle(), so its behavior is unchanged.

CommitFileIngestionHandles() also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105

@meta-cla meta-cla Bot added the CLA Signed label Jun 11, 2026
@meta-codesync

meta-codesync Bot commented Jun 11, 2026

Copy link
Copy Markdown

@joshkang97 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108225105.

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 462.1s.

@meta-codesync meta-codesync Bot changed the title Split ingest external files into prepare and commit APIs Split ingest external files into prepare and commit APIs (#14849) Jun 11, 2026
joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 11, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 11, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 11, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 11, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 31acb42


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 31acb42


Summary

Solid two-phase API design that correctly splits the ingestion hot path. The core Prepare/Commit split and handle lifecycle are well-implemented, with good test coverage via parameterized tests. Several issues need attention before merge.

High-severity findings (3):

  • [db_impl_readonly.h / db_impl_secondary.h] ReadOnly and Secondary DB instances inherit PrepareFileIngestion()/CommitFileIngestionHandles() from DBImpl without NotSupported overrides, allowing ingestion on read-only databases.
  • [db_impl.cc:~835] CloseHelper asserts num_outstanding_prepared_ingestions_ == 0 but never waits for it; in release builds a held handle causes use-after-free when its destructor accesses the destroyed DB.
  • [db_impl.cc:CommitFileIngestionHandles] fill_cache consistency is not validated across multiple handles at commit time; hs[0]->fill_cache_ is used unconditionally, silently ignoring differing values from other handles targeting different CFs.
Full review (click to expand)

Findings

🔴 HIGH

H1. Missing NotSupported overrides in ReadOnly / Secondary DB — db_impl_readonly.h, db_impl_secondary.h
  • Issue: DBImplReadOnly overrides IngestExternalFile() to return NotSupported, but does NOT override the new PrepareFileIngestion() or CommitFileIngestionHandles(). Since both inherit from DBImpl, calling these new methods on a read-only or secondary instance will attempt actual ingestion, violating the read-only contract. Same for DBImplSecondary.
  • Root cause: The PR added overrides in StackableDB and ModelDB but missed DBImplReadOnly and DBImplSecondary.
  • Suggested fix: Add NotSupported overrides for PrepareFileIngestion() and CommitFileIngestionHandles() in both DBImplReadOnly and DBImplSecondary, matching the existing IngestExternalFile pattern.
H2. CloseHelper assert without wait — db_impl.cc:~835
  • Issue: CloseHelper() asserts num_outstanding_prepared_ingestions_.load() == 0 but does not wait for it to reach zero. If a user holds a FileIngestionHandle when Close() is called: (1) in debug builds, the assert fires; (2) in release builds, the assert is elided, Close() proceeds to destroy DB internals, and the handle's destructor later accesses freed memory (use-after-free via RollbackPreparedFileIngestion accessing mutex_, immutable_db_options_.info_log, etc.).
  • Root cause: The API contract says "A handle MUST NOT outlive the DB" but this is only documented, not enforced.
  • Suggested fix: Either (a) add a wait loop in CloseHelper similar to the bg work wait, or (b) have CloseHelper cancel all outstanding handles (set a flag they check), or (c) at minimum, log an error in release builds before the assert. Option (a) is safest.
H3. fill_cache not validated across handles at commit — db_impl.cc:CommitFileIngestionHandles
  • Issue: The old IngestExternalFiles validated that all args had the same fill_cache value. The new CommitFileIngestionHandles uses hs[0]->fill_cache_ for the LogAndApply read options without validating consistency across all handles. When committing handles targeting different CFs with different fill_cache settings, the first handle's value silently wins.
  • Root cause: The fill_cache validation in PrepareFileIngestion only checks within a single prepare call's args. Cross-handle consistency is not checked at commit time.
  • Suggested fix: Add validation in CommitFileIngestionHandles that all handles have the same fill_cache_ value, or document the semantic that the first handle's value is used.

🟡 MEDIUM

M1. Debug-only validation for cross-DB handle misuse — db_impl.cc:CommitFileIngestionHandles
  • Issue: CommitFileIngestionHandles uses assert(h->db_ == this) to verify handles belong to this DB. In release builds, passing a handle from DB_A to DB_B silently proceeds with static_cast, causing undefined behavior as DB_A's job internals are accessed through DB_B's commit path.
  • Suggested fix: Add a runtime check: if (h->db_ != this) return Status::InvalidArgument(...).
M2. nullptr handle in commit vector — db_impl.cc:CommitFileIngestionHandles
  • Issue: The code has assert(handle) for each element but no runtime check. A nullptr in the vector crashes in release builds.
  • Suggested fix: Add a runtime null check before the cast.
M3. Metric behavioral change — db_impl.cc:PrepareFileIngestion / CommitFileIngestionHandles
  • Issue: The old code recorded INGEST_EXTERNAL_FILE_PREPARE_TIME only on final success. The new code records it in PrepareFileIngestion() even if Commit later fails. This is a subtle behavioral change that could affect monitoring dashboards. The PERF_TIMER_GUARD(file_ingestion_nanos) is only in IngestExternalFiles(), so direct PrepareFileIngestion() + CommitFileIngestionHandle() calls don't record the overall perf timer.
  • Suggested fix: Document this as an intentional change. Consider adding PERF_TIMER_GUARD to the commit path as well.
M4. MergeForSameColumnFamily leaves source job in partially valid state — external_sst_file_ingestion_job.cc
  • Issue: After merge, other->files_to_ingest_ is cleared but other->edit_, other->max_assigned_seqno_, and other fields retain stale values. When the commit fails and the handle destructor calls Cleanup() on the emptied job, Cleanup() calls DeleteInternalFiles() which iterates the now-empty files_to_ingest_ — this is safe (no-op), but the stale edit_ could theoretically cause issues if any code path inspects it.
  • Suggested fix: Consider clearing edit_ and max_assigned_seqno_ in the merged-from job for defense in depth.
M5. ingestion_options_ changed from reference to value — external_sst_file_ingestion_job.h:410
  • Issue: Changing from const IngestExternalFileOptions& to const IngestExternalFileOptions is necessary for the handle's lifetime semantics (the original IngestExternalFileArg goes out of scope after PrepareFileIngestion). This is correct. However, it introduces a copy for every job construction, including the existing single-call IngestExternalFiles path.
  • Suggested fix: This is acceptable — IngestExternalFileOptions is small and this is not a hot path. No action needed, noting for completeness.

🟢 LOW / NIT

L1. operator==(const IngestExternalFileOptions&) const = defaultoptions.h:2962
  • Issue: Using = default generates memberwise comparison. This is correct and future-safe (new fields are automatically included). However, it requires C++20. Verify the project's minimum C++ standard.
  • Suggested fix: Verify C++20 is the minimum standard. If not, implement explicit operator==.
L2. Naming inconsistency: PrepareFileIngestion vs CommitFileIngestionHandle(s)
  • Issue: "Prepare" uses "Ingestion" (noun) while "Commit" uses "Handle" (noun). Consider PrepareFileIngestion / CommitFileIngestion or PrepareIngestionHandle / CommitIngestionHandle for consistency.
  • Suggested fix: Minor naming nit — consider for API polish.
L3. Missing test for handle destructor rollback
  • Issue: The explicit Abort() path is tested, but implicit destructor rollback (handle goes out of scope without commit or abort) is only covered in stress tests, not in deterministic unit tests.
  • Suggested fix: Add a unit test that lets a handle go out of scope and verifies files are cleaned up.
L4. reserve added for ingestion_jobsdb_impl.cc:6884
  • Issue: Good addition. The old code was missing this reservation.

Cross-Component Analysis

Context Executes? Assumptions hold? Action needed?
DBImplReadOnly YES (inherits DBImpl) NO — should not allow ingestion Add NotSupported overrides (H1)
DBImplSecondary YES (inherits DBImpl) NO — should not allow ingestion Add NotSupported overrides (H1)
WritePreparedTxnDB YES (via StackableDB) YES — passthrough is correct Safe
WriteUnpreparedTxnDB YES (via StackableDB) YES — passthrough is correct Safe
BlobDB YES (via StackableDB) YES — passthrough is correct Safe
Crash between Prepare and Commit N/A Files on disk but not in MANIFEST Safe — recovery treats as obsolete
CF dropped between Prepare and Commit YES Commit checks IsDropped() Safe — returns error

Positive Observations

  • The parameterized test approach (two_phase_ingest_ as 3rd bool param) is excellent — it runs all existing ingestion tests through the two-phase path automatically, providing comprehensive coverage.
  • The stress test integration with 25% rollback probability is well-designed for catching edge cases.
  • The ingestion_options_ change from reference to value correctly addresses the lifetime issue introduced by the two-phase design.
  • The MergeForSameColumnFamily design with ComputeFilesOverlap + DivideInputFilesIntoBatches reuse is clean.
  • Good use of RAII pattern for handle cleanup.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 12, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit be3849f


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit be3849f


Summary

Solid architectural split of IngestExternalFiles into prepare/commit phases with good test coverage and stress test integration. The core design is sound and the implementation follows existing RocksDB patterns well. Several issues need attention, primarily around resource lifetime safety and a missing validation gap.

High-severity findings (3):

  • [db/db_impl/db_impl.cc:6735] CommitFileIngestionHandles does not validate fill_cache consistency across handles from different PrepareFileIngestion calls, allowing a silent behavioral divergence from the old API's strict validation.
  • [db/db_impl/db_impl.cc:6745] Cross-DB handle check uses assert(h->db_ == this) instead of a runtime error, causing undefined behavior in release builds if handles from a different DB instance are passed.
  • [db/external_sst_file_ingestion_job.h:225] ExternalSstFileIngestionJob stores a raw ColumnFamilyData* without calling cfd->Ref(). In the old synchronous API, the caller's ColumnFamilyHandle kept the CFD alive. With the split API, if the user destroys the ColumnFamilyHandle and drops the CF between Prepare and Commit, the CFD could be freed, causing use-after-free.
Full review (click to expand)

Findings

🔴 HIGH

H1. Missing fill_cache consistency validation across handles -- db/db_impl/db_impl.cc
  • Issue: The old IngestExternalFiles validated that all args have the same fill_cache value (line 6756-6759). In CommitFileIngestionHandles, read_options.fill_cache is set from hs[0]->fill_cache_ without checking that all handles agree. Two handles prepared separately with different fill_cache values for different CFs would silently use the first handle's value.
  • Root cause: The validation was in PrepareFileIngestion (within a single call), but when committing handles from different Prepare calls, no cross-handle validation exists.
  • Suggested fix: Add validation in CommitFileIngestionHandles that all handles have the same fill_cache_ value, or document that this is intentionally relaxed.
H2. Cross-DB handle validation uses assert, not runtime check -- db/db_impl/db_impl.cc
  • Issue: assert(h->db_ == this) in CommitFileIngestionHandles only fires in debug builds. Passing handles from a different DB instance in release builds causes undefined behavior.
  • Root cause: Defensive check was implemented as assert rather than runtime error.
  • Suggested fix: Replace with if (h->db_ != this) return Status::InvalidArgument("handle belongs to a different DB instance");
H3. ColumnFamilyData* not reference-counted in ExternalSstFileIngestionJob -- db/external_sst_file_ingestion_job.h:225
  • Issue: The job stores a raw ColumnFamilyData* cfd_ without calling cfd->Ref(). In the old synchronous API, the caller's ColumnFamilyHandle naturally kept the CFD alive for the duration of the call. With the split API, the user could destroy their ColumnFamilyHandle and call DropColumnFamily between Prepare and Commit. If no other references exist, ColumnFamilyData::UnrefAndTryDelete() (column_family.cc:801) would free the CFD, leaving cfd_ dangling.
  • Root cause: The old code implicitly relied on the caller keeping the ColumnFamilyHandle alive during the synchronous operation. The split API breaks this assumption.
  • Suggested fix: Either (a) call cfd_->Ref() in the job constructor and cfd_->UnrefAndTryDelete() in the destructor, or (b) have the FileIngestionHandleImpl hold the ColumnFamilyHandle* references to keep CFDs alive, or (c) document that the user must keep the ColumnFamilyHandle alive until commit/abort.

🟡 MEDIUM

M1. Histogram recording behavioral change -- db/db_impl/db_impl.cc
  • Issue: The old code recorded both INGEST_EXTERNAL_FILE_PREPARE_TIME and INGEST_EXTERNAL_FILE_RUN_TIME only on full success. The new code records prepare timing at the end of PrepareFileIngestion (on prepare success), independent of commit outcome. If prepare succeeds but commit fails, prepare timing is now recorded where it previously was not.
  • Suggested fix: This is arguably more accurate (the prepare did succeed), but if exact behavioral equivalence is desired for the IngestExternalFiles wrapper, defer prepare timing recording to commit success.
M2. PERF_TIMER_GUARD(file_ingestion_nanos) missing from two-phase API -- db/db_impl/db_impl.cc
  • Issue: The old IngestExternalFiles wrapper has PERF_TIMER_GUARD(file_ingestion_nanos) which measures the entire operation. Direct calls to PrepareFileIngestion + CommitFileIngestionHandle do not record file_ingestion_nanos. Users of the two-phase API lose this metric.
  • Suggested fix: Document this gap, or add the timer to both Prepare and Commit so total time can be reconstructed.
M3. API naming inconsistency -- include/rocksdb/db.h
  • Issue: Existing APIs use IngestExternalFile / IngestExternalFiles naming. New APIs use FileIngestion / FileIngestionHandle. Consider PrepareIngestExternalFiles / CommitIngestExternalFileHandle for consistency.
  • Suggested fix: Consider renaming for consistency, or document the rationale for the different naming.

🟢 LOW / NIT

L1. consumed_ thread safety -- db/db_impl/db_impl.cc
  • Issue: consumed_ is a non-atomic bool, so calling Abort() from one thread while the destructor fires on another would be a data race. However, the API contract states handles must not be used concurrently, and the FileIngestionHandle is non-copyable. This is a user contract violation similar to calling methods on a destroyed object.
  • Suggested fix: Making it std::atomic<bool> is cheap insurance against misuse, but not strictly required if the single-threaded-use contract is documented.
L2. CommitFileIngestionHandle (singular) is virtual -- include/rocksdb/db.h
  • Issue: The convenience method has a default implementation that calls CommitFileIngestionHandles. Making it virtual allows subclasses to override it independently, potentially creating inconsistency. The old IngestExternalFile -> IngestExternalFiles delegation is non-virtual.
  • Suggested fix: Consider making it non-virtual, or document that overrides must be consistent with the plural version.
L3. CloseHelper assert for outstanding handles -- db/db_impl/db_impl.cc:835
  • Issue: assert(num_outstanding_prepared_ingestions_.load() == 0) only fires in debug builds. In release builds, if a user violates the "handle must not outlive DB" contract, the handle's destructor accesses the destroyed DB. This matches how RocksDB handles similar contracts (e.g., iterators, snapshots), so it follows existing patterns.
  • Suggested fix: No change needed if following existing patterns. Optionally, log a warning or return early in the destructor if a "DB shutting down" flag is set.

Cross-Component Analysis

Context Safe? Notes
WritePreparedTxnDB Yes Inherits via StackableDB forwarding; no override of ingestion
ReadOnly DB Yes Returns NotSupported
Secondary Instance Yes Returns NotSupported
BlobDB Yes Does not override IngestExternalFiles
Concurrent ingestions Yes Same locking as before (per-CF ingest lock + DB mutex)
Compactions between Prepare/Commit Yes Level assignment happens during Run() under mutex
mutable_db_options_ changes Yes Job holds a reference to the member variable; reads current value at Run() time
User-defined timestamps Yes Same validation as before
atomic_replace_range + merge Yes MergeForSameColumnFamily rejects this combination

Positive Observations

  • The ingestion_options_ change from const& to value is correct and essential for the split lifetime.
  • ComputeFilesOverlap extraction is a good refactoring that makes the overlap check reusable for merge.
  • The ingestion_jobs.reserve(num_cfs) addition is a nice optimization.
  • The MergeForSameColumnFamily design correctly recomputes overlap and rebuilds batches.
  • Sequence number ordering for merged jobs is correct: appended files get later batches and higher seqnos.
  • Stress test coverage with both abort paths (explicit Abort + RAII destructor) is thorough.
  • The operator== on IngestExternalFileOptions using C++20 defaulted comparison is correct (all fields are trivially comparable) and forward-compatible.
  • Test parameterization to run all existing tests with two_phase_ingest_=true provides excellent regression coverage.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

joshkang97 added a commit to joshkang97/rocksdb that referenced this pull request Jun 12, 2026
)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit e182491


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

github-actions Bot commented Jun 12, 2026

Copy link
Copy Markdown

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit e182491


Summary

Solid design that correctly splits the heavy off-mutex work from the brief on-mutex commit. The refactoring preserves existing IngestExternalFiles() behavior by composing it from the new APIs. However, there are several issues ranging from a potential dangling-reference-on-mutation bug to missing subclass overrides and minor typos.

High-severity findings (3):

  • [external_sst_file_ingestion_job.h:405] MutableDBOptions& stored by reference in ExternalSstFileIngestionJob can be mutated between Prepare and Commit via SetDBOptions().
  • [compacted_db_impl.h] CompactedDBImpl does not override PrepareFileIngestion / CommitFileIngestionHandles, allowing ingestion on a compacted-mode DB that intends to block it.
  • [db_impl.cc] num_running_ingest_file_ semantic change from CF-count to merged-job-count after MergeForSameColumnFamily — increment/decrement are self-consistent but the semantic shift should be documented.
Full review (click to expand)

Findings

🔴 HIGH

H1. MutableDBOptions& reference may observe changed values between Prepare and Commit — external_sst_file_ingestion_job.h:405
  • Issue: ExternalSstFileIngestionJob stores const MutableDBOptions& mutable_db_options_ (a reference to DBImpl::mutable_db_options_). In the old single-call IngestExternalFiles(), the reference was consumed within one function scope. With the two-phase API, the job persists in a FileIngestionHandle across arbitrary time. If SetDBOptions() is called between Prepare and Commit, the job silently picks up the new values.
  • Root cause: ingestion_options_ was correctly changed from const& to by-value; mutable_db_options_ was not.
  • Suggested fix: Store MutableDBOptions by value, or capture a snapshot at Prepare time.
H2. CompactedDBImpl does not override new APIs — compacted_db_impl.h
  • Issue: CompactedDBImpl overrides IngestExternalFile() to return NotSupported, but does not override PrepareFileIngestion() or CommitFileIngestionHandles(). Since it inherits from DBImpl, the new methods will attempt real ingestion.
  • Suggested fix: Add NotSupported overrides.
H3. num_running_ingest_file_ semantic change after merge — db_impl.cc
  • Issue: Counter now reflects unique-after-merge job count rather than original CF count. Increment/decrement are self-consistent, but the semantic shift from the old code could affect external assumptions.
  • Suggested fix: Document the updated semantic with a comment.

🟡 MEDIUM

M1. Typo: "accross" → "across" in error message
M2. Statistics asymmetry: PREPARE_TIME recorded on prepare success even if commit fails
M3. PERF_TIMER_GUARD(file_ingestion_nanos) not tracked for direct two-phase API users
M4. Handle lifetime vs DB lifetime enforced only by debug assert
M5. No test for MergeForSameColumnFamily rejecting atomic_replace_range

🟢 LOW / NIT

L1. Release note: "its committed" → "it's committed"
L2. fill_cache_ validation deferred to commit time (by design, but worth documenting)

Cross-Component Analysis

Context Overridden? Action needed?
ReadOnly DB Yes → NotSupported OK
SecondaryInstance Yes → NotSupported OK
CompactedDB No — inherits DBImpl Add NotSupported overrides
StackableDB Forwarded OK
TransactionDB Inherits base OK

Positive Observations

  • Clean refactoring: IngestExternalFiles is now Prepare+Commit, ensuring all existing tests exercise the new path.
  • MergeForSameColumnFamily correctly rebuilds batches after appending files.
  • consumed_ flag + destructor rollback is a good safety net.
  • ingestion_options_ correctly changed from reference to value.
  • Thorough stress test and parametric test coverage.
  • ComputeFilesOverlap extraction improves reuse.

The full review is also saved to review-findings.md.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

)

Summary:

This change splits ingestion into two public calls: `PrepareFileIngestion()` performs all of the off-mutex work and returns an opaque `FileIngestionHandle`, and `CommitFileIngestionHandle()` makes the prepared files visible under the mutex.

Even though the "Prepare" phase was off the DB mutex, the application layer may still have application logic that depends on the completion of `IngestExternalFile`.

`IngestExternalFiles(args)` is also just `PrepareFileIngestion(args)` followed by `CommitFileIngestionHandle()`, so its behavior is unchanged.

`CommitFileIngestionHandles()` also supports committing multiple handles atomically. However, it is possible that a CF may be present in multiple handles. In this case, we merge the ingestion jobs together. This allows applications to prepare file ingestions at different times, but still provide a single atomic commit.

Differential Revision: D108225105
@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit be2931b


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit be2931b


Summary

Solid two-phase split of IngestExternalFile with clean handle-based ownership model. The core design is sound and consistent with RocksDB patterns. The implementation correctly handles the main success and failure paths, but has several issues that warrant attention before merge.

High-severity findings (2):

  • [db_impl.cc:CommitFileIngestionHandles] fill_cache consistency check uses only the first handle's value, but same-CF merge accepts handles with different fill_cache through the ingestion_options_ equality check -- inconsistent validation layers.
  • [external_sst_file_ingestion_job.h:405] mutable_db_options_ and env_options_ remain stored by reference in ExternalSstFileIngestionJob; while technically safe today (reads under mutex), this is fragile as the job now lives across a longer time window between Prepare and Commit.

Medium-severity findings (5):

  • No runtime enforcement that handles don't outlive DB (debug assert only)
  • CF drop between Prepare/Commit could leave dangling ColumnFamilyData* (no Ref taken)
  • operator== = default on IngestExternalFileOptions is a maintenance trap
  • Missing PERF_TIMER_GUARD for direct two-phase callers
  • Stress test doesn't exercise multi-CF multi-handle commits

Low-severity findings (5): Good reserve() addition, seq_cst overkill for counter, missing tests for CF-drop and concurrent-writes edge cases.

Full details with root causes and suggested fixes are in review-findings.md.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@meta-codesync meta-codesync Bot closed this in 4d30cfd Jun 13, 2026
@meta-codesync

meta-codesync Bot commented Jun 13, 2026

Copy link
Copy Markdown

This pull request has been merged in 4d30cfd.

@meta-codesync meta-codesync Bot added the Merged label Jun 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant