
Performance improvements #10

Merged
jgarzik merged 13 commits into main from updates
Feb 11, 2026

Conversation

@jgarzik (Owner) commented Feb 11, 2026

No description provided.

jgarzik and others added 10 commits February 10, 2026 18:36
Implement MySQL binary protocol prepared statements (COM_STMT_PREPARE,
COM_STMT_EXECUTE, COM_STMT_CLOSE, COM_STMT_RESET) using a parse-once,
substitute-per-execute design. Parameters are substituted at the AST
level before entering the resolve/typecheck/plan/execute pipeline.

Fix AUTO_INCREMENT: parse DialectSpecific tokens, skip NOT NULL checks
for auto_increment columns, and generate sequential IDs during INSERT.

New files:
- src/protocol/roodb/binary.rs: binary result set encoding
- tests/roodb_suite/prepared/: 14 integration tests covering SELECT,
  INSERT, UPDATE, DELETE, re-execute, NULL params, multiple params,
  type coverage, sysbench patterns, and auto_increment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
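
The integration tests listed above include NULL parameters. In the MySQL binary protocol, COM_STMT_EXECUTE flags NULL parameters with a bitmap of `(num_params + 7) / 8` bytes, one bit per parameter, least-significant bit first. A minimal sketch of reading that bitmap follows; the function names are hypothetical and are not taken from RooDB's source.

```rust
/// Number of bytes in the COM_STMT_EXECUTE NULL bitmap for `num_params` parameters.
fn null_bitmap_len(num_params: usize) -> usize {
    (num_params + 7) / 8
}

/// Returns true if parameter `index` is flagged NULL in `bitmap`.
/// Bits are packed least-significant-bit first within each byte.
fn param_is_null(bitmap: &[u8], index: usize) -> bool {
    match bitmap.get(index / 8) {
        Some(byte) => byte & (1u8 << (index % 8)) != 0,
        // Assumption for this sketch: a too-short bitmap means "not NULL".
        // A real decoder would more likely reject the packet.
        None => false,
    }
}
```

With nine parameters the bitmap is two bytes long, and parameter 8 is bit 0 of the second byte.
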
Fix three bugs blocking sysbench workloads:

- COM_STMT_EXECUTE: remember parameter types across executions
  (new_params_bound_flag=0 means reuse previous types, not default
  to VarString)
- Prepared statement path: handle BEGIN/COMMIT/ROLLBACK/SET statements
  that sysbench sends as prepared stmts instead of text protocol
- SSTable: support multi-page index blocks when many data blocks cause
  index entries to exceed a single 4KB page

Additional protocol fixes for C libmysqlclient compatibility:
- Send actual column definitions in COM_STMT_PREPARE_OK response
- Advertise CLIENT_DEPRECATE_EOF capability
- Add catch-all SET handler for SET NAMES, SET charset, etc.

Add bench/ directory with sysbench benchmark scripts (run.sh,
compare.sh) comparing RooDB vs MySQL 8.0 across oltp_point_select,
oltp_read_only, oltp_read_write, and oltp_write_only workloads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
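
The first fix above (reuse parameter types when `new_params_bound_flag=0`) can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `PreparedStmt`, its fields, and `read_param_types` are hypothetical names, and the buffer is assumed to start at the flag byte that follows the NULL bitmap.

```rust
/// Hypothetical per-statement state; field names are illustrative only.
#[derive(Debug, Default)]
struct PreparedStmt {
    num_params: usize,
    /// (type byte, flag byte) pairs remembered from the last execute that sent them.
    param_types: Vec<(u8, u8)>,
}

/// Reads `new_params_bound_flag` and, if it is 1, the 2-byte type entries.
/// Returns the number of bytes consumed from `buf`.
fn read_param_types(stmt: &mut PreparedStmt, buf: &[u8]) -> Result<usize, String> {
    if stmt.num_params == 0 {
        return Ok(0); // no parameters: neither flag nor types are sent
    }
    let flag = *buf.first().ok_or("missing new_params_bound_flag")?;
    if flag == 1 {
        let end = 1 + stmt.num_params * 2;
        let raw = buf.get(1..end).ok_or("truncated parameter type block")?;
        stmt.param_types = raw.chunks_exact(2).map(|c| (c[0], c[1])).collect();
        Ok(end)
    } else {
        // flag == 0: the client sent no types, so keep the previously bound ones
        // instead of falling back to a default such as VarString.
        if stmt.param_types.len() != stmt.num_params {
            return Err("no parameter types bound yet".to_string());
        }
        Ok(1)
    }
}
```

The design point is that the remembered types live on the statement, so a second execute that sends only values still decodes them with the types from the first.
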
…S improvement

Three performance improvements to close the gap with MySQL on sysbench:

1. Block Cache (LRU, 64MB default): Caches SSTable data/index blocks to
   eliminate redundant disk I/O for hot data. Integrated into SstableReader
   and LsmEngine with compaction-based invalidation.

2. Prepared Statement Plan Cache: Caches PhysicalPlan after first execution
   using Literal::Placeholder(n) for parameter positions. Subsequent executions
   clone and substitute params directly, skipping resolve/typecheck/optimize/plan.
   Schema version tracking on Catalog invalidates cache on DDL changes.

3. PointGet Executor Infrastructure: PhysicalPlan::PointGet variant, executor,
   explain/cost support. Optimizer detection disabled because storage keys are
   auto-generated row IDs (not PK values); needs PK→row_key index to activate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
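
Item 2 above ties plan-cache validity to a schema version on the catalog. A rough sketch of that idea, assuming a cache keyed by statement id; `Plan`, `CachedPlan`, and `PlanCache` are stand-in names invented for this example and the real `PhysicalPlan` is far richer.

```rust
use std::collections::HashMap;

/// Stand-in for a cached physical plan.
#[derive(Clone, Debug, PartialEq)]
struct Plan(String);

struct CachedPlan {
    schema_version: u64,
    plan: Plan,
}

#[derive(Default)]
struct PlanCache {
    entries: HashMap<u32, CachedPlan>,
}

impl PlanCache {
    /// Returns a clone of the cached plan if it was built against `current_version`.
    /// A stale entry is dropped so the caller falls back to full planning.
    fn get(&mut self, stmt_id: u32, current_version: u64) -> Option<Plan> {
        let fresh = self
            .entries
            .get(&stmt_id)
            .map(|e| e.schema_version == current_version);
        match fresh {
            Some(true) => self.entries.get(&stmt_id).map(|e| e.plan.clone()),
            Some(false) => {
                self.entries.remove(&stmt_id);
                None
            }
            None => None,
        }
    }

    /// Stores a plan together with the schema version it was planned against.
    fn insert(&mut self, stmt_id: u32, schema_version: u64, plan: Plan) {
        self.entries.insert(stmt_id, CachedPlan { schema_version, plan });
    }
}
```

Any DDL that bumps the version makes every older entry miss on its next lookup, which avoids tracking which tables each plan touches.
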
…S improvement

Performance optimizations achieving 1,430x improvement on point_select:

- Buffered I/O: BufWriter/BufReader on TLS streams
- Memtable threshold: 64MB (was 4MB)
- PK-as-storage-key: order-preserving key encoding, PointGet O(1) lookups
- Streaming TableScan: decode one row at a time in next()
- Removed per-commit flush: Raft-as-WAL provides durability
- SSTable bloom filters: 10 bits/key, 3 hashes, skip SSTables on miss
- Two-pass OCC in Raft apply: prevents false conflicts within changesets
- L0 SSTable ordering: newest-first to avoid stale reads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
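
The bloom filter bullet above gives the sizing (10 bits per key, 3 hashes) but not the construction. The sketch below is one common way to build such a filter, using double hashing over the standard library's `DefaultHasher`; the hash choice and the `Bloom` type are assumptions made for this example, not what `src/storage/lsm/bloom.rs` does.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct Bloom {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u32,
}

impl Bloom {
    fn new(expected_keys: usize, bits_per_key: usize, num_hashes: u32) -> Self {
        let num_bits = (expected_keys * bits_per_key).max(64) as u64;
        let words = ((num_bits + 63) / 64) as usize;
        Bloom { bits: vec![0; words], num_bits, num_hashes }
    }

    /// Two base hashes; the i-th probe is a + i * b (double hashing).
    fn hash_pair(key: &[u8]) -> (u64, u64) {
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        (key, 0x9e37_79b9u32).hash(&mut h2);
        (h1.finish(), h2.finish() | 1) // force b odd so probes differ
    }

    fn bit_index(&self, key: &[u8], i: u32) -> u64 {
        let (a, b) = Self::hash_pair(key);
        a.wrapping_add(u64::from(i).wrapping_mul(b)) % self.num_bits
    }

    fn insert(&mut self, key: &[u8]) {
        for i in 0..self.num_hashes {
            let bit = self.bit_index(key, i);
            self.bits[(bit / 64) as usize] |= 1u64 << (bit % 64);
        }
    }

    /// False means the key is definitely absent; true means "maybe present".
    fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.num_hashes).all(|i| {
            let bit = self.bit_index(key, i);
            self.bits[(bit / 64) as usize] & (1u64 << (bit % 64)) != 0
        })
    }
}
```

A reader would call `may_contain` before touching an SSTable and skip the file on `false`; a `true` still requires the real lookup because false positives are possible.
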
…d batched block reads

Eliminates per-scan I/O overhead by caching open SSTable readers with
write-through on flush/compaction and read-allocate on miss. Bloom filters
are now cached in the block cache. New SSTables are eagerly loaded into
both caches on creation. Also adds RangeScan physical plan node for
PK-bounded scans and batched block reads for scan_range.

Benchmark (oltp_read_write): 4.34 -> 5.99 TPS (+38%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…MVCC, allocation reduction

RooDB now beats MySQL 8.0 on oltp_read_only (935 TPS vs 612 TPS).

Key changes:
- Enable TCP_NODELAY on client connections (fixes 40ms Nagle delays — dominant fix)
- Replace tokio Mutex with parking_lot RwLock for manifest, snapshot-then-release pattern
- Zero-copy MVCC filtering via drain() instead of to_vec() for visible rows
- Per-connection MvccStorage to eliminate 16 Arc::new per transaction
- ReadView fast path for empty active transaction sets (skip HashSet)
- Zero-copy SSTable scan via into_entries() instead of clone per entry
- Per-query timing instrumentation (RUST_LOG=roodb::perf=debug)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
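
The ReadView fast path above can be illustrated with a simplified visibility rule. The rule used here (a transaction is visible if it started before the snapshot and was not in flight) is a textbook simplification and an assumption of this sketch; the struct layout is hypothetical, although the file summary for this PR does mention an `all_committed` fast path.

```rust
use std::collections::HashSet;

/// Hypothetical snapshot of transaction state; field names are illustrative.
struct ReadView {
    /// Transactions with id >= this started after the snapshot and are invisible.
    upper_bound: u64,
    /// Transactions still in flight when the snapshot was taken.
    active: HashSet<u64>,
    /// Computed once at construction: true when `active` is empty.
    all_committed: bool,
}

impl ReadView {
    fn new(upper_bound: u64, active: HashSet<u64>) -> Self {
        let all_committed = active.is_empty();
        ReadView { upper_bound, active, all_committed }
    }

    fn is_visible(&self, txn_id: u64) -> bool {
        if txn_id >= self.upper_bound {
            return false;
        }
        // Fast path: nothing was in flight, so skip the HashSet probe entirely.
        if self.all_committed {
            return true;
        }
        !self.active.contains(&txn_id)
    }
}
```

On a read-mostly workload most snapshots have an empty active set, so the per-row check reduces to one comparison and one boolean test.
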
…lect, skip vote flush, batch apply

Single-node mode: disable tick loop, heartbeat, and election (pure event-driven),
increase purge_batch_size to 1000, skip flush_critical in save_vote, persist
LAST_APPLIED_KEY once per batch instead of per-entry, reduce startup sleep
from 500ms to 10ms. Add perf instrumentation to all Raft storage operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminates full table scans for UPDATE/DELETE with primary key equality
filters. Previously only SELECT used PointGet; UPDATE/DELETE did O(n)
scans even for single-row PK lookups, causing 90x regression at 100K rows.

- Add MvccStorage::get_with_version() for single-key OCC lookup
- Add key_value field to PhysicalPlan::Update/Delete variants
- Physical planner extracts PointGet for DML via extract_point_get()
- Update/Delete executors: O(1) path with write buffer + MVCC lookup
- substitute_params handles key_value for prepared statement cache

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
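
To make the planner change above concrete, here is a minimal sketch of recognising a `pk = <literal>` filter. The function name `extract_point_get` comes from the commit message, but the body, the `Expr` and `Literal` types, and the single-column primary key are assumptions made for illustration.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Literal {
    Int(i64),
    Text(String),
}

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Lit(Literal),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Returns the key literal if `filter` is exactly `pk = <literal>`,
/// in either operand order. Anything else falls back to a scan.
fn extract_point_get(filter: &Expr, pk_column: &str) -> Option<Literal> {
    if let Expr::Eq(lhs, rhs) = filter {
        match (lhs.as_ref(), rhs.as_ref()) {
            (Expr::Column(c), Expr::Lit(v)) | (Expr::Lit(v), Expr::Column(c))
                if c == pk_column =>
            {
                return Some(v.clone());
            }
            _ => {}
        }
    }
    None
}
```

This sketch deliberately rejects conjunctions such as `id = 1 AND k = 2`; whether the real planner handles those is not stated in the commit message.
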
The UPDATE/DELETE PointGet fast path incorrectly checked the write buffer
and used creator_txn_id as the OCC version. This caused false write
conflicts when two DML statements in the same transaction touched the
same row. The full-scan path reads from MVCC storage (not the buffer),
so the PointGet path should match that behavior.

Also adds benchmarking policy to CLAUDE.md: always run all workloads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously DROP TABLE only removed catalog entries (system.tables,
system.columns, system.constraints) but left user data rows (t:{table}:r:*)
orphaned in storage. This caused data accumulation across CREATE/DROP cycles
and benchmark suite hangs with large tables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
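
The DROP TABLE fix above amounts to a prefix delete over the `t:{table}:r:` key range. A small sketch, using a `BTreeMap` as a stand-in for the storage engine (an assumption; the real engine is the LSM store) and a hypothetical helper name:

```rust
use std::collections::BTreeMap;

/// Deletes every key starting with `t:{table}:r:` and returns how many were removed.
fn delete_table_rows(store: &mut BTreeMap<Vec<u8>, Vec<u8>>, table: &str) -> usize {
    let prefix = format!("t:{}:r:", table).into_bytes();
    // Keys are sorted, so all keys sharing the prefix form one contiguous run.
    let doomed: Vec<Vec<u8>> = store
        .range(prefix.clone()..)
        .take_while(|(k, _)| k.starts_with(&prefix))
        .map(|(k, _)| k.clone())
        .collect();
    for key in &doomed {
        store.remove(key);
    }
    doomed.len()
}
```

Because the prefix ends with `:r:`, dropping `users` leaves rows of a table named `users2` untouched.
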
@jgarzik jgarzik requested a review from Copilot February 11, 2026 06:28
@jgarzik jgarzik self-assigned this Feb 11, 2026
@jgarzik jgarzik added the bug and enhancement labels Feb 11, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a broad set of performance-focused improvements across RooDB’s storage engine and query execution pipeline, and adds end-to-end prepared statement support (COM_STMT_PREPARE / COM_STMT_EXECUTE) to better match MySQL client behavior and sysbench patterns.

Changes:

  • Add prepared statement support with plan caching, binary result-set encoding, and placeholder-aware resolve/typecheck/planning.
  • Improve LSM read performance via block cache + bloom filters, batched block reads, reader caching, and reduced lock contention.
  • Add PK-based fast paths (PointGet/RangeScan) plus INSERT auto_increment handling and related planner/executor updates.

Reviewed changes

Copilot reviewed 54 out of 56 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/roodb_suite/prepared/mod.rs | Adds prepared-statement test module wiring. |
| tests/roodb_suite/prepared/basic.rs | New integration tests for prepared SELECT/INSERT/UPDATE/DELETE and sysbench-like patterns. |
| tests/roodb_suite/prepared/auto_increment.rs | Adds AUTO_INCREMENT integration tests (including prepared INSERT). |
| tests/roodb_suite/mod.rs | Exposes the new prepared test suite. |
| tests/roodb_suite/cluster/replication.rs | Updates RaftNode constructor calls with single-node flag parameter. |
| tests/raft_cluster.rs | Updates RaftNode constructor calls with single-node flag parameter. |
| tests/planner_tests.rs | Asserts planner extracts PK key_value for UPDATE/DELETE point-get fast paths. |
| tests/executor_tests.rs | Updates tests for PK-key encoding and new plan fields (pk indices / key_value). |
| src/txn/read_view.rs | Adds all_committed fast-path to MVCC visibility checks. |
| src/txn/mvcc_storage.rs | Avoids copies in visibility traversal; adds get_with_version() for OCC. |
| src/txn/manager.rs | Removes per-commit storage flush; optimizes read-view creation allocation behavior. |
| src/storage/lsm/sstable.rs | SSTable v2: multi-page index, bloom filter in footer, optional block cache, batched scans, eager load. |
| src/storage/lsm/mod.rs | Exposes new block_cache and bloom modules. |
| src/storage/lsm/memtable.rs | Raises memtable flush threshold to 64MB. |
| src/storage/lsm/engine.rs | Adds block cache + reader cache; reduces locking during scans; introduces batched SSTable scan path. |
| src/storage/lsm/bloom.rs | Implements Bloom filter + tests. |
| src/storage/lsm/block_cache.rs | Adds LRU block cache implementation + tests. |
| src/storage/lsm/block.rs | Adds into_entries() to support zero-copy-ish scan paths. |
| src/sql/typecheck.rs | Treats placeholders as type-compatible to enable prepared-plan templates. |
| src/sql/resolver.rs | Adds placeholder mode and AUTO_INCREMENT parsing + insert nullability handling. |
| src/server/listener.rs | Removes TransactionManager storage-flush dependency; tweaks test server leader wait. |
| src/server/handler.rs | Disables Nagle’s algorithm (TCP_NODELAY) for lower latency. |
| src/raft/node.rs | Adds single-node Raft config mode; checks state-machine response for application errors; adds perf logs. |
| src/raft/lsm_storage.rs | Adds single-node mode flush skipping for votes; adds perf logs; refactors OCC apply into two-pass. |
| src/protocol/roodb/prepared.rs | Implements prepared statement manager, placeholder counting, parameter decoding, and substitution helpers. |
| src/protocol/roodb/mod.rs | Integrates prepared statements into command loop; adds binary resultsets; adds plan cache usage; buffers IO; adds perf timing. |
| src/protocol/roodb/metrics.rs | Adds lightweight per-query timing logging utilities. |
| src/protocol/roodb/handshake.rs | Advertises CLIENT_DEPRECATE_EOF capability. |
| src/protocol/roodb/command.rs | Parses COM_STMT_EXECUTE raw payload; adds COM_STMT_RESET support. |
| src/protocol/roodb/binary.rs | Adds binary row encoding for prepared-statement result sets. |
| src/planner/physical.rs | Adds PointGet/RangeScan plans, prepared param substitution, PK bound extraction, and insert metadata (pk/auto-inc indices). |
| src/planner/logical/mod.rs | Adds Literal::Placeholder. |
| src/planner/logical/builder.rs | Propagates placeholder handling into type inference/naming. |
| src/planner/explain.rs | Adds explain output for PointGet/RangeScan and updates Delete formatting. |
| src/planner/cost.rs | Adds cost estimates for PointGet/RangeScan. |
| src/main.rs | Switches to scheduled IO factory; updates RaftNode init signature. |
| src/executor/update.rs | Adds UPDATE PointGet fast path using OCC version reads. |
| src/executor/scan.rs | Makes TableScan streaming by lazily decoding rows in next(). |
| src/executor/range_scan.rs | Adds executor to scan PK ranges with tight storage bounds + buffer merge. |
| src/executor/point_get.rs | Adds executor for single-row PK lookup. |
| src/executor/mod.rs | Exposes new executors. |
| src/executor/insert.rs | Adds auto_increment handling, last_insert_id reporting, and PK-based key encoding. |
| src/executor/engine.rs | Wires new plans/executors and passes insert/update/delete metadata. |
| src/executor/encoding.rs | Adds PK-based key encoding and comparable datum encoding. |
| src/executor/delete.rs | Adds DELETE PointGet fast path using OCC version reads. |
| src/executor/ddl.rs | Ensures DROP TABLE also deletes user data rows for the table. |
| src/executor/datum.rs | Panics on unsubstituted placeholders to prevent executing unbound plan templates. |
| src/executor/context.rs | Adds buffered range retrieval to support range scans / filtered buffer merges. |
| src/catalog/mod.rs | Adds schema_version for prepared plan cache invalidation. |
| bench/run.sh | Adds sysbench benchmark harness to compare RooDB vs MySQL and emit JSON results. |
| bench/results/.gitkeep | Keeps benchmark results directory in git. |
| bench/compare.sh | Adds JSON benchmark comparison utility. |
| bench/.gitignore | Ignores generated benchmark JSON output. |
| Cargo.toml | Adds lru dependency for block cache. |
| Cargo.lock | Locks lru dependency versions (adds lru 0.12.5 alongside existing lru 0.14.0). |
| CLAUDE.md | Documents benchmarking expectations and policies. |
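
The entry for src/executor/encoding.rs mentions comparable datum encoding, which is what lets PK values serve directly as sorted storage keys. The summary does not show the scheme, so the following is only one widely used technique for signed integers (flip the sign bit, then store big-endian); the function names are hypothetical.

```rust
/// Encodes a signed 64-bit integer so that byte-wise comparison of the
/// output matches numeric comparison of the input.
fn encode_i64_key(v: i64) -> [u8; 8] {
    // Flipping the sign bit maps i64::MIN..=i64::MAX onto 0..=u64::MAX in order;
    // big-endian puts the most significant byte first for lexicographic compare.
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

/// Inverse of `encode_i64_key`.
fn decode_i64_key(bytes: [u8; 8]) -> i64 {
    (u64::from_be_bytes(bytes) ^ (1u64 << 63)) as i64
}
```

With this encoding, `-1` sorts immediately before `0` as raw bytes, so a range scan over encoded keys visits rows in primary-key order.
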


Comment thread src/storage/lsm/sstable.rs Outdated
Comment thread src/storage/lsm/engine.rs
Comment thread src/storage/lsm/engine.rs
Comment thread src/protocol/roodb/mod.rs Outdated
Comment thread src/protocol/roodb/mod.rs
Comment thread src/planner/physical.rs Outdated
Comment thread src/storage/lsm/sstable.rs Outdated
jgarzik and others added 3 commits February 11, 2026 06:55
…, double open

- sstable.rs: keep Arc<Vec<u8>> on index block cache hit instead of
  cloning the entire Vec
- engine.rs: on reader_cache miss in read_all() and scan(), fall back
  to opening the SSTable on demand instead of silently skipping it
- mod.rs: remove duplicate executor.open() in send_binary_result_set()
  (caller already opens before passing it in)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
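
The sstable.rs item above is about returning the shared `Arc<Vec<u8>>` on a cache hit rather than copying the bytes. A minimal sketch of the difference, with a hypothetical `BlockCache` backed by a plain `HashMap` (the PR's cache is LRU-based, which is omitted here):

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical cache keyed by block id; eviction is left out of this sketch.
struct BlockCache {
    blocks: HashMap<u64, Arc<Vec<u8>>>,
}

impl BlockCache {
    /// Returns a shared handle to the cached block.
    /// Cloning the Arc bumps a reference count; no block bytes are copied.
    fn get(&self, block_id: u64) -> Option<Arc<Vec<u8>>> {
        self.blocks.get(&block_id).cloned()
    }
}
```

Two callers that hit the same block end up pointing at the same allocation, which matters most for index blocks that are read on nearly every lookup.
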
…oom comment

- physical.rs: substitute_params/substitute_expr now return PlannerResult;
  out-of-range placeholder index returns InvalidPlan error instead of
  silently substituting NULL
- mod.rs: propagate substitute_params errors with ?
- sstable.rs: fix misleading comment on bloom filter mem::swap pattern
- TODO.md: document SELECT * prepared statement limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
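
The physical.rs change above makes parameter substitution fallible. The sketch below shows the shape of that behaviour on a toy expression tree. The types are stand-ins, `PlanError` is a hypothetical substitute for the planner's error type, and zero-based placeholder indices are an assumption.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Literal {
    Int(i64),
    Null,
    Placeholder(usize), // assumed zero-based in this sketch
}

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Lit(Literal),
    Eq(Box<Expr>, Box<Expr>),
}

#[derive(Debug, PartialEq)]
enum PlanError {
    InvalidPlan(String),
}

/// Replaces every placeholder with the matching bound parameter.
/// An index past the end of `params` is an error, never a silent NULL.
fn substitute_expr(expr: &Expr, params: &[Literal]) -> Result<Expr, PlanError> {
    match expr {
        Expr::Lit(Literal::Placeholder(n)) => {
            params.get(*n).cloned().map(Expr::Lit).ok_or_else(|| {
                PlanError::InvalidPlan(format!(
                    "placeholder {} out of range ({} params bound)",
                    n,
                    params.len()
                ))
            })
        }
        Expr::Eq(lhs, rhs) => Ok(Expr::Eq(
            Box::new(substitute_expr(lhs, params)?),
            Box::new(substitute_expr(rhs, params)?),
        )),
        other => Ok(other.clone()),
    }
}
```

Callers propagate the error with `?`, so a client that binds too few parameters gets an error response instead of a query that quietly compares against NULL.
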
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jgarzik jgarzik merged commit 314b9fb into main Feb 11, 2026
10 checks passed
@jgarzik jgarzik deleted the updates branch February 11, 2026 07:37
