Commit d3c3bfe

Add built-in operator statistics providers
Adds a set of default StatisticsProvider implementations that cover the most common physical operators:

- FilterStatisticsProvider: selectivity-based row count, reuses the same pre-enhanced child statistics, with post-filter NDV adjustment
- ProjectionStatisticsProvider: column mapping through projections
- PassthroughStatisticsProvider: passthrough for cardinality-preserving operators (Sort, Repartition, Window, etc.) via CardinalityEffect
- AggregateStatisticsProvider: NDV-product estimation for GROUP BY, delegates for Partial mode and multiple grouping sets (apache#20926)
- JoinStatisticsProvider: NDV-based join output estimation (hash, sort-merge, nested-loop, cross) with join-type-aware cardinality bounds and correct key-column NDV lookup
- LimitStatisticsProvider: caps output at the fetch limit (local and global)
- UnionStatisticsProvider: sums input row counts
- DefaultStatisticsProvider: fallback to partition_statistics(None)
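
The providers themselves are not shown in the hunks on this page, so the following is only a rough sketch of the kind of row-count arithmetic the message describes. It uses textbook cardinality formulas rather than the code added by this commit. All function names are hypothetical, and the specific formulas (capping the NDV product at the input row count, dividing the join cross product by the larger key NDV) are assumptions, not a description of DataFusion's actual behavior.

```rust
// Hypothetical helpers illustrating textbook cardinality estimates.
// None of these names exist in DataFusion; they are assumptions for this sketch.

/// Filter: scale the input row count by a selectivity clamped to [0, 1].
fn filter_rows(input_rows: u64, selectivity: f64) -> u64 {
    (input_rows as f64 * selectivity.clamp(0.0, 1.0)).round() as u64
}

/// GROUP BY: product of the per-key NDVs, capped at the input row count
/// (there cannot be more groups than input rows).
fn aggregate_rows(input_rows: u64, key_ndvs: &[u64]) -> u64 {
    let product = key_ndvs
        .iter()
        .fold(1u64, |acc, &ndv| acc.saturating_mul(ndv.max(1)));
    product.min(input_rows)
}

/// Inner equi-join: |L| * |R| / max(ndv_L, ndv_R), computed in u128
/// so the cross product cannot overflow.
fn inner_join_rows(left_rows: u64, right_rows: u64, left_ndv: u64, right_ndv: u64) -> u64 {
    let denom = left_ndv.max(right_ndv).max(1) as u128;
    let estimate = (left_rows as u128 * right_rows as u128) / denom;
    estimate.min(u64::MAX as u128) as u64
}

/// Limit: the output can never exceed the fetch limit.
fn limit_rows(input_rows: u64, fetch: u64) -> u64 {
    input_rows.min(fetch)
}

/// Union: sum of the input row counts, saturating on overflow.
fn union_rows(inputs: &[u64]) -> u64 {
    inputs.iter().fold(0u64, |acc, &rows| acc.saturating_add(rows))
}
```

For example, under these assumptions a 1000-row input grouped on keys with NDVs 10 and 20 is estimated at 200 groups, while a 100-row input with the same keys is capped at 100.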
1 parent 84d861a commit d3c3bfe

3 files changed: +1527 -45 lines changed

datafusion/physical-plan/src/filter.rs

Lines changed: 1 addition & 1 deletion
@@ -309,7 +309,7 @@ impl FilterExec {
     }
 
     /// Calculates `Statistics` for `FilterExec`, by applying selectivity (either default, or estimated) to input statistics.
-    fn statistics_helper(
+    pub(crate) fn statistics_helper(
         schema: &SchemaRef,
         input_stats: Statistics,
         predicate: &Arc<dyn PhysicalExpr>,

datafusion/physical-plan/src/lib.rs

Lines changed: 1 addition & 1 deletion
@@ -80,14 +80,14 @@ pub mod joins;
 pub mod limit;
 pub mod memory;
 pub mod metrics;
+pub mod operator_statistics;
 pub mod placeholder_row;
 pub mod projection;
 pub mod recursive_query;
 pub mod repartition;
 pub mod sort_pushdown;
 pub mod sorts;
 pub mod spill;
-pub mod operator_statistics;
 pub mod stream;
 pub mod streaming;
 pub mod tree_node;
