Commit e8d217a
perf: use DynComparator in sort-merge join (SMJ), microbenchmark queries up to 12% faster, TPC-H overall ~5% faster (#21484)
## Which issue does this PR close?
Partially addresses #20910.
## Rationale for this change
Sort merge join comparisons (`compare_join_arrays`,
`is_join_arrays_equal`) do a `match DataType` + `downcast_ref` on every
call, per column. These are called per-row in hot join loops across SMJ,
semi/anti/mark SMJ, and piecewise merge join.
`arrow_ord::ord::make_comparator` does the type dispatch once at
construction and returns a `DynComparator` closure that goes straight to
typed value comparison. Arrow's own `LexicographicalComparator` uses
this pattern for sorting — we should use it for joins too.
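
The difference between the two approaches can be sketched without Arrow at all. The block below is a minimal std-only illustration, not the DataFusion or `arrow_ord` code: `Column`, `DynCmp`, `compare_per_call`, and `make_cmp` are hypothetical names, and `Column` is a toy stand-in for a dynamically typed array.

```rust
use std::cmp::Ordering;

/// Toy stand-in for a dynamically typed column (hypothetical, not an Arrow type).
#[derive(Clone)]
pub enum Column {
    Int(Vec<i64>),
    Text(Vec<String>),
}

/// Compares row `i` of the left column with row `j` of the right column.
pub type DynCmp = Box<dyn Fn(usize, usize) -> Ordering>;

/// Per-call dispatch: the `match` on the column type runs for every row pair.
pub fn compare_per_call(
    l: &Column,
    r: &Column,
    i: usize,
    j: usize,
) -> Result<Ordering, String> {
    match (l, r) {
        (Column::Int(a), Column::Int(b)) => Ok(a[i].cmp(&b[j])),
        (Column::Text(a), Column::Text(b)) => Ok(a[i].cmp(&b[j])),
        _ => Err("mismatched column types".to_string()),
    }
}

/// One-time dispatch: the `match` runs once, and the returned closure
/// already holds typed data, so each call is a plain value comparison.
pub fn make_cmp(l: &Column, r: &Column) -> Result<DynCmp, String> {
    match (l, r) {
        (Column::Int(a), Column::Int(b)) => {
            // Cloning a Vec copies the data; this is only for the toy example.
            let (a, b) = (a.clone(), b.clone());
            Ok(Box::new(move |i: usize, j: usize| a[i].cmp(&b[j])))
        }
        (Column::Text(a), Column::Text(b)) => {
            let (a, b) = (a.clone(), b.clone());
            Ok(Box::new(move |i: usize, j: usize| a[i].cmp(&b[j])))
        }
        _ => Err("mismatched column types".to_string()),
    }
}
```

In a hot loop, `make_cmp` would be called once per pair of columns and the resulting closure invoked per row, so a type mismatch surfaces before the loop starts instead of inside it.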
## What changes are included in this PR?
Adds `JoinKeyComparator` to `joins/utils.rs`: a thin wrapper around
`Vec<DynComparator>` built once per batch pair. Null handling (under
`NullEqualsNothing`, a both-null pair is overridden to `Less`) is baked
into the closures at construction time so `compare()` is a branchless loop.
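
As a rough illustration of that design, here is a minimal std-only sketch. It is not the code from this PR: the struct body, `column_cmp`, the `NullEquality` enum, and the use of `Vec<Option<i64>>` as a nullable column are all assumptions made for the example, and nulls are assumed to sort first.

```rust
use std::cmp::Ordering;

/// Row-pair comparator: (left row index, right row index) -> Ordering.
type DynCmp = Box<dyn Fn(usize, usize) -> Ordering>;

/// Hypothetical mirror of the two null-equality modes named above.
#[derive(Clone, Copy, PartialEq)]
pub enum NullEquality {
    NullEqualsNull,
    NullEqualsNothing,
}

/// Builds one comparator for a nullable i64 column pair.
/// The null mode is chosen here, once, not on every call.
fn column_cmp(
    left: Vec<Option<i64>>,
    right: Vec<Option<i64>>,
    mode: NullEquality,
) -> DynCmp {
    match mode {
        // `Option<i64>` orders `None` before `Some(_)`, and `None == None`.
        NullEquality::NullEqualsNull => {
            Box::new(move |i: usize, j: usize| left[i].cmp(&right[j]))
        }
        // Both-null pairs are overridden to `Less`, so they never compare equal.
        NullEquality::NullEqualsNothing => {
            Box::new(move |i: usize, j: usize| match (left[i], right[j]) {
                (None, None) => Ordering::Less,
                (l, r) => l.cmp(&r),
            })
        }
    }
}

/// Sketch of a multi-column join key comparator.
pub struct JoinKeyComparator {
    comparators: Vec<DynCmp>,
}

impl JoinKeyComparator {
    /// Takes one (left, right) column pair per join key.
    pub fn new(
        columns: Vec<(Vec<Option<i64>>, Vec<Option<i64>>)>,
        mode: NullEquality,
    ) -> Self {
        let comparators = columns
            .into_iter()
            .map(|(l, r)| column_cmp(l, r, mode))
            .collect();
        Self { comparators }
    }

    /// Lexicographic comparison: the first non-equal column decides.
    pub fn compare(&self, i: usize, j: usize) -> Ordering {
        for cmp in &self.comparators {
            match cmp(i, j) {
                Ordering::Equal => continue,
                other => return other,
            }
        }
        Ordering::Equal
    }

    /// True only if every key column compares equal.
    pub fn is_equal(&self, i: usize, j: usize) -> bool {
        self.comparators.iter().all(|cmp| cmp(i, j) == Ordering::Equal)
    }
}
```

Because the both-null override lives inside each closure, `compare` and `is_equal` in this sketch contain no type or null-mode dispatch of their own.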
Integrated into all hot-path call sites:
- `materializing_stream.rs`: `streamed_buffered_cmp` (streamed vs
buffered) and `buffered_equality_cmp` (head vs tail equality)
- `bitwise_stream.rs`: `outer_inner_cmp`, `outer_self_cmp`,
`inner_self_cmp`; simplified `find_key_group_end` signature (takes
`&JoinKeyComparator`, returns `usize` instead of `Result<usize>` since
type errors are now caught at construction)
- `piecewise_merge_join/classic_join.rs`: single comparator built per
batch pair
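
The `find_key_group_end` signature change follows from moving the fallible step to construction. A minimal std-only sketch of that idea is below; the parameter list, `KeyComparator`, and `Column` are hypothetical and do not reproduce the actual DataFusion definitions.

```rust
use std::cmp::Ordering;

/// Toy stand-in for a dynamically typed key column (hypothetical).
pub enum Column {
    Int(Vec<i64>),
    Text(Vec<String>),
}

pub struct KeyComparator {
    cmp: Box<dyn Fn(usize, usize) -> Ordering>,
}

impl KeyComparator {
    /// Fallible step: a type mismatch is reported here, once.
    pub fn try_new(left: Column, right: Column) -> Result<Self, String> {
        let cmp: Box<dyn Fn(usize, usize) -> Ordering> = match (left, right) {
            (Column::Int(a), Column::Int(b)) => {
                Box::new(move |i: usize, j: usize| a[i].cmp(&b[j]))
            }
            (Column::Text(a), Column::Text(b)) => {
                Box::new(move |i: usize, j: usize| a[i].cmp(&b[j]))
            }
            _ => return Err("join key types differ".to_string()),
        };
        Ok(Self { cmp })
    }

    pub fn is_equal(&self, i: usize, j: usize) -> bool {
        (self.cmp)(i, j) == Ordering::Equal
    }
}

/// Returns the exclusive end of the run of rows whose key equals row `start`.
/// Infallible: the only error source was handled in `try_new`.
pub fn find_key_group_end(cmp: &KeyComparator, start: usize, len: usize) -> usize {
    ((start + 1)..len)
        .find(|&row| !cmp.is_equal(start, row))
        .unwrap_or(len)
}
```

For a self-comparison such as this one, the comparator is built with the same key column on both sides, and callers can then use the returned index directly without a `?` on every call.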
`compare_join_arrays` is kept for the one-off `keys_match` call (once
per batch boundary).
Deleted `is_join_arrays_equal` (75-line per-row type dispatch function),
replaced by `JoinKeyComparator::is_equal`.
## Are these changes tested?
- 4 unit tests for `JoinKeyComparator`: multi-column mixed types,
`NullEqualsNull`, `NullEqualsNothing`, `nulls_first` ordering
- Existing SMJ test suites pass
- Existing sqllogictest join tests pass
## Are there any user-facing changes?
No.

1 parent 44af0a1 commit e8d217a
5 files changed, +393 -181 lines changed:
- datafusion
  - core/tests/memory_limit
  - physical-plan/src/joins
    - piecewise_merge_join
    - sort_merge_join