Skip to content

fix: int64 precision loss in sort comparators and per-comparison branching#4090

Open
henryiii wants to merge 4 commits into
mainfrom
henryiii/fix-sort-comparators
Open

fix: int64 precision loss in sort comparators and per-comparison branching#4090
henryiii wants to merge 4 commits into
mainfrom
henryiii/fix-sort-comparators

Conversation

@henryiii

@henryiii henryiii commented Jun 10, 2026

Copy link
Copy Markdown
Member

🤖 AI text below 🤖

Summary

Three correctness and performance fixes in `awkward-cpp/src/cpu-kernels/`:

  • `awkward_sort.cpp` and `awkward_argsort.cpp` (lines 12-22): The `sort_order_ascending`/`sort_order_descending` and `argsort_order_ascending`/`argsort_order_descending` templates cast both operands to `double` and call `std::isnan` for every comparison, even for integer types. This is (a) a correctness bug for `int64`/`uint64` values beyond 2^53 — double rounding can compare distinct integers as equal, breaking strict-weak-ordering — and (b) two int→double casts plus `isnan` overhead per comparison. Fixed by adding `if constexpr (std::is_integral_v) { return l < r; }` (resp. `>`), preserving the NaN-aware path only for floating-point types.

  • Explicit `template<>` bool specializations (end of each file): Were declared after the implicit instantiations that use them, which is ill-formed per [temp.expl.spec] (no diagnostic required). Moved above the first use, with forward declarations for the primary templates.

  • `awkward_argsort.cpp` comparator lambda (lines 42-51): Branched on `ascending` inside every comparator call — O(n log n) evaluations. Restructured to hoist the `ascending`/`stable` dispatch outside the sort call, matching the existing structure of `awkward_sort.cpp`.

Note: `awkward-cpp` version bump will happen at release time per project convention.

Changes

  • `awkward-cpp/src/cpu-kernels/awkward_sort.cpp`
  • `awkward-cpp/src/cpu-kernels/awkward_argsort.cpp`
  • `tests/test_4090_sort_int64_precision.py` — 5 regression tests

Test plan

  • Regression tests in `tests/test_4090_sort_int64_precision.py` (5 passing)
    • `ak.sort` on `int64` values > 2^53 matches NumPy (ascending)
    • `ak.argsort` on `int64` values > 2^53 matches NumPy (ascending)
    • `ak.sort` on `uint64` values > 2^53 matches NumPy
    • Float NaN handling unchanged (NaNs sort last)
    • Bool sort correct (ascending)
  • `prek -a --quiet` passes (no lint issues)

AI assistance

This PR was generated with Claude Code (automated multi-agent), claude-sonnet-4-6. Per CONTRIBUTING.md, AI assistance is disclosed here.

🤖 Generated with Claude Code

henryiii added a commit that referenced this pull request Jun 10, 2026
Tests that ak.sort and ak.argsort on int64/uint64 values > 2^53 match
numpy exactly, verifying the double-cast precision bug is fixed.
Also covers NaN float handling (unchanged), bool sort, descending order,
stable sort, and multiple sublists.

Assisted-by: ClaudeCode:claude-sonnet-4-6
@ianna

ianna commented Jun 10, 2026

Copy link
Copy Markdown
Member

@henryiii — let’s discuss the merge order for kernels at the meeting tomorrow

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.32%. Comparing base (712dac0) to head (60c6f74).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@ianna ianna left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@henryiii - thanks for submitting it! This is addressed in #4056 and since the API has changed and as well as the kernels structure I would suggest to postpone merging this one. Thanks!

henryiii and others added 4 commits June 11, 2026 15:40
…ching

- Use `if constexpr (std::is_integral_v<T>)` in sort_order_ascending/
  descending and argsort_order_ascending/descending so integer types
  compare directly without double-casting (fixes precision loss for
  int64/uint64 values > 2^53, and eliminates two casts + isnan per
  comparison for integral types).
- Move explicit `template<>` bool specializations above the implicit
  instantiations that use them, fixing ill-formed translation units
  per [temp.expl.spec].
- Hoist ascending/stable dispatch outside the comparator lambda in
  awkward_argsort, matching the structure of awkward_sort so the
  branch is evaluated once per segment rather than once per comparison.

Assisted-by: ClaudeCode:claude-sonnet-4-6
Tests that ak.sort and ak.argsort on int64/uint64 values > 2^53 match
numpy exactly, verifying the double-cast precision bug is fixed.
Also covers NaN float handling (unchanged), bool sort, descending order,
stable sort, and multiple sublists.

Assisted-by: ClaudeCode:claude-sonnet-4-6
- Remove descriptive comments in C++ that restate what code does; keep
  the non-obvious constraint comment about specialization ordering.
- Trim 13 regression tests to 5: one per distinct concern (int64 sort,
  int64 argsort, uint64 sort, float NaN unchanged, bool specialization).
- Rename test file to use underscores to pass validate-test-names hook.

Assisted-by: ClaudeCode:claude-sonnet-4-6
@henryiii henryiii force-pushed the henryiii/fix-sort-comparators branch from 9e85a02 to 60c6f74 Compare June 11, 2026 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants