Fix jaccard scaling#700
Conversation
Signed-off-by: Intron7 <severin.dicks@icloud.com>
Signed-off-by: Intron7 <severin.dicks@icloud.com>
|
@coderabbitai review |
✅ Action performedReview finished.
|
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a CUDA ChangesJaccard CUDA path
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/rapids_singlecell/preprocessing/_neighbors/_neighbors.py (1)
234-247: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winImport the module through
rapids_singlecell._cudato keep the lazy-loading contract.
from rapids_singlecell._cuda._jaccard_cuda import jaccard_shared_countsimports the compiled submodule directly, bypassing the__getattr__lazy loader in_cuda/__init__.pythat maps "absent → None" and "present-but-failed → contextual ImportError". Prefer the package-level import so behavior stays consistent with the rest of the codebase.♻️ Suggested import style
- from rapids_singlecell._cuda._jaccard_cuda import jaccard_shared_counts + from rapids_singlecell._cuda import _jaccard_cuda as _jaccard @@ - jaccard_shared_counts( + _jaccard.jaccard_shared_counts( knn, n_obs=n_obs, k=n_neighbors, jaccard_vals=jaccard_vals, stream=cp.cuda.get_current_stream().ptr, )As per coding guidelines: "Import
_cudamodules viarapids_singlecell._cuda."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rapids_singlecell/preprocessing/_neighbors/_neighbors.py` around lines 234 - 247, The Jaccard CUDA import in the neighbors preprocessing path bypasses the `_cuda` package lazy-loading contract. Update the import in the code that calls `jaccard_shared_counts` to go through `rapids_singlecell._cuda` instead of importing `._jaccard_cuda` directly, so `_cuda.__getattr__` continues to control “absent vs failed import” behavior consistently across the codebase.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rapids_singlecell/_cuda/jaccard/jaccard.cu`:
- Around line 10-19: The jaccard shared-counts launch currently trusts n_obs and
k without validating that knn actually has n_obs * k elements, which can lead to
out-of-bounds GPU reads. Add a pre-launch size check in
launch_jaccard_shared_counts and the caller/binding that passes knn into
jaccard_shared_counts_kernel, using the existing n_obs/k contract to verify the
array length before launching. If the size does not match, fail fast with a
clear validation error rather than invoking the kernel.
In `@src/rapids_singlecell/_cuda/jaccard/kernels_jaccard.cuh`:
- Around line 33-34: The Jaccard kernel in the computation that writes to
jaccard_vals needs a zero-denominator guard and the stray empty statement should
be removed. Update the division logic so the result is only computed when the
denominator from the k/c terms is nonzero, and otherwise write a safe zero value
to keep NaN from propagating into the later mask and COO output. Make the fix in
the kernel body around jaccard_vals and keep the arithmetic path numerically
stable.
---
Nitpick comments:
In `@src/rapids_singlecell/preprocessing/_neighbors/_neighbors.py`:
- Around line 234-247: The Jaccard CUDA import in the neighbors preprocessing
path bypasses the `_cuda` package lazy-loading contract. Update the import in
the code that calls `jaccard_shared_counts` to go through
`rapids_singlecell._cuda` instead of importing `._jaccard_cuda` directly, so
`_cuda.__getattr__` continues to control “absent vs failed import” behavior
consistently across the codebase.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b86a86c0-4aeb-4dd1-be39-70e4edb49285
📒 Files selected for processing (6)
CMakeLists.txtdocs/release-notes/0.16.0.mdsrc/rapids_singlecell/_cuda/__init__.pysrc/rapids_singlecell/_cuda/jaccard/jaccard.cusrc/rapids_singlecell/_cuda/jaccard/kernels_jaccard.cuhsrc/rapids_singlecell/preprocessing/_neighbors/_neighbors.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #700 +/- ##
==========================================
- Coverage 90.31% 90.30% -0.01%
==========================================
Files 104 104
Lines 9570 9565 -5
==========================================
- Hits 8643 8638 -5
Misses 927 927
|
Signed-off-by: Intron7 <severin.dicks@icloud.com>
No description provided.