Fix: float[] to float in legacy bandwidth#918
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #918 +/- ##
=======================================
+ Coverage 85.6% 85.9% +0.3%
=======================================
Files 151 151
Lines 16310 16524 +214
=======================================
+ Hits 13967 14192 +225
+ Misses 2343 2332 -11
🚀 New features to boost your workflow:
|
|
We'll need an example that reproduces the issue and which can be used in tests. If this is fixing some behaviour, a test should ensure that this is not reverted back in future. |
|
We could, instead, refactor the upstream kernel functions to take either a scalar or array and do the right thing. This way you could keep the original vectorized implementation. |
|
that also looks jittable? |
+1 on that. Which kernel causes trouble? I am still waiting for the reproducer :) |
|
@martinfleis , here is the explanation of the issue and how we can address it: i searched and thought about possible failure and the failure occurs when a user passes a custom user-defined callable as the def safe_custom_kernel(distances, bw):
# Prevents division by zero for co-located points
if bw < 0.001:
bw = 0.001
return np.exp(-distances / bw)a test for this, could be, for example: on testing with the vectorized apporach:
|
was thinking the exact same.. |
|
If it is only the case of custom callable, I think we should document that it should expect array input and behave correctly. |



In my last merged Pull Request #905:
on discussion with @sjsrey , here, it was passing bandwidth as an entire array of floats (float[]) directly into the kernel functions all at once for the entire sparse matrix. This caused issues because some custom kernel functions (or internal ones) expect bandwidth to be a single scalar float, which might lead to broadcasting errors.
Now, the code has been refactored to iterate through the data observation-by-observation (row-by-row):
i'm aware that numpy vectorization is definitely going to be faster, yet this change it's a necessary trade-off for correctness and API compliance. By iterating row by row, the code can pass a single scalar float to the kernel function.
So it completely resolves the issue by ensuring the kernel functions always receive a single float bandwidth, even when we are using adaptive (variable) bandwidths across the datasets..
I wrote a quick script to test both versions with a dataset of 100,000 points and a relatively large neighbor count ($k=50$ ), simulating a moderately intensive spatial operation.
Here are the benchmark results:
Older Code (Vectorized, but buggy): 11.2725 seconds
New Code (Iterative, fixed): 11.5369 seconds
it's only a 0.26 seconds tradeoff, i believe it's feasible to ensure api compliance.