Consider this simple example:
In [1]: import numpy as np
In [2]: import bottleneck as bn
In [3]: data = 2e5*np.random.rand(int(4e7)).astype('float32')
In [4]: np.nansum(data)
Out[4]: 4000034300000.0
In [5]: bn.nansum(data)
Out[5]: 3719060258816.0
Looks like rounding errors are compounding due to loss of float32 precision, since the discrepancy becomes much less apparent for smaller datasets. Repeating the above with the float64 dtype gives much more consistent results:
In [6]: bn.nansum(data.astype('float64'))
Out[6]: 4000035580557.9033
In [7]: np.nansum(data.astype('float64'))
Out[7]: 4000035580557.979
I tested this example with both bottleneck 1.1.0 and 1.2.1.
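For context, here is my understanding of the mechanism (not taken from the bottleneck source, so treat it as an assumption): `np.sum`/`np.nansum` uses pairwise summation, which keeps partial sums small and the rounding error growing roughly with log n, while bottleneck presumably accumulates left-to-right into a single float32 total, where the error grows with n and the total can even stop growing entirely once it dwarfs the addends. A minimal sketch, using `np.cumsum` to force sequential accumulation:

```python
import numpy as np

# float32 has a 24-bit significand, so once a running total reaches
# 2**24, adding 1.0 no longer changes it: a single float32
# accumulator silently stops growing.
stuck = np.float32(2**24)          # 16777216.0
assert stuck + np.float32(1.0) == stuck

data = np.ones(2**24 + 1000, dtype='float32')
exact_total = float(2**24 + 1000)

# cumsum must accumulate left-to-right, mimicking a naive
# single-accumulator loop (assumed to be what bottleneck does).
naive_total = float(np.cumsum(data)[-1])

# np.nansum uses pairwise summation, so partial sums stay small.
pairwise_total = float(np.nansum(data))

print(naive_total, pairwise_total, exact_total)
```

The sequential total gets stuck at 16777216.0 and drops the last 1000 additions entirely, while the pairwise total lands on the exact answer; the same effect, with larger addends, explains the ~7% gap in the 4e7-element example above.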