feat: add `illico` for `rank_genes_groups` by ilan-gold · Pull Request #4038 · scverse/scanpy

ilan-gold · 2026-04-07T14:44:25Z

TODOs:

See: https://github.com/scverse/scanpy/actions/runs/24088645078/job/70268566419?pr=4038

Investigate seemingly mismatched ovo "column" in scores (cc @remydubois)
Maybe get a compat layer for the 1e-9 helper in illico (see the filterwarnings on the new test)
Investigate why ovr results are so far off
Once our current implementation (along with the new exp_post_agg feature, "feat: allow exponentiation post agg for log-fold-change in rank_genes_groups" #4037) matches illico, or all differences have been documented, we will add the illico backend

Closes Integration with illico #4012
Tests included or not required because:

Release notes not necessary because:

codecov · 2026-04-07T14:45:33Z

Codecov Report

❌ Patch coverage is 94.44444% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 79.89%. Comparing base (ecec3c2) to head (0940df2).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/scanpy/tools/_rank_genes_groups.py	94.11%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4038      +/-   ##
==========================================
+ Coverage   79.85%   79.89%   +0.03%     
==========================================
  Files         121      121              
  Lines       12924    12938      +14     
==========================================
+ Hits        10321    10337      +16     
+ Misses       2603     2601       -2

Flag	Coverage Δ
hatch-test.low-vers	`79.02% <94.44%> (+0.04%)`	⬆️
hatch-test.pre	`79.84% <94.44%> (+0.02%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
src/scanpy/_settings/presets.py	`90.84% <100.00%> (ø)`
src/scanpy/tools/_rank_genes_groups.py	`94.11% <94.11%> (+0.78%)`	⬆️

remydubois · 2026-04-07T17:16:47Z

Hey, good catch on the OVO column ordering. I have no idea where the bug comes from; it seems to be related to the PBMC group labels, which numpy sorts in a weird way. I will try to get a fix in the coming days.

remydubois · 2026-04-08T22:25:44Z

So it seems to be due to np.argsort and np.sort not sorting the weird characters (" ", "+", "/", "#", etc) of the bulk_labels the same way. I will ship a fix sometime today or this weekend.

NB: I spent a bit of time looking into the PBMC dataset and it seems rather unusual: a lot of values are negative, leading to undefined logfoldchanges, and on top of that there seems to be very little value diversity in the dataset. As a result, a lot of genes end up with identical ranksums, but because illico and scanpy do not compute the z-score in exactly the same way (mathematically equivalent but programmatically different), gene ordering is impacted. The gene ordering affects the position of NaN values in the results, but non-NaN values do match because they are very similar. I am not sure this dataset is a good test case.

NB2: details of the programmatic differences:
illico does:

U = ranksum - n_tgt * (n_tgt + 1) / 2
z = U - n_ref * n_tgt / 2

scanpy does:

z = ranksum - n_tgt * (n_tgt + 1 + n_ref) / 2

Those are mathematically equivalent but result in differences of the order of 1.e-9 approximately, changing gene orders when ranksums are equal.
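
The two formulations can be compared side by side. The following is a minimal sketch, not code from either library: the function names and the sample rank sums are made up for illustration, and only the centred statistic from the formulas above is computed (no division by a standard deviation).

```python
import numpy as np


def z_numerator_two_step(ranksum, n_tgt, n_ref):
    """Two-step form: compute U first, then centre it."""
    u = ranksum - n_tgt * (n_tgt + 1) / 2
    return u - n_ref * n_tgt / 2


def z_numerator_one_step(ranksum, n_tgt, n_ref):
    """One-step form: subtract the expected rank sum directly."""
    return ranksum - n_tgt * (n_tgt + 1 + n_ref) / 2


# Hypothetical rank sums for four genes; the first two are tied.
ranksum = np.array([1234.5, 1234.5, 980.0, 1500.25])
n_tgt, n_ref = 40, 60

a = z_numerator_two_step(ranksum, n_tgt, n_ref)
b = z_numerator_one_step(ranksum, n_tgt, n_ref)

# Largest absolute disagreement between the two code paths.
max_abs_diff = float(np.max(np.abs(a - b)))
print(max_abs_diff)
```

Whether the two paths round differently depends on the dtype and on the operations that follow. Any tiny discrepancy only matters when it breaks a tie between genes with equal rank sums.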

ilan-gold · 2026-04-09T13:45:47Z

I am not sure this dataset is a good test case.

Interesting, good observation but glad we are aware of this now. This could be another target for scanpy 2.0.

I will try a different dataset but glad to have this documented.

remydubois · 2026-04-09T21:27:26Z

Forget my previous message which was actually kind of off topic.

This PBMC dataset was actually a very good test case.

1. It unveiled a silent bug in illico: it seems np.argsort and np.sort do not sort the unusual characters (" ", "+", "/", "#", etc.) of the bulk_labels the same way. I will ship a fix in the next few days.
2. Because the data has very limited diversity (the raw values in .X are not very diverse), a lot of genes end up with identical ranksums (see the example below) and hence identical z-scores.
3. The sorting method used by scanpy is more elaborate, as it allows the user to select only the top n genes. As a result, it does not sort identical values in the same order as illico (even if all genes are returned). That explains the gene ordering difference that I was still facing after fixing the name sorting issue described in 1. I believe I can add the functionality to return only n_genes like scanpy does, and implement the same sorting routine as scanpy, which appears to solve that issue.
4. Not related to PBMC, but z-scores also mismatched because scanpy casts them to float32 and illico keeps them as float64. I will fix that in the next patch as well.
5. For the 1.e-9 adjustment, I don't know what the best way to go is. Currently, illico handles this with np.where(mu_ref == 0, np.inf, mu_tgt / mu_ref), but I'm quite open to changing what's currently in illico if this 1e-9 offset is the de facto standard in other software such as Seurat.

Example: let's take CD14+ Monocyte as control group, CD34+ as perturbed group, and look at the genes CDK6 and TCEAL8. Although they don't have exactly the same values, both CDK6 and TCEAL8 have equal ranksums, hence equal z-scores. Due to the difference in sorting methodology, they end up in a different order in illico and scanpy.
From the next release, the test suite in illico will not only test closeness of the values but also explicitly test matching ordering of the genes.
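
The effect of ties on ordering can be illustrated with plain NumPy. This is a minimal sketch under stated assumptions: the scores are invented, and the two strategies shown (a full sort versus a top-n selection followed by a sort) are generic stand-ins, not the actual routines used by illico or scanpy.

```python
import numpy as np

# Hypothetical z-scores for five genes; genes 1 and 3 are tied.
scores = np.array([0.5, 2.0, 1.0, 2.0, -1.0])
n_top = 3

# Strategy A: full sort, then reverse to get descending order.
full_order = np.argsort(scores)[::-1][:n_top]

# Strategy B: pick the top n with argpartition, then sort only those.
candidates = np.argpartition(scores, -n_top)[-n_top:]
partial_order = candidates[np.argsort(scores[candidates])[::-1]]

# The selected scores agree; the indices of the tied genes need not.
print(scores[full_order], scores[partial_order])
print(full_order, partial_order)
```

Both strategies return the same set of genes and the same score values. Only the relative position of the tied genes is left to the sorting algorithm, which is why a test on values alone cannot detect an ordering mismatch.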

NB: what is the situation with the Dask issue ? Current version is not dask-compatible.

ilan-gold · 2026-04-10T09:17:08Z

NB: what is the situation with the Dask issue ? Current version is not dask-compatible.

That should come next, I want to keep this scoped for now to the in-memory stuff.

Another thing if you're doing a batch of fixes this weekend - supporting cs{r,c}_array would be awesome. Should only be one line of code.

Also, for your numba types, and only if you want it instead of named tuples: FAU has a plug-in for passing cs{c,r}_{array,matrix} into numba kernels: https://github.com/scverse/fast-array-utils/blob/main/src/fast_array_utils/_plugins/numba_sparse.py. But it might not be worth taking on the dependency just for this purpose.

ilan-gold · 2026-04-10T09:37:41Z

For the 1.e-9 adjustment, I don't know what the best way to go is. Currently, illico handles this with np.where(mu_ref == 0, np.inf, mu_tgt / mu_ref), but I'm quite open to changing what's currently in illico if this 1e-9 offset is the de facto standard in other software such as Seurat.

I'm seeing some instances of https://github.com/satijalab/seurat/blob/main/R/differential_expression.R#L1089 i.e., they calculate the mean expression with this "+1" offset so don't need the correction.

So we have three different things here, I guess. I think just giving the option to match scanpy (since that seems a project goal) is good, and then we can maybe revisit what seurat does as a new parameter for scanpy 2.0.
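
The three conventions mentioned here behave differently when a group mean is zero. The sketch below uses invented means; variant 1 is the np.where expression quoted above, while variants 2 and 3 are simplified assumptions about the 1e-9 offset and the "+1" pseudocount, not copies of scanpy or Seurat code.

```python
import numpy as np

# Hypothetical per-gene mean expression in the target and reference groups.
mu_tgt = np.array([2.0, 0.0, 3.0])
mu_ref = np.array([1.0, 0.0, 0.0])

# Variant 1: explicit handling of a zero reference mean.
with np.errstate(divide="ignore", invalid="ignore"):
    fc_where = np.where(mu_ref == 0, np.inf, mu_tgt / mu_ref)

# Variant 2: a small 1e-9 offset on numerator and denominator (assumed form).
fc_eps = (mu_tgt + 1e-9) / (mu_ref + 1e-9)

# Variant 3: a pseudocount of 1 added to each mean (assumed form).
fc_pseudo = (mu_tgt + 1.0) / (mu_ref + 1.0)

print(fc_where)
print(fc_eps)
print(fc_pseudo)
```

The second gene, with a zero mean in both groups, is where the conventions disagree most: variant 1 reports an infinite fold change, while variants 2 and 3 report a fold change of 1.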

Not related to PBMC, but z-scores also mismatched because scanpy casts them to float32 and illico keeps them as float64. I will fix that in the next patch as well.

For this, I would be open to also making this part of a scanpy 2.0 set of defaults, i.e., using float64, as there are other places we might benefit from this. So if you were to keep float64, I don't think that would be a blocker. Again, an option could be good so we can smooth out the transition, although this is so small as to maybe not warrant it.
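
The size of the float32 effect is easy to check. A minimal sketch with invented z-scores, showing why a comparison at an absolute tolerance of 1e-9 fails once one side has been through float32:

```python
import numpy as np

# Hypothetical z-scores computed in double precision.
z64 = np.array([0.1, 3.3, -7.25], dtype=np.float64)

# The same scores after a cast to float32.
z32 = z64.astype(np.float32)
abs_err = np.abs(z64 - z32.astype(np.float64))

print(abs_err.max())
# Strict absolute tolerance versus a relative tolerance suited to float32.
print(np.allclose(z64, z32, rtol=0, atol=1e-9))
print(np.allclose(z64, z32, rtol=1e-6, atol=0))
```

The cast error scales with the magnitude of the score, so a relative tolerance is the more natural choice when one of the two results is stored as float32.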

Overall, my ideal scenario would be that illico and scanpy with wilcoxon match, and then we use illico by default in scanpy 2.0. I'm pushing to minimize result changes to make the transition smoother (since changing results is never fun), and so far the changes don't seem particularly destructive.

Short of exact matches, like I said we can just document changes. It sounds like the biggest things that would close the gap are:

argsort bug
1e-9 option
The option to return n_genes
cs{r,c}_array support

I think we can patch the float{32,64} stuff in a general "numerical accuracy" scanpy 2.0 preset (which we have on main right now).

Overall, this is really cool and I'm super happy that things seem to be pretty close!

remydubois · 2026-04-12T18:25:44Z

Hey,

I just released version 0.5.0rc1 which should fix the issues identified on this test case:

Perturbation (or group) names are no longer re-sorted when outputs are formatted for scanpy, ensuring they are ordered the same everywhere.
Fold change is now computed by adding the same 1.e-9 offset, on top of being accumulated into an f64 placeholder (regardless of the original data dtype), ensuring a better match.
New argument n_genes, which returns only the top n DE genes per perturbation. Even if it is not specified, the gene sorting methodology is now identical, which solves the issue raised by genes having identical scores.
Support for cs[cr]_array. I did not explicitly add test cases for those, as the test suite is already quite heavy and illico does nothing more than access .data, .indices, and .indptr of those objects. I quickly tested it manually and it ran with no issue.
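
The point about only touching .data, .indices and .indptr can be checked directly with SciPy. A minimal sketch; the helper name csr_components is hypothetical and is not part of illico:

```python
import numpy as np
from scipy import sparse


def csr_components(x):
    """Return the raw CSR buffers, whatever the container class."""
    if x.format != "csr":
        raise ValueError("expected a CSR container")
    return x.data, x.indices, x.indptr


dense = np.array([[0.0, 1.0, 0.0], [2.0, 0.0, 3.0]])
as_matrix = sparse.csr_matrix(dense)
as_array = sparse.csr_array(dense)

# Both container classes expose the same three buffers.
for a, b in zip(csr_components(as_matrix), csr_components(as_array)):
    assert np.array_equal(a, b)
```

Code that only reads these three attributes does not need to distinguish csr_matrix from csr_array, which is consistent with the change being small.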

Regarding the numerical precision question: so far I force the recarrays to have the same dtype as what's currently implemented in scanpy, see FC for instance. I believe we could change that when the need comes as you say.

Testing locally, it seems everything now runs smoothly for PBMC. Let me know how it goes with the rest of the CI.

ilan-gold · 2026-04-13T13:41:19Z

@remydubois I think the issue is that you are sorting / cutting off by n_top genes globally but scanpy does this per group, but that is just a guess.
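
The difference between a global cutoff and a per-group cutoff, as guessed at above, can be sketched with a small score matrix. All values are invented, and neither selection is claimed to be what illico or scanpy actually does:

```python
import numpy as np

# Hypothetical scores: rows are groups, columns are genes.
scores = np.array([
    [5.0, 1.0, 4.0, 0.5],
    [0.2, 0.9, 0.1, 0.3],
])
n_top = 2

# Per-group selection: top n gene indices within each row.
per_group = np.argsort(scores, axis=1)[:, ::-1][:, :n_top]

# Global selection: top n (group, gene) pairs over the whole matrix.
flat = np.argsort(scores, axis=None)[::-1][:n_top]
global_pairs = np.column_stack(np.unravel_index(flat, scores.shape))

print(per_group)
print(global_pairs)
```

With a global cutoff, a group whose scores are uniformly low (the second row here) can drop out of the result entirely, whereas a per-group cutoff always returns n genes for every group.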

In any case, I think I got a little lost-in-the-sauce. I think matching p-values and scores should be enough because scanpy can internally handle LFC when it wraps illico. I just pushed a commit with this change. Of course, if you want to match LFC, no argument from me, but I realized it's probably not essential for internal consistency.

I'm pushing the results of scores and pvals_adj which all match except the two cases highlighted in the tests.

With this in mind, I am going to push ahead with integrating illico. Sorry for the churn on the log fold changes, but I think we can just let scanpy do it as long as I can get scores/pvals out of illico.

BTW: https://remydubois.github.io/illico/api.html looks both incomplete and like it's leaking internals.

ilan-gold · 2026-04-13T16:07:06Z

Ok everything passing locally with just getting P-value and z-score from illico. This is looking good now to me at least conceptually. Last thing would be to make sure mean-var calculation is fast since we won't rely on illico (at least for now) to do logfc, but I have something for that in #4041.

I assume the LFC calculation (i.e., actually calculating the means) is the less computationally intensive part?

remydubois · 2026-04-13T18:26:11Z

With this in mind, I am going to push ahead with integrating illico

Ok ! That's good news to me.

Because this was bugging me a little: I checked out your ig/illico branch and indeed reproduced the test failure, until I added pbmc = ad.AnnData(pbmc.X.copy(), obs=pbmc.obs.copy(), var=pbmc.var.copy()) (to make sure pbmc gets stripped of all of its metadata and attributes):

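import anndata as ad
from scanpy.datasets import pbmc68k_reduced  # assumed import; scanpy's test suite may use its own helper


def test_illico(test, corr_method, exp_post_agg):
    from illico.asymptotic_wilcoxon import asymptotic_wilcoxon

    pbmc = pbmc68k_reduced()
    # Rebuild the object from X, obs and var only, dropping all other metadata
    pbmc = ad.AnnData(pbmc.X.copy(), obs=pbmc.obs.copy(), var=pbmc.var.copy())
    # ... Rest of the test. They all pass on my machine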

Then, all tests passed (LFC, pvals, scores, pvals_adj, up to atol=1.e-9), which I still consider good news, even bearing in mind that LFC will remain scanpy-computed (non-matching LFC might mean non-matching names, which would have been a real issue). The reason my "tests" passed locally for PBMC before shipping 0.5.0rc1 was exactly that I was stripping it entirely before running and comparing outputs, just to be sure.

By the way, wouldn't it be safer for scanpy's test suite to add an explicit check on the ordering of the genes, like I did here (which did fail, before stripping pbmc) ?
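
An explicit ordering check of the kind suggested here could look like the following. This is a sketch only: the helper name assert_same_gene_order is hypothetical, and the inputs are assumed to be structured arrays with one field per group, holding gene names in rank order.

```python
import numpy as np


def assert_same_gene_order(names_a, names_b):
    """Fail if any group's ranked gene names differ in content or order."""
    assert names_a.dtype.names == names_b.dtype.names, "group labels differ"
    for group in names_a.dtype.names:
        np.testing.assert_array_equal(
            names_a[group],
            names_b[group],
            err_msg=f"gene order differs for {group!r}",
        )


# Hypothetical results for two groups, one field per group.
dtype = [("CD34+", "U10"), ("CD14+ Monocyte", "U10")]
ref = np.array([("CDK6", "LYZ"), ("TCEAL8", "S100A8")], dtype=dtype)
same = ref.copy()

# Same genes, but the two tied genes of one group are swapped.
swapped = ref.copy()
swapped["CD34+"] = ref["CD34+"][::-1]

assert_same_gene_order(ref, same)
```

Calling the helper on ref and swapped raises an AssertionError, even though a comparison of sorted score values alone would pass.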

This is looking good now to me at least conceptually

Very happy to read it's moving forward 🚀

I assume the LFC calculation (i.e., actually calculating the means) is the less computationally intensive part?

Yes, it's negligible; I could not measure its impact on the overall runtime.

ilan-gold · 2026-06-16T13:39:17Z

@remydubois Everything seems to pass with groups - I think exclude_from_ovr + dask should be follow-up PRs because both apply to all tests, not just wilcoxon.

remydubois · 2026-06-20T02:48:12Z

@ilan-gold good news !

I think exclude_from_ovr + dask should be follow-up PRs

That makes sense. Does it mean any specifics for illico releases (i.e: should I release/not release some features in the v0.6.0) ?

ilan-gold · 2026-06-22T10:08:57Z

That makes sense. Does it mean any specifics for illico releases (i.e: should I release/not release some features in the v0.6.0) ?

I think you should be pretty good releasing what you want - we can always catch up here e.g., when we do a proper assessment of dask.

flying-sheep

I have a few questions; the code looks great (apart from a stylistic nitpick)

flying-sheep · 2026-06-26T12:28:12Z

 dynamic = [ "version" ]
 dependencies = [
-    "anndata>=0.10.8",
+    "anndata>=0.11",


which feature does this PR need?

ilan-gold · 2026-06-29

This comes from illico but it seemed as good a time as any to require this. I will split this out into a separate PR though, fair point, there are probably version checks floating around our code base

remydubois · 2026-07-01

There is no reason for illico to require 0.11. I can lower the lower bound to 0.10.8 in the next release (v0.6.0).

ilan-gold · 2026-07-02

No worries, really, we should upgrade to 0.11 anyway for the next scanpy minor release. I'd rather keep the community moving on this front.

ilan-gold · 2026-07-02T15:20:01Z

@flying-sheep Re-requesting your review here because I want two things:

illico should be default for scanpy 2.0, so in the preview as well
wilcoxon_illico should not be kept as a parameter - wilcoxon should just become wilcoxon_illico

So to that end (see previous commit), if we are in preview-mode, I set it manually inside rank_genes_groups. Any thoughts on this? This could be a great carrot to get people to switch if we don't even have a public wilcoxon_illico parameter and force people to use scanpy v2 to get it.

flying-sheep · 2026-07-03T13:28:22Z

        method = settings.preset.rank_genes_groups.method
+        if settings.preset is Preset.ScanpyV2Preview:
+            method = "wilcoxon_illico"


Sorry for the tone of this comment, I’m not intending it to be as sassy as it reads lol

that makes no sense.

This configuration exists exactly for line 720 to do what you’re now doing in the two lines after.

scanpy/src/scanpy/_settings/presets.py

Lines 241 to 243 in ecec3c2

Preset.ScanpyV2Preview: RankGenesGroupsPreset(
    method="wilcoxon", mask_var=None, mean_in_log_space=False
),

So that configuration is now a lie and ignored because you hardcode the default here instead, why?

Re-requesting your review here

if you press the button, I’ll actually see that!

ilan-gold · 2026-07-03

Apologies, just forgot to push the button after the comment :)

My thinking goes that I do not want people using the wilcoxon_illico parameter and therefore don’t want it hardcoded as the scanpy 2.0 default in the preset.

I would start a deprecation cycle for wilcoxon_illico once 2.0 gets closer.

But I think this might just be too complicated and not worth the slightly-cleaner preset appearance.

We could just make the preset itself wilcoxon_illico and then deprecate people passing in wilcoxon_illico manually, instead pointing them to the preset.

Sorry for the confusion :/ This was sort of a bad middle ground between having wilcoxon_illico with no plan for it, and not having it at all, instead relying purely on the preset.

flying-sheep · 2026-07-03

Oof, I must have failed to read half of your comment above, sorry! I get it now!

But wouldn’t it be easier to just not have "wilcoxon_illico" at all then? Just check for if method == "wilcoxon" and settings.preset == Preset.ScanpyV2Preview?
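
The suggestion in this last comment amounts to choosing the backend from the pair (method, preset) rather than from a separate public method name. A self-contained sketch: the Preset enum below is a hypothetical stand-in for scanpy's own, its Default member is invented, and resolve_wilcoxon_backend is not an existing scanpy function.

```python
from enum import Enum, auto


class Preset(Enum):
    """Hypothetical stand-in for scanpy's preset enum."""

    Default = auto()
    ScanpyV2Preview = auto()


def resolve_wilcoxon_backend(method: str, preset: Preset) -> str:
    """Pick an implementation without exposing a new public method name."""
    if method == "wilcoxon" and preset is Preset.ScanpyV2Preview:
        return "illico"
    return "builtin"


print(resolve_wilcoxon_backend("wilcoxon", Preset.ScanpyV2Preview))
print(resolve_wilcoxon_backend("wilcoxon", Preset.Default))
```

With this shape the preset keeps method="wilcoxon" as its stated default, and the preview behaviour is selected by the preset alone, so there is no second method name to deprecate later.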

ilan-gold added 2 commits April 7, 2026 16:00

feat: allow exponentiation post agg for log-fold-change

c5591c8

feat: add illico

7aca4b3

ilan-gold changed the title ~~feat: add illico~~ feat: add illico for rank_genes_groups Apr 7, 2026

ilan-gold added 2 commits April 7, 2026 16:44

Merge branch 'main' into ig/exp_post_agg

c2f3738

Merge branch 'ig/exp_post_agg' into ig/illico

c80958a

ilan-gold added 6 commits April 7, 2026 16:51

fix: bump numba

72318fb

Merge branch 'ig/illico' of github.com:scverse/scanpy into ig/illico

97b4f7c

chore: probably not either

b9c8257

chore: now pandas

5394d2b

fix: anndata

8928dfd

fix: just stable then

897a646

ilan-gold added 3 commits April 13, 2026 11:38

fix: pin rc

af1f523

fix: agg name

40d5946

fix: only consider scores and pvals

74b6d87

chore: p values and z scores only

8352445

fix: point an low-vers safe version

1cad431

ilan-gold mentioned this pull request Apr 13, 2026

fix: weaken bounds remydubois/illico#15

Merged

ilan-gold added 2 commits April 13, 2026 19:13

Merge branch 'main' into ig/exp_post_agg

f0d78b4

Merge branch 'ig/exp_post_agg' into ig/illico

d9ad811

ilan-gold added 2 commits June 16, 2026 15:17

Merge branch 'ig/exp_post_agg' into ig/illico

5cc6647

feat: use groups argument

7ab15f9

Base automatically changed from ig/exp_post_agg to main June 16, 2026 13:31

Merge branch 'main' into ig/illico

3d19c95

ilan-gold marked this pull request as ready for review June 22, 2026 10:09

chore: p-value alteration

2363ee7

flying-sheep reviewed Jun 26, 2026

ilan-gold and others added 3 commits June 29, 2026 16:07

Update tests/test_rank_genes_groups.py

b522c71

Co-authored-by: Philipp A. <flying-sheep@web.de>

fix: address comments

dbbe64d

Merge branch 'main' into ig/illico

4216aad

ilan-gold added this to the 1.13.0 milestone Jun 29, 2026

ilan-gold added 3 commits June 30, 2026 11:19

test

98e3d71

Merge branch 'ig/illico' of github.com:scverse/scanpy into ig/illico

264c683

Merge branch 'main' into ig/illico

c99dd41

ilan-gold mentioned this pull request Jun 30, 2026

chore: anndata 0.11 as min #4191

Merged

3 tasks

ilan-gold added 4 commits July 1, 2026 13:05

Merge branch 'main' into ig/illico

0c33493

Merge branch 'main' into ig/illico

59b9119

Merge branch 'main' into ig/illico

192d3ea

feat: illico as default v2

4479fc0

ilan-gold added 5 commits July 2, 2026 17:20

fix: no illico 0.6.0

1492048

pin illico

e808e69

chore: relnote

ef18687

Merge branch 'main' into ig/illico

a623d40

intersphinx

0940df2

flying-sheep reviewed Jul 3, 2026

ilan-gold requested a review from flying-sheep July 3, 2026 14:13
