Fix Leiden multi-pair null-model normalisation to match Louvain (#111)#115
Open
arnaudon wants to merge 1 commit into
Open
Fix Leiden multi-pair null-model normalisation to match Louvain (#111)#115arnaudon wants to merge 1 commit into
arnaudon wants to merge 1 commit into
Conversation
For a null model with more than one pair (n_null >= 2), the Leiden
backend optimised edges - (1/n_null) * sum_k pair_k while the Louvain
backend optimised edges - sum_k pair_k. These agree only for n_null = 1,
so the two backends could in principle disagree on the argmax for
multi-pair constructors (signed_modularity, and linearized_directed with
alpha < 1).
Scale node_sizes by sqrt(n_null) before constructing CPMVertexPartition
in both _optimise and _evaluate_quality. The CPM null term is quadratic
in node_sizes, so each layer then contributes n_null * pair_k; after
Leiden's 1/n_null averaging this yields sum_k pair_k, matching Louvain.
Also fix a shape bug in constructor_signed_modularity: the sparse
.sum(1).flatten() returned a (1, n) np.matrix row, producing a
(4, 1, n) null model that crashed the Leiden path ("Node size vector
not the same size as the number of nodes"). Wrapping in np.asarray
yields the expected (4, n) shape. The underlying float buffer is
byte-identical and shape[0] is unchanged, so the Louvain path is
unaffected; the signed_modularity fixtures are regenerated to drop the
spurious extra dimension (values unchanged).
Add a regression test isolating the null-model contribution that
verifies the Leiden and Louvain objectives now agree for n_null = 2.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #111.
Problem
For a null model with more than one pair (
n_null = len(null_model)/2 >= 2), the Leiden and Louvain backends optimised different objectives:edges − (1/n_null)·Σₖ pairₖedges − Σₖ pairₖThese agree only for
n_null = 1. Forn_null >= 2they differ by a partition-dependent term, so the two backends could in principle disagree on the argmax. Affected constructors:signed_modularity(always 2 pairs) andlinearized_directedwithα < 1(after #110).Fix (Solution 1 from the issue)
Scale
node_sizesby√n_nullbefore constructingCPMVertexPartition, in both_optimiseand_evaluate_quality. The CPM null term is quadratic innode_sizes, so each layer then contributesn_null·pairₖ; after Leiden's1/n_nullaveraging this yieldsΣₖ pairₖ, matching Louvain.Prerequisite bug fix in
constructor_signed_modularityWhile verifying on
signed_modularity(the issue's motivating reproducer) I found the Leiden path was completely broken there:adj_pos.sum(1).flatten()on the sparse matrix returns a(1, n)np.matrixrow, producing a(4, 1, n)null model that crashed Leiden with "Node size vector not the same size as the number of nodes." Wrapping innp.asarray(...)— as the other constructors already do — yields the correct(4, n)shape.The underlying float buffer is byte-identical and
shape[0]is unchanged, so the Louvain path is unaffected. Thesigned_modularityfixtures are regenerated to drop the spurious extra dimension (values unchanged).Tests
test__evaluate_quality_leiden_multi_pair_normalisation, which isolates the null-model contribution (Q(nm1) − Q(nm2)with two identical symmetric pairs) and verifies the Leiden and Louvain objectives agree forn_null = 2. Confirmed it fails without the fix.Out of scope
A separate, deeper limitation remains (acknowledged in the issue): Leiden's CPM uses a single
node_sizesvector per layer (nᵢ·nⱼ), so it only exactly represents symmetric null pairs.signed_modularity's pairs are asymmetric, so absolute Leiden/Louvain values still differ there by a partition-dependent term. This PR fixes only the1/n_nullnormalisation factor that #111 targets.https://claude.ai/code/session_01JKbhASQT3f4MRAJbTfkCph
Generated by Claude Code