Problem
When a constructor returns a null model with more than one pair (n_null = len(null_model)/2 > 1), the Leiden and Louvain backends optimise different objectives.
In src/pygenstability/pygenstability.py:494-518 (_optimise(method=\"leiden\", ...)):
n_null = int(len(null_model) / 2)
for null in null_model[::2]:
partitions.append(
leidenalg.CPMVertexPartition(
G,
weights=quality_values, # SAME quality for every layer
node_sizes=null.tolist(), # different null pair per layer
correct_self_loops=True,
)
)
stability = sum(p.quality() for p in partitions) / n_null
stability += optimiser.optimise_partition_multiplex(
partitions, layer_weights=n_null * [1.0 / n_null]
)
Per-layer Leiden quality is edges(H) − pair_k(H) (CPM with the layer's node_sizes). The total objective is
(1/n_null) · Σ_k [edges(H) − pair_k(H)] = edges(H) − (1/n_null)·Σ_k pair_k(H)
The C++ Louvain backend (generalizedLouvain/CPP/cliques/stability_gen.h:42-48) computes
These agree iff n_null = 1. For n_null ≥ 2 they differ by (1 − 1/n_null) · Σ pair_k(H), which depends on the partition — so it's not just an absolute scale, the argmax can differ in principle.
Affected constructors
Empirically the existing signed_modularity users haven't reported a problem, so the partition argmax may be close in practice — but it's still a real semantic asymmetry between backends.
Options
-
Scale node_sizes by √n_null before constructing CPMVertexPartition. The CPM null-model contribution is quadratic in ns, so scaling by √n_null makes each layer contribute n_null · pair_k(H); after Leiden's 1/n_null averaging this yields Σ pair_k(H), matching Louvain.
- One-line change in
_optimise.
- Changes
signed_modularity numerics → needs fixture regeneration.
-
Use layer_weights=[1.0]*n_null and somehow subtract the duplicated edge contribution.
- Harder:
weights=quality_values is per-layer, so the same edge term appears n_null times. No clean way to remove the duplication without changing the per-layer quality semantics.
-
Document and leave. Status quo. Acknowledge that for n_null ≥ 2 the Leiden path optimises a slightly different objective than Louvain.
Context
Came up while implementing #110 (sparse linearized_directed for α<1, which adds a second null pair). I opted for (3) there to keep the PR focused on the memory fix, but worth deciding the right long-term answer separately.
Reproducer (qualitative): on signed_modularity with a small signed graph, run pygenstability.run(..., method=\"louvain\") and method=\"leiden\" with the same seeds and compare absolute stabilities — they will not match scale.
Problem
When a constructor returns a null model with more than one pair (
n_null = len(null_model)/2 > 1), the Leiden and Louvain backends optimise different objectives.In
src/pygenstability/pygenstability.py:494-518(_optimise(method=\"leiden\", ...)):Per-layer Leiden quality is
edges(H) − pair_k(H)(CPM with the layer'snode_sizes). The total objective isThe C++ Louvain backend (
generalizedLouvain/CPP/cliques/stability_gen.h:42-48) computesThese agree iff
n_null = 1. Forn_null ≥ 2they differ by(1 − 1/n_null) · Σ pair_k(H), which depends on the partition — so it's not just an absolute scale, the argmax can differ in principle.Affected constructors
signed_modularity(always 2 pairs) — pre-existing.linearized_directedafter linearized_directed: sparse path for alpha<1 (issue #69) #110 (1 pair whenα=1, 2 pairs whenα<1) — newly affected, and inconsistent across the same constructor depending onα.Empirically the existing
signed_modularityusers haven't reported a problem, so the partition argmax may be close in practice — but it's still a real semantic asymmetry between backends.Options
Scale
node_sizesby√n_nullbefore constructingCPMVertexPartition. The CPM null-model contribution is quadratic inns, so scaling by√n_nullmakes each layer contributen_null · pair_k(H); after Leiden's1/n_nullaveraging this yieldsΣ pair_k(H), matching Louvain._optimise.signed_modularitynumerics → needs fixture regeneration.Use
layer_weights=[1.0]*n_nulland somehow subtract the duplicated edge contribution.weights=quality_valuesis per-layer, so the same edge term appearsn_nulltimes. No clean way to remove the duplication without changing the per-layer quality semantics.Document and leave. Status quo. Acknowledge that for
n_null ≥ 2the Leiden path optimises a slightly different objective than Louvain.Context
Came up while implementing #110 (sparse
linearized_directedforα<1, which adds a second null pair). I opted for (3) there to keep the PR focused on the memory fix, but worth deciding the right long-term answer separately.Reproducer (qualitative): on
signed_modularitywith a small signed graph, runpygenstability.run(..., method=\"louvain\")andmethod=\"leiden\"with the same seeds and compare absolute stabilities — they will not match scale.