feat(clustering): add Louvain and Leiden community detection #9
Open
seilat wants to merge 4 commits into
Add LouvainClustering implementing ClusteringAlgorithm<V> in org.jgrapht.alg.clustering: alternating local-moving and aggregation phases that greedily maximise modularity, with weighted/self-loop handling and modularity conventions matching UndirectedModularityMeasurer (reused for getModularity()). Constructors mirror LabelPropagationClustering: (graph), (graph, rng), (graph, rng, tolerance).

- 15 JUnit5 tests: cliques->2, ring->3, complete->1 (Q=0), weighted, seed determinism, single/empty/isolated, self-loops, random partition-validity, getModularity vs measurer, beats all-singletons.
- JMH benchmark (perf/clustering) vs LabelPropagation and GreedyModularity on planted-partition graphs; export the new perf package(s) in the surefire argLine, matching the existing per-package pattern.

checkstyle clean (etc/jgrapht_checks.xml); javadoc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
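For readers unfamiliar with the local-moving phase named in this commit, here is a minimal, dependency-free sketch. It is not the PR's LouvainClustering code: the class name `LocalMovingSketch`, the dense weight-matrix input, and the absence of self-loops are all assumptions made for illustration. Each vertex is removed from its community and placed in the neighbouring community with the best relative gain `k_i,C - tot_C * k_i / (2m)`, repeating until no vertex moves.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the LouvainClustering code from this PR. */
final class LocalMovingSketch {

    /**
     * One local-moving phase on a symmetric weight matrix with a zero diagonal
     * (no self-loops, an assumption of this sketch). Returns a label per vertex.
     */
    static int[] localMove(double[][] w) {
        int n = w.length;
        double[] k = new double[n];          // weighted degree of each vertex
        double twoM = 0.0;                   // sum of all degrees = 2m
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                k[i] += w[i][j];
            }
            twoM += k[i];
        }
        int[] comm = new int[n];
        for (int i = 0; i < n; i++) {
            comm[i] = i;                     // start from all-singletons
        }
        if (!(twoM > 0)) {
            return comm;                     // no weight: nothing to optimise
        }
        double[] tot = k.clone();            // total degree per community
        boolean moved = true;
        while (moved) {
            moved = false;
            for (int i = 0; i < n; i++) {
                int old = comm[i];
                tot[old] -= k[i];            // take i out of its community
                double[] toComm = new double[n]; // weight from i to each community
                for (int j = 0; j < n; j++) {
                    if (j != i) {
                        toComm[comm[j]] += w[i][j];
                    }
                }
                int best = old;
                double bestGain = toComm[old] - tot[old] * k[i] / twoM;
                for (int c = 0; c < n; c++) {
                    if (toComm[c] > 0) {
                        double gain = toComm[c] - tot[c] * k[i] / twoM;
                        if (gain > bestGain + 1e-12) { // strict improvement only
                            best = c;
                            bestGain = gain;
                        }
                    }
                }
                tot[best] += k[i];           // put i into the chosen community
                comm[i] = best;
                moved |= best != old;
            }
        }
        return comm;
    }

    public static void main(String[] args) {
        // Two triangles {0,1,2} and {3,4,5} joined by the bridge 2-3.
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 5}, {4, 5}, {2, 3}};
        double[][] w = new double[6][6];
        for (int[] e : edges) {
            w[e[0]][e[1]] = 1.0;
            w[e[1]][e[0]] = 1.0;
        }
        System.out.println(Arrays.toString(localMove(w)));
    }
}
```

On the two-triangle graph in `main`, the sketch ends with two labels, one per triangle. A full Louvain run would then aggregate each community into a single vertex and repeat; that step is omitted here.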
…weights

Adversarial review follow-ups on the Louvain PR:

- Zero-weight edges (2m == 0) made getModularity() return NaN because the modularity guard tested edgeSet().isEmpty() rather than the actual total weight. Guard on totalWeight > 0 instead; return 0 otherwise.
- Reject negative edge weights with IllegalArgumentException at compute time (modularity is undefined for negative weights), documented on the class.
- Replace the circular getModularityMatchesMeasurer test (getModularity delegates to the measurer) with a pinned exact value Q = 11/26 for the two-K4-plus-bridge graph; add regressions for zero-weight (no NaN) and negative-weight (throws).

17/17 tests pass; checkstyle clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
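The two guards and the pinned value in this commit can be checked by hand with a small standalone sketch. It is an illustration under stated assumptions, not the PR's code: `ModularitySketch`, its edge-list signature, and the use of non-negative integer labels are hypothetical, and the self-loop convention (a loop adds twice its weight to the vertex degree) is a common choice that has not been verified here against UndirectedModularityMeasurer. For two K4s joined by one bridge, m = 13; each K4 has 6 internal edges and total degree 13, so Q = 2 * (6/13 - (13/26)^2) = 11/26.

```java
/** Illustrative sketch only; not the getModularity() code from this PR. */
final class ModularitySketch {

    /**
     * Modularity Q = sum over communities of inside_c / m - (tot_c / 2m)^2.
     * edges[i] = {u, v}; weights[i] is that edge's weight;
     * community[v] is a non-negative integer label (assumption of this sketch).
     */
    static double modularity(int[][] edges, double[] weights, int[] community) {
        int labels = 0;
        for (int c : community) {
            labels = Math.max(labels, c + 1);
        }
        double[] inside = new double[labels]; // weight of edges inside each community
        double[] tot = new double[labels];    // summed degree of each community
        double totalWeight = 0.0;             // m
        for (int i = 0; i < edges.length; i++) {
            double wt = weights[i];
            if (wt < 0) {
                // Modularity is undefined for negative weights, so reject them.
                throw new IllegalArgumentException("negative edge weight: " + wt);
            }
            int cu = community[edges[i][0]];
            int cv = community[edges[i][1]];
            tot[cu] += wt;
            tot[cv] += wt;                    // a self-loop adds 2 * wt to one community
            if (cu == cv) {
                inside[cu] += wt;
            }
            totalWeight += wt;
        }
        if (!(totalWeight > 0)) {
            return 0.0;                       // guard on total weight, not on edge count
        }
        double q = 0.0;
        for (int c = 0; c < labels; c++) {
            double share = tot[c] / (2.0 * totalWeight);
            q += inside[c] / totalWeight - share * share;
        }
        return q;
    }

    public static void main(String[] args) {
        // Two K4s on {0..3} and {4..7}, joined by the bridge 3-4.
        int[][] edges = {
            {0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 3},
            {4, 5}, {4, 6}, {4, 7}, {5, 6}, {5, 7}, {6, 7},
            {3, 4}
        };
        double[] ones = new double[edges.length];
        java.util.Arrays.fill(ones, 1.0);
        int[] community = {0, 0, 0, 0, 1, 1, 1, 1};
        System.out.println(modularity(edges, ones, community)); // about 0.4231 (= 11/26)
    }
}
```

Testing `totalWeight > 0` rather than whether the edge list is empty is what keeps the all-zero-weight case from dividing by zero and returning NaN.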
…unities)

Builds on Louvain (E2):

- E2.0: extract package-private CommunityAggregation (compact / numberOfCommunities / degree-conserving aggregate that compacts labels internally) and refactor LouvainClustering to use it. Behaviour-preserving: Louvain's 17 tests stay green. The internal compaction removes the implicit 'caller must pass compact labels' coupling flagged in review, so Leiden can aggregate its non-compact refined partitions safely.
- E2.1: LeidenClustering implementing ClusteringAlgorithm<V>. Multilevel local-moving + a refinement phase whose sub-communities grow only along edges (so each is connected), aggregation over the refined partition, and a final connected-components split that guarantees every reported community is connected -- the property Louvain lacks. Non-negative/zero-weight guards mirror Louvain.
- E2.2: 16 tests incl. the headline everyCommunityIsConnected oracle (ConnectivityInspector per induced subgraph, structured + 30 random graphs), clique/ring/complete/weighted parity, determinism, edge cases, and a modularity-vs-Louvain comparison (competitive within 0.02; Leiden trades a sliver of modularity for guaranteed connectivity).

33/33 clustering tests pass; checkstyle + javadoc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
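The compact / numberOfCommunities / aggregate trio described in E2.0 could look roughly like the sketch below. Only the three operation names come from the commit message; the class name `CommunityAggregationSketch`, the signatures, and the dense symmetric weight matrix are assumptions for illustration, and the real package-private class may differ. Because `aggregate` compacts internally, callers may pass arbitrary labels, which is the coupling the commit says it removed.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; signatures are hypothetical, not the PR's API. */
final class CommunityAggregationSketch {

    /** Relabels arbitrary labels to 0..k-1 in order of first appearance. */
    static int[] compact(int[] labels) {
        Map<Integer, Integer> ids = new HashMap<>();
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            Integer id = ids.get(labels[i]);
            if (id == null) {
                id = ids.size();
                ids.put(labels[i], id);
            }
            out[i] = id;
        }
        return out;
    }

    /** Number of distinct labels. */
    static int numberOfCommunities(int[] labels) {
        int max = -1;
        for (int c : compact(labels)) {
            max = Math.max(max, c);
        }
        return max + 1;
    }

    /**
     * Collapses each community into one super-vertex. For a symmetric input,
     * internal edges land on the diagonal counted from both endpoints, so
     * every row sum equals the summed degree of that community's members.
     */
    static double[][] aggregate(double[][] w, int[] labels) {
        int[] c = compact(labels);           // callers need not pre-compact
        int k = numberOfCommunities(c);
        double[][] agg = new double[k][k];
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < w.length; j++) {
                agg[c[i]][c[j]] += w[i][j];
            }
        }
        return agg;
    }

    public static void main(String[] args) {
        // Two triangles {0,1,2} and {3,4,5} joined by the bridge 2-3,
        // labelled with the non-compact labels 1 and 5.
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {3, 4}, {3, 5}, {4, 5}, {2, 3}};
        double[][] w = new double[6][6];
        for (int[] e : edges) {
            w[e[0]][e[1]] = 1.0;
            w[e[1]][e[0]] = 1.0;
        }
        double[][] agg = aggregate(w, new int[] {1, 1, 1, 5, 5, 5});
        System.out.println(java.util.Arrays.deepToString(agg));
    }
}
```

In the example each triangle has summed degree 2 + 2 + 3 = 7, and each row of the aggregated matrix sums to 6 + 1 = 7, so total weight 2m = 14 is preserved across the aggregation.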
Leiden vs Louvain vs LabelPropagation vs GreedyModularity on planted partitions. Leiden costs ~1.4-1.5x Louvain (refinement + connectivity-split passes) while staying 7-9x faster than GreedyModularity; scaling parallels Louvain. Cost (ms/op): 125-node Louvain 0.28 / Leiden 0.42; 500-node Louvain 3.05 / Leiden 4.17.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced May 30, 2026
Self-review PR (fork only, not upstream; holds for a jgrapht-dev signal per project cadence). Supersedes #7 and #8, combining both modularity-based community-detection algorithms into one reviewable unit.

What this adds to org.jgrapht.alg.clustering

- LouvainClustering: the Louvain method (Blondel et al. 2008), alternating local-moving + aggregation, greedily maximising modularity. Weighted/self-loop aware, non-negative weights, getModularity() consistent with UndirectedModularityMeasurer.
- LeidenClustering: the Leiden method (Traag-Waltman-van Eck 2019), which adds a refinement phase so every reported community is well-connected, the guarantee Louvain lacks.
- CommunityAggregation: package-private shared primitive (compact / numberOfCommunities / degree-conserving aggregate that compacts labels internally), used by both.

Both implement ClusteringAlgorithm<V> with constructors mirroring LabelPropagationClustering: (graph), (graph, rng), (graph, rng, tolerance).

Verification
33 tests (Louvain 17 + Leiden 16), all green. checkstyle 0 violations, javadoc clean.

Leiden's headline everyCommunityIsConnected oracle asserts connectivity via ConnectivityInspector on each induced subgraph (structured + 30 random graphs).

JMH 4-way cost bench (perf/clustering), ms/op on planted partitions: Leiden ≈ 1.4–1.5× Louvain (refinement + connectivity split), 7–9× faster than GreedyModularity.
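The connectivity oracle can be illustrated without JGraphT. The sketch below swaps ConnectivityInspector for a plain breadth-first search restricted to same-label neighbours, so it runs standalone; the class name `ConnectivityOracleSketch`, the adjacency-array input, and the integer labels are assumptions for illustration, not the PR's test code. The check fails as soon as a label shows up in a second connected component.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; the PR's test uses ConnectivityInspector instead. */
final class ConnectivityOracleSketch {

    /**
     * Returns true when every community induces a connected subgraph.
     * adj[v] lists the neighbours of v; comm[v] is the label of v.
     */
    static boolean everyCommunityIsConnected(int[][] adj, int[] comm) {
        int n = comm.length;
        boolean[] visited = new boolean[n];
        Set<Integer> seenLabels = new HashSet<>();
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int s = 0; s < n; s++) {
            if (visited[s]) {
                continue;
            }
            // An unvisited vertex whose label was already explored means that
            // community has at least two components.
            if (!seenLabels.add(comm[s])) {
                return false;
            }
            visited[s] = true;
            queue.add(s);
            while (!queue.isEmpty()) {
                int v = queue.poll();
                for (int u : adj[v]) {
                    // Walk only edges that stay inside the current community.
                    if (!visited[u] && comm[u] == comm[s]) {
                        visited[u] = true;
                        queue.add(u);
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] path = {{1}, {0, 2}, {1, 3}, {2}}; // path 0-1-2-3
        System.out.println(everyCommunityIsConnected(path, new int[] {0, 0, 1, 1}));
        System.out.println(everyCommunityIsConnected(path, new int[] {0, 1, 0, 1}));
    }
}
```

On the path 0-1-2-3, the split {0,1} / {2,3} passes, while {0,2} / {1,3} fails because vertices 0 and 2 are not adjacent.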
Honest scope notes

- resolution parameter deferred.
- pom.xml exports the new perf.clustering package(s) in the surefire argLine (documented per-package pattern).

Commits

- feat: Louvain + tests + bench
- fix: Louvain zero/negative-weight guards (review follow-up)
- feat: Leiden + CommunityAggregation refactor (Louvain behaviour-preserving) + connectivity oracle
- bench: Leiden added to the 4-way JMH bench

🤖 Generated with Claude Code