Problem
The auto-generated acknowledgements page lists the same person multiple times when they commit under different name spellings. In the 8.1.0 release docs (PR #129), Stephen Waite appears three separate times:
steve waite (16 commits)
Stephen Waite (12 commits)
stephen waite (1 commit)
Separately, cdx@rolling.ventures shows up with a raw email address as the author name (this is Chris Dickman, who also appears under his real name).
Root cause
AcknowledgementsGenerator (tools/release-docs/src/AcknowledgementsGenerator.php) groups contributors with git shortlog -sn, which keys on the author name string. Any variation in spelling/capitalization splits one person into multiple rows.
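The split can be reproduced with a few lines of Python. This is only an illustration of keying on the exact name string, using the counts quoted above; it is not the generator's PHP code and does not call git.

```python
from collections import Counter

# One entry per commit, using the author names and counts quoted in this issue.
author_names = ["steve waite"] * 16 + ["Stephen Waite"] * 12 + ["stephen waite"]

# Keying on the exact name string leaves three separate rows for one person.
by_name = Counter(author_names)
for name, commits in by_name.most_common():
    print(f"{name} ({commits})")
```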
Proposed fix
Group commits by author email, which is a more stable per-person identity than the name string, then choose one display name per email.
- Source the data with git log --no-merges --format=%aE%x09%aN <from>..<to> instead of git shortlog -sn.
- Aggregate by lowercased email; sum commit counts across all name spellings tied to that email.
- Display name per email: the most-used spelling (the name on the most commits for that email); break ties by longest name, then alphabetically. For example, the three Stephen Waite spellings collapse to steve waite (29 commits), since steve waite has the most commits (16) of the three.
- Sort the list by commit count descending, then name ascending.
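The steps above can be sketched as follows. This is a minimal Python illustration of the proposed logic, not the PHP implementation: the function name group_by_email is hypothetical, the input is assumed to be the tab-separated email/name lines produced by the --format string above, and the sample addresses are placeholders rather than real contributor emails.

```python
from collections import Counter, defaultdict


def group_by_email(log_output: str) -> list:
    """Merge 'email<TAB>name' lines into one {name, commits} row per email."""
    spellings = defaultdict(Counter)  # lowercased email -> Counter of name spellings
    for line in log_output.splitlines():
        if not line.strip():
            continue
        email, _, name = line.partition("\t")
        spellings[email.strip().lower()][name.strip()] += 1

    rows = []
    for names in spellings.values():
        # Most-used spelling wins; ties go to the longest name, then to
        # the alphabetically first one (plain string comparison here).
        display = min(names, key=lambda n: (-names[n], -len(n), n))
        rows.append({"name": display, "commits": sum(names.values())})

    # Commit count descending, then name ascending.
    rows.sort(key=lambda r: (-r["commits"], r["name"]))
    return rows


# Hypothetical sample input; the email addresses are placeholders.
sample = "\n".join(
    ["dev@example.com\tsteve waite"] * 16
    + ["Dev@Example.com\tStephen Waite"] * 12
    + ["dev@example.com\tstephen waite"]
    + ["other@example.com\tAnother Contributor"] * 3
)
result = group_by_email(sample)
print(result)
```

With this sample, the three spellings merge into a single steve waite row with 29 commits, even though one spelling was recorded with a differently cased email. Note that plain string comparison orders uppercase before lowercase; whether the real generator should sort names case-insensitively is left open here.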
generate() / render() keep their current signatures and the {name, commits} shape, so the CLI and Taskfile need no changes. The existing unit tests and fixtures (tests/AcknowledgementsGeneratorTest.php, tests/fixtures/acknowledgements/) need updating to cover email grouping, including case-insensitive email matching and merging of multiple spellings.
Notes / out of scope
- Bot accounts (dependabot[bot], Copilot, openemr-release-bot[bot]) still appear after this change: email grouping keeps each as a single entry but does not remove them. Filtering bots is a separate concern; file separately if wanted.
- A .mailmap in openemr/openemr would also canonicalize identities and is complementary, but the generator should merge by email regardless.
Affects: generated output such as content/acknowledgements/*.md (force-updated by dispatch — fix the generator, not the output PR).