[Bug]: couch_peruser leaks mem3_cluster and init_changes_handler processes #5871

@NiloCK


Version

CouchDB version: 3.4.3 (dpkg on Ubuntu, single-node)

Describe the problem you're encountering

[Image: memory usage graph]

The memory usage shown above is from an idle server; the drop-off is a restart of CouchDB via systemctl restart couchdb.

Root cause:

init_state/0 in couch_peruser.erl starts a new mem3_cluster process (via mem3_cluster:start_link/4) on every cluster_unstable cast and every update_config cast. exit_changes/1 cleans up the changes feed handlers but never terminates the previous mem3_cluster process. The old mem3_cluster processes stay alive and keep firing cluster_unstable events, each of which triggers another init_state/0 call, so processes accumulate in a feedback loop.

Evidence:

Erlang remote shell inspection on the running node:

  • process_count climbed from 4,097 to 4,176 in ~2 minutes on an idle system (70 HTTP requests total since last restart)
  • erlang:process_info showed 16,426 couch_peruser:init_changes_handler processes accumulating
  • couch_event_server (the event dispatcher) was the top memory consumer at 8.3 MB, growing as it tracked all registered handlers

Memory timeline (from system monitoring):

  • Post-restart: 3.2% RAM → 3.4% in 2 minutes
  • 6:14 PM: 39.7% → 2:24 AM: 87%+ (about 8 hours, +5.77 percentage points/hour)
  • Consistent across multiple restarts

_system snapshots (3.6 minutes apart):

  ┌───────────┬─────────────────────────────────┬─────────────────────────────────┐
  │ Field     │ Reading 1 (uptime 1593s), bytes │ Reading 2 (uptime 1811s), bytes │
  ├───────────┼─────────────────────────────────┼─────────────────────────────────┤
  │ processes │ 63,622,424                      │ 70,445,232 (+6.8 MB)            │
  ├───────────┼─────────────────────────────────┼─────────────────────────────────┤
  │ binary    │ 787,440                         │ 903,448                         │
  ├───────────┼─────────────────────────────────┼─────────────────────────────────┤
  │ ets       │ 1,302,568                       │ 1,308,112                       │
  ├───────────┼─────────────────────────────────┼─────────────────────────────────┤
  │ code      │ 13,955,494                      │ 13,955,494                      │
  └───────────┴─────────────────────────────────┴─────────────────────────────────┘
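
As a rough cross-check of the two snapshots, the Python sketch below computes how fast the processes figure grew between the readings. The byte counts and uptimes are copied from the table; the function name is illustrative, and extrapolating to an hourly rate assumes growth stays linear, which the data here does not prove.

```python
def growth_rate(bytes_start, bytes_end, uptime_start_s, uptime_end_s):
    """Return (delta_bytes, bytes_per_second) between two snapshots."""
    delta = bytes_end - bytes_start
    elapsed = uptime_end_s - uptime_start_s
    if elapsed <= 0:
        raise ValueError("second snapshot must be later than the first")
    return delta, delta / elapsed


# Values for the "processes" row of the _system snapshots.
delta, per_second = growth_rate(63_622_424, 70_445_232, 1593, 1811)

print(f"processes grew by {delta:,} bytes in {1811 - 1593} s")
# Hourly figure is an extrapolation that assumes linear growth.
print(f"~{per_second / 1000:.1f} kB/s, ~{per_second * 3600 / 1e6:.0f} MB/hour")
```

This works out to roughly 31 kB/s, or on the order of 113 MB per hour if the rate holds, which is in the same range as the hourly RAM growth in the memory timeline for a 2 GB machine.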

Workaround:

Disable couch_peruser by setting its enable option to false (PUT /_node/_local/_config/couch_peruser/enable with body "false").
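
For anyone who wants to script the workaround, here is a minimal Python sketch that builds the config request using only the standard library. The base URL, user name, and password are placeholders, not values from this server, and the helper name is made up for illustration. The request is built but not sent, so the final call is left commented out.

```python
import base64
import json
import urllib.request


def build_disable_peruser_request(base_url, user, password):
    """Build (but do not send) the PUT that sets couch_peruser/enable to "false"."""
    url = f"{base_url.rstrip('/')}/_node/_local/_config/couch_peruser/enable"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps("false").encode(),  # body is the JSON string "false"
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder address and credentials: replace with your own.
req = build_disable_peruser_request("http://127.0.0.1:5984", "admin", "password")

# To apply it against a live node, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

Config values are sent as JSON strings, hence "false" in quotes rather than a bare boolean.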

Expected Behaviour

Steady state memory consumption for an idle server.

Steps to Reproduce

I haven't manually reproduced this, but roughly:

  • set up a fresh CouchDB 3.4.3 instance
  • enable couch_peruser
  • create some databases and users
  • leave the server idle and observe process count and memory over time

Your Environment

  • Single-node CouchDB 3.4.3 (dpkg, Ubuntu 20.04)
  • 2 GB RAM
  • 97 databases (including ~80 userdb-* via couch_peruser)
  • Near-zero traffic
  • No active tasks, smoosh idle, no replication jobs

Additional Context

Related prior work:

PR #3851 attempted a fix by unlinking the old mem3_cluster process, but reviewer @nickva noted that this was insufficient: unlinking without killing still leaves the old processes alive and sending events. nickva recommended keeping a single mem3_cluster process for the gen_server's lifetime and restarting only the changes feed handlers on state resets.
