[Bug]: CouchDB 3.5.1 upgrade leads to elevated memory and instability (OOM on single node + full cluster drop during rolling upgrade) #5879

@oisheeaa

Version

3.5.1

Describe the problem you're encountering

After upgrading from CouchDB 3.4.2 -> 3.5.1, we are seeing stability regressions in both:

  • Single node environments (OOM kill crash)
  • 3 node clustered environments (full cluster outage during rolling upgrade)

These environments were stable on 3.4.2 under the same workloads.

The main symptoms include:

  • Increased Erlang VM RSS memory usage (beam.smp)
  • Very large Erlang process counts post-upgrade
  • CPU spikes during compaction/rebalancing
  • Cluster becoming unavailable during node cycling
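
To compare memory and process counts between 3.4.2 and 3.5.1 on equal terms, one option is to snapshot each node's `/_node/_local/_system` response and reduce it to a few headline numbers. The sketch below is illustrative only: `summarize_system` is a hypothetical helper, and it assumes the payload carries a `memory` object (with `*_used` keys that are subsets of their parent keys) plus `process_count` and `process_limit`. The sample payload is made up for the example and is not data from the affected cluster.

```python
def summarize_system(stats):
    """Reduce a /_node/_local/_system-style payload to headline numbers.

    Assumption: keys ending in "_used" (e.g. "processes_used") are subsets
    of their parent keys, so they are skipped to avoid double counting.
    """
    memory = stats.get("memory", {})
    allocated = sum(v for k, v in memory.items() if not k.endswith("_used"))
    process_count = stats.get("process_count", 0)
    process_limit = stats.get("process_limit", 0)
    pct = round(100 * process_count / process_limit, 1) if process_limit else None
    return {
        "allocated_mib": round(allocated / 2**20, 1),
        "process_count": process_count,
        "process_limit_pct": pct,
    }


# Made-up sample payload, only to show the shape of the input.
sample = {
    "memory": {
        "processes": 3 * 2**20,
        "processes_used": 2 * 2**20,
        "binary": 1 * 2**20,
        "other": 1 * 2**20,
    },
    "process_count": 500,
    "process_limit": 1000,
}
summary = summarize_system(sample)
print(summary)
```

Taking one snapshot before the upgrade and one after, under the same workload, would show whether the growth is in process heaps, binaries, or elsewhere.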

Expected Behaviour

  • CouchDB 3.5.1 should not significantly increase baseline memory/process usage on the same workload
  • Single node instances should not be OOM killed under normal compaction load
  • Rolling node upgrades should not result in full cluster outage

Steps to Reproduce

We have observed the issues under the following conditions:

  • Upgrade CouchDB 3.4.2 → 3.5.1 (Erlang/OTP 26)
  • Run with medium memory nodes (~8GB RAM)
  • Large shard count / very high DB count
  • Compaction or shard movement occurring
  • Cycle one node during rolling upgrade (ASG refresh)

Result: memory/process growth and potential node loss -> cluster unavailability
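
As a possible guard for the node-cycling step, the refresh could be gated on the remaining nodes reporting a healthy, fully connected cluster before the next node is replaced. This is a minimal sketch, not how the ASG refresh in this report is configured: `safe_to_cycle_next` is a hypothetical helper, and it assumes `/_membership` returns `all_nodes` and `cluster_nodes` lists and `/_up` returns a `status` field. The node names are placeholders.

```python
def safe_to_cycle_next(membership, up_status):
    """Return True only if every configured node is currently connected
    and the node answering /_up reports status "ok".

    membership: parsed JSON from GET /_membership
    up_status:  parsed JSON from GET /_up
    """
    connected = set(membership.get("all_nodes", []))
    configured = set(membership.get("cluster_nodes", []))
    if not configured:
        return False  # no membership data: treat as unsafe
    return configured <= connected and up_status.get("status") == "ok"


# Placeholder node names, for illustration only.
healthy = {
    "all_nodes": ["couchdb@n1", "couchdb@n2", "couchdb@n3"],
    "cluster_nodes": ["couchdb@n1", "couchdb@n2", "couchdb@n3"],
}
degraded = {
    "all_nodes": ["couchdb@n1", "couchdb@n2"],
    "cluster_nodes": ["couchdb@n1", "couchdb@n2", "couchdb@n3"],
}
print(safe_to_cycle_next(healthy, {"status": "ok"}))
print(safe_to_cycle_next(degraded, {"status": "ok"}))
```

A check like this only confirms connectivity. It does not confirm that the replaced node has finished catching up, so it is a lower bound on safety rather than a full readiness test.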

Your Environment

Deployment

  • AWS EC2 instances with persistent couch_data
  • CouchDB nodes managed via ASG refresh (one node at a time)
  • HAProxy + Fauxton access through fronted endpoint

Cluster configuration:

  • 3-node cluster (n=3, q=1)
  • Placement: primary / secondary / tertiary
  • Database count: ~319k databases
  • Disk usage: ~640GB used of ~1TB per node
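
For a sense of scale, the figures above imply a large number of shard files per node. The arithmetic below is a rough sketch under stated assumptions: shard copies are spread evenly across nodes, 1 GB is taken as 1000 MB, and system databases, view indexes, and compaction overhead are ignored. `shard_footprint` is a hypothetical helper name.

```python
def shard_footprint(db_count, q, n, nodes, used_gb_per_node):
    """Estimate shard copies per node and average on-disk size per copy.

    Assumes shard copies are distributed evenly across nodes and that all
    disk usage is attributable to database shard files.
    """
    total_copies = db_count * q * n          # every database has q shards, n copies each
    copies_per_node = total_copies / nodes   # even spread across the cluster
    avg_mb_per_copy = used_gb_per_node * 1000 / copies_per_node
    return total_copies, copies_per_node, avg_mb_per_copy


total, per_node, avg_mb = shard_footprint(
    db_count=319_000, q=1, n=3, nodes=3, used_gb_per_node=640
)
print(total, per_node, round(avg_mb, 2))
```

With n=3 on three nodes, each node holds a copy of every shard, so roughly 319k shard files per node at about 2 MB each on average. Any per-shard process or memory overhead is therefore multiplied by a few hundred thousand.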

Instance resources:

  • ~7.6GB RAM

Additional Context

No response
