fix: reset follower with stale state before add-node to prevent "Node is not empty" loop #1734

Open
svils wants to merge 1 commit into OT-CONTAINER-KIT:main from svils:fix/replication-addnode-not-empty

svils commented Apr 7, 2026

Summary

  • Before redis-cli --cluster add-node, check if the follower has stale cluster state (knows other nodes) or data in db0
  • If so, issue CLUSTER RESET HARD to clear both, allowing add-node to succeed
  • The follower re-syncs from the master via full sync after joining
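The emptiness check in the first bullet could be sketched as follows. This is a minimal illustration rather than the code in this PR: `countOtherNodes` and `needsReset` are hypothetical helper names, and the sketch assumes the caller has already fetched the raw `CLUSTER NODES` output and the db0 key count (for example via `DBSIZE`) from the follower.

```go
package main

import "strings"

// countOtherNodes returns how many entries in raw CLUSTER NODES output
// describe a node other than the one that answered the command.
// Hypothetical helper: the third field of each line is the comma-separated
// flag list, and the answering node carries the "myself" flag.
func countOtherNodes(clusterNodes string) int {
	others := 0
	for _, line := range strings.Split(clusterNodes, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 {
			continue // blank or malformed line
		}
		isSelf := false
		for _, flag := range strings.Split(fields[2], ",") {
			if flag == "myself" {
				isSelf = true
				break
			}
		}
		if !isSelf {
			others++
		}
	}
	return others
}

// needsReset reports whether the follower has stale cluster state
// (it knows other nodes) or holds keys in db0. Either condition is
// what add-node reports as "Node is not empty".
func needsReset(clusterNodes string, db0Keys int64) bool {
	return countOtherNodes(clusterNodes) > 0 || db0Keys > 0
}
```

A caller would issue `CLUSTER RESET HARD` against the follower only when `needsReset` returns true, so that an already-empty node is left untouched.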

After a leader-only CLUSTER RESET during single-node bootstrap, the follower retains its previous cluster state and replicated data. add-node rejects it with "Node is not empty", causing an infinite error loop on every reconcile cycle.

Fixes #1733
Related: #1407

Test plan

  • Unit tests for resetFollowerIfNotEmpty covering: empty node (no-op), stale cluster state, keys in db0, error paths
  • Full k8sutils test suite passes
  • Manual verification on a single-leader RedisCluster (clusterSize: 1 with 1 follower) — delete the leader PVC to trigger bootstrap, confirm follower is reset and re-added automatically
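The unit-test cases listed above could be exercised against a fake client along these lines. This is a hedged sketch, not the tests shipped in this PR: the `followerClient` interface, the `fakeFollower` type, and the `(bool, error)` signature given to `resetFollowerIfNotEmpty` here are assumptions made for illustration, and the real function in `k8sutils` may take different arguments.

```go
package main

import (
	"errors"
	"fmt"
)

// followerClient is a hypothetical abstraction over the three Redis
// operations the check needs; it is not an interface from the operator.
type followerClient interface {
	KnownNodes() (int, error) // number of nodes the follower knows, itself included
	DB0Keys() (int64, error)  // number of keys in database 0
	ResetHard() error         // issues CLUSTER RESET HARD
}

// resetFollowerIfNotEmpty resets the follower only when it has stale
// cluster state or keys in db0. It reports whether a reset was issued.
// Assumed signature, for illustration only.
func resetFollowerIfNotEmpty(c followerClient) (bool, error) {
	known, err := c.KnownNodes()
	if err != nil {
		return false, fmt.Errorf("read cluster state: %w", err)
	}
	keys, err := c.DB0Keys()
	if err != nil {
		return false, fmt.Errorf("read db0 size: %w", err)
	}
	if known <= 1 && keys == 0 {
		return false, nil // already empty: no-op
	}
	if err := c.ResetHard(); err != nil {
		return false, fmt.Errorf("cluster reset hard: %w", err)
	}
	return true, nil
}

var errInjected = errors.New("injected failure")

// fakeFollower is an in-memory stand-in used to drive each test case.
type fakeFollower struct {
	known  int
	keys   int64
	failOn string // "nodes", "keys", "reset", or "" for no failure
	resets int    // how many times ResetHard succeeded
}

func (f *fakeFollower) KnownNodes() (int, error) {
	if f.failOn == "nodes" {
		return 0, errInjected
	}
	return f.known, nil
}

func (f *fakeFollower) DB0Keys() (int64, error) {
	if f.failOn == "keys" {
		return 0, errInjected
	}
	return f.keys, nil
}

func (f *fakeFollower) ResetHard() error {
	if f.failOn == "reset" {
		return errInjected
	}
	f.resets++
	f.known, f.keys = 1, 0 // the fake forgets its peers and drops its keys
	return nil
}
```

Each case from the test plan maps to one `fakeFollower` value: `{known: 1}` for the empty no-op, `{known: 3}` for stale cluster state, `{known: 1, keys: 5}` for keys in db0, and a non-empty `failOn` for the error paths.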

… is not empty" loop

ExecuteRedisReplicationCommand uses `redis-cli --cluster add-node` to
join followers to the cluster. This command requires the target node to
be completely empty — no cluster state and no keys in database 0.

After a leader-only CLUSTER RESET during single-node bootstrap (via
executeFailoverCommand), the follower retains its previous cluster
state and replicated data. The operator then enters an infinite error
loop:

  [ERR] Node <cluster>-follower-0...is not empty. Either the node
  already knows other nodes (check with CLUSTER NODES) or contains
  some key in database 0.

Add resetFollowerIfNotEmpty: before running add-node, check whether the
follower knows other nodes or has keys in db0. If so, issue CLUSTER
RESET HARD to clear both cluster state and data so add-node can succeed.
The follower will re-sync from the master via full sync after joining.

Related: OT-CONTAINER-KIT#1407

Signed-off-by: svils <63684363+svils@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

ExecuteRedisReplicationCommand fails with "Node is not empty" after single-leader CLUSTER RESET
