fix(server-groups): evict stale cached ids#1053

Open
Rico Lin (ricolin) wants to merge 2 commits into main from fix/server-group-cache-evict

@ricolin

Member

Summary

  • verify cached Nova server group IDs still exist before reusing them
  • evict server group cache entries after successful or already-missing deletes
  • add focused unit coverage for stale-cache refresh and delete eviction

Root Cause

Nodegroup recreation can happen quickly after a worker nodegroup delete. The previous server group ID may still be cached under the key (project_id, server_group_name) even after the Nova server group has been deleted. Reusing that stale ID renders an immutable CAPI OpenStackMachineTemplate that references a server group no longer present in Nova, so worker server creation fails.
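
For reviewers, the verify-before-reuse and evict-on-delete behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeNova`, `NotFound`, `ensure_server_group`, `delete_server_group`, and the in-memory `cache` dict are all hypothetical stand-ins for the real Nova client and disk-backed cache.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the 'not found' error a compute client raises."""


class FakeNova:
    """Hypothetical in-memory replacement for the Nova server group API."""

    def __init__(self):
        self.groups = {}
        self._next = 0

    def create(self, name):
        self._next += 1
        group_id = f"sg-{self._next}"
        self.groups[group_id] = name
        return group_id

    def get(self, group_id):
        if group_id not in self.groups:
            raise NotFound(group_id)
        return group_id

    def delete(self, group_id):
        if group_id not in self.groups:
            raise NotFound(group_id)
        del self.groups[group_id]


# Assumption: a plain dict keyed by (project_id, server_group_name).
cache = {}


def ensure_server_group(nova, project_id, name):
    """Return a server group ID, checking that a cached ID still exists."""
    key = (project_id, name)
    cached_id = cache.get(key)
    if cached_id is not None:
        try:
            nova.get(cached_id)
            return cached_id
        except NotFound:
            # Stale entry: the group was deleted after the ID was cached.
            cache.pop(key, None)
    group_id = nova.create(name)
    cache[key] = group_id
    return group_id


def delete_server_group(nova, project_id, name):
    """Delete the group and evict the cache entry, even if already missing."""
    key = (project_id, name)
    group_id = cache.get(key)
    if group_id is not None:
        try:
            nova.delete(group_id)
        except NotFound:
            pass  # Already gone in Nova; still evict below.
    cache.pop(key, None)
```

In this sketch the extra `get` call trades one API round trip per lookup for the guarantee that a deleted group's ID is not handed to a template that cannot be changed afterwards.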

Validation

  • uv run pytest magnum_cluster_api/tests/unit/test_utils.py -k 'server_group' -> 4 passed
  • git diff --check

Notes

A broader uv run pytest magnum_cluster_api/tests/unit/test_utils.py currently fails on an existing cloud-controller config expectation: the test expects lb-provider=amphora, while current code renders amphorav2. That failure is outside this server-group cache fix.

@ricolin Rico Lin (ricolin) marked this pull request as ready for review May 28, 2026 01:22
Rico Lin (ricolin) added a commit that referenced this pull request May 29, 2026
PR #1053 changes server-group cache behavior. Configure the shared OpenStack
client mock to return real string IDs for created server groups so the
disk-backed cache can serialize nodegroup test data instead of trying to pickle
a MagicMock id.

Signed-off-by: Rico Lin <rico@vexxhost.com>
Assisted-By: ChatGPT <noreply@openai.com>
Verify cached server group IDs against Nova before reusing them and
clear cache entries after successful or already-missing server group
deletes. This prevents nodegroup recreation from rendering an immutable
OpenStackMachineTemplate with a deleted server group ID.

Assisted-By: ChatGPT <noreply@openai.com>
Signed-off-by: Rico Lin <rlin@vexxhost.com>
PR #1053 changes server-group cache behavior. Configure the shared OpenStack
client mock to return real string IDs for created server groups so the
disk-backed cache can serialize nodegroup test data instead of trying to pickle
a MagicMock id.

Assisted-By: ChatGPT <noreply@openai.com>
Signed-off-by: Rico Lin <rlin@vexxhost.com>
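
The pickling problem this commit message describes can be reproduced in isolation. The sketch below is illustrative only: the `compute.create_server_group` attribute path, the `sg-0001` ID, and the `can_pickle` helper are assumptions, not the names used by the shared client mock in this repository.

```python
import pickle
from unittest import mock


def can_pickle(value):
    """Return True if pickle can serialize the value, False otherwise."""
    try:
        pickle.dumps(value)
        return True
    except Exception:
        return False


# Unconfigured mock: the created group's id is itself a MagicMock.
default_client = mock.MagicMock()
default_group = default_client.compute.create_server_group(name="ng-1")

# Configured mock: the created group carries a real string id.
configured_client = mock.MagicMock()
configured_client.compute.create_server_group.return_value = mock.Mock(
    id="sg-0001"  # hypothetical ID value
)
configured_group = configured_client.compute.create_server_group(name="ng-1")

# Data shaped like what a disk-backed cache might try to serialize.
default_entry = {"server_group_id": default_group.id}
configured_entry = {"server_group_id": configured_group.id}
```

Here `can_pickle(default_entry)` is false because the value is a MagicMock, while `can_pickle(configured_entry)` is true because the value is a plain string.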
Rico Lin (ricolin) added a commit that referenced this pull request May 29, 2026
PR #1053 changes server-group cache behavior. Configure the shared OpenStack
client mock to return real string IDs for created server groups so the
disk-backed cache can serialize nodegroup test data instead of trying to pickle
a MagicMock id.

Signed-off-by: Rico Lin <rico@vexxhost.com>
Assisted-By: ChatGPT <noreply@openai.com>
Signed-off-by: ricolin <rlin@vexxhost.com>
@ricolin Rico Lin (ricolin) force-pushed the fix/server-group-cache-evict branch from 52b44bc to 2678280 Compare May 29, 2026 01:03
Rico Lin (ricolin) added a commit that referenced this pull request Jun 16, 2026
PR #1053 changes server-group cache behavior. Configure the shared OpenStack
client mock to return real string IDs for created server groups so the
disk-backed cache can serialize nodegroup test data instead of trying to pickle
a MagicMock id.

Signed-off-by: Rico Lin <rlin@vexxhost.com>
Assisted-By: ChatGPT <noreply@openai.com>
Signed-off-by: ricolin <rlin@vexxhost.com>
@ricolin Rico Lin (ricolin) force-pushed the fix/server-group-cache-evict branch 2 times, most recently from fd14339 to 1b55a98 Compare June 16, 2026 00:40