feat: support server_type=bm in cluster_api driver (auto-default boot_volume_size=0)#1014
feat: support server_type=bm in cluster_api driver (auto-default boot_volume_size=0)#1014Rico Lin (ricolin) wants to merge 6 commits into
Conversation
Magnum's driver dispatch matches a cluster_template against the (server_type, os, coe) tuples returned by each driver's provides() property. Until now the cluster_api drivers only advertised server_type=vm, so any cluster_template targeting baremetal nodes (e.g. Ironic via Nova) failed driver lookup with NoValidDriver even though Nova would happily route the boot via nova-compute-ironic when the flavor's resource_class matches a baremetal node. Add a parallel server_type=bm entry to all four shipped drivers (Ubuntu, UbuntuFocal, Flatcar, RockyLinux) so the same cluster_api driver handles both VM and BM clusters. This unblocks Magnum/CAPI on top of the Ironic emulator AIO scenario described in vexxhost/atmosphere#bm-mcapi-fixes (gap #2 in that branch's README). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Rico Lin <rlin@vexxhost.com>
Baremetal nodes provisioned through Ironic boot from the local disk via iPXE and cannot use a Cinder root volume. When a cluster_template has server_type=bm but the operator did not explicitly set the boot_volume_size label, the previous code defaulted to CONF.cinder.default_boot_volume_size, which caused OpenStackMachineTemplate to request a Cinder root volume that Ironic refuses to attach during enroll, breaking cluster creation with an opaque flavor/scheduling error. Add utils.get_default_boot_volume_size(cluster, default) that returns 0 when cluster.cluster_template.server_type == 'bm' and the supplied default otherwise, and use it as the default at the three call sites in resources.py (autoscaling annotation, machine_deployment override, cluster-level override). Operators can still override explicitly by setting boot_volume_size on the cluster_template/cluster/node_group; the auto-default only kicks in when the label is unset. This closes the BM cluster_template foot-gun documented as gap #17 in vexxhost/atmosphere bm-mcapi-fixes branch — operators no longer need to remember to pass --labels boot_volume_size=0 alongside server_type=bm. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Rico Lin <rlin@vexxhost.com>
Address review feedback on PR #1014. For server_type=bm, the boot volume default returns 0 (Ironic boots from local disk, not Cinder), and the fallback flavor.disk is typically also 0 because root_gb lives on the Ironic node rather than the Nova flavor. Previously the autoscaler annotation capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk was set to "0", which causes cluster-autoscaler to reject any pod that requests ephemeral-storage even if the Ironic node has plenty of local disk. Skip the annotation entirely in that case so the autoscaler can fall back to cpu/memory-only scheduling decisions, which is correct for BM nodes whose disk capacity we cannot reliably determine here. Also document the auto-default in docs/user/labels.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review feedback on PR #1014. For server_type=bm, the boot volume default returns 0 (Ironic boots from local disk, not Cinder), and the fallback flavor.disk is typically also 0 because root_gb lives on the Ironic node rather than the Nova flavor. Previously the autoscaler annotation capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk was set to "0", which causes cluster-autoscaler to reject any pod that requests ephemeral-storage even if the Ironic node has plenty of local disk. Skip the annotation entirely in that case so the autoscaler can fall back to cpu/memory-only scheduling decisions, which is correct for BM nodes whose disk capacity we cannot reliably determine here. Also document the auto-default in docs/user/labels.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Rico Lin <rlin@vexxhost.com>
Address review feedback on PR #1014. For server_type=bm, the boot volume default returns 0 (Ironic boots from local disk, not Cinder), and the fallback flavor.disk is typically also 0 because root_gb lives on the Ironic node rather than the Nova flavor. Previously the autoscaler annotation capacity.cluster-autoscaler.kubernetes.io/ephemeral-disk was set to "0", which causes cluster-autoscaler to reject any pod that requests ephemeral-storage even if the Ironic node has plenty of local disk. Skip the annotation entirely in that case so the autoscaler can fall back to cpu/memory-only scheduling decisions, which is correct for BM nodes whose disk capacity we cannot reliably determine here. Also document the auto-default in docs/user/labels.md. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: ricolin <rlin@vexxhost.com>
e0ec42b to
6301116
Compare
When patching mcapi via configMap subPath overlay, only the Python files are replaced -- shipped data files (manifests/calico/v*.yaml, etc.) still come from the installed image. If the feature branch has bumped CALICO_TAG past what the image ships, cluster create fails instantly with No such file or directory on the calico manifest. Documents the workaround (sed CALICO_TAG down to a shipped version before applying the configMap) and the proper fix (image rebuild or add the missing manifest to the configMap). Hit while validating PR vexxhost/magnum-cluster-api#1014 (head 6301116) on add-server-type-bm: resources.py references v3.31.5 but the deployed image only ships up to v3.31.3. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Change-Id: I3da483cd29c53d04dec58e41a002b4cc3627a877 Signed-off-by: Rico Lin <rico@vexxhost.com>
|
Result: ✅ End-to-end pass with PR #1014
|
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Rico Lin <rlin@vexxhost.com>
…fallback Three independent fixes that surfaced while running the G2 bare-metal smoke test against PR #1015 + PR #1014. See #1015 for the full discussion; the README at notes/logs/ironic/mcapi-bm/README.md captures the end-to-end repro. F1 — `_decode_extra_files_label` accepts plain YAML/JSON in addition to base64 Operators using YAML manifests (Heat, Terraform, kubectl) typically have the platform handle base64 wrapping, so being forced to manually wrap the list a second time was a frequent foot-gun. The decoder now tries base64 first (the documented/recommended path) and falls back to plain YAML/JSON, with a single clear error message when neither parses. F2 — `_normalize_extra_file` rejects pre-encoded content that omits `encoding: base64` Previously, an entry with `content: <base64>` and no `encoding` field was silently re-encoded, landing on disk as a base64 string and breaking any later `bash <script>` step. A narrow heuristic now detects this case (canonical base64 charset, length ≥ 16, multiple of 4, decodes to valid UTF-8) and raises `InvalidParameterValue` with a hint pointing operators at the `encoding` field. Tests cover plain scripts, short strings, and multi-line content to confirm no false positives. F3 — `fill_missing_labels_from_template` merges template labels into sparse cluster.labels Magnum's `cluster create --labels` REPLACES rather than merges the cluster_template labels. When a freshly-created cluster row was missing keys the Rust `ClusterLabels` deserialiser requires (most notably `kube_tag`), both create and delete crashed with `KeyError: 'kube_tag'` and the cluster ended up stuck in DELETE_FAILED. The new helper fills any template-only keys back in without overriding values the operator explicitly set, and is invoked before `rust_driver.create_cluster` and `rust_driver.delete_cluster`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mohammed Naser (mnaser)
left a comment
There was a problem hiding this comment.
Out of curiosity, why do we need boot_volume_size=0 -- what happens if it's not set? It seems/feels like it should just work out of the box, no?
…content guard Two independent fixes that surfaced while running the G2 bare-metal smoke test against PR #1015 + PR #1014. See #1015 for the full discussion; the README at notes/logs/ironic/mcapi-bm/README.md captures the end-to-end repro. F1 — `_decode_extra_files_label` accepts plain YAML/JSON in addition to base64 Operators using YAML manifests (Heat, Terraform, kubectl) typically have the platform handle base64 wrapping, so being forced to manually wrap the list a second time was a frequent foot-gun. The decoder now tries base64 first (the documented/recommended path) and falls back to plain YAML/JSON, with a single clear error message when neither parses. F2 — `_normalize_extra_file` rejects pre-encoded content that omits `encoding: base64` Previously, an entry with `content: <base64>` and no `encoding` field was silently re-encoded, landing on disk as a base64 string and breaking any later `bash <script>` step. A narrow heuristic now detects this case (canonical base64 charset, length >= 16, multiple of 4, decodes to valid UTF-8) and raises `InvalidParameterValue` with a hint pointing operators at the `encoding` field. Tests cover plain scripts, short strings, and multi-line content to confirm no false positives. (F3 — `fill_missing_labels_from_template` — was originally bundled here but has been split out into its own PR (#1016) against `main` so it can land independently of this PR's larger scope.) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mohammed Naser (@mnaser) |
…content guard Two independent fixes that surfaced while running the G2 bare-metal smoke test against PR #1015 + PR #1014. See #1015 for the full discussion; the README at notes/logs/ironic/mcapi-bm/README.md captures the end-to-end repro. F1 — `_decode_extra_files_label` accepts plain YAML/JSON in addition to base64 Operators using YAML manifests (Heat, Terraform, kubectl) typically have the platform handle base64 wrapping, so being forced to manually wrap the list a second time was a frequent foot-gun. The decoder now tries base64 first (the documented/recommended path) and falls back to plain YAML/JSON, with a single clear error message when neither parses. F2 — `_normalize_extra_file` rejects pre-encoded content that omits `encoding: base64` Previously, an entry with `content: <base64>` and no `encoding` field was silently re-encoded, landing on disk as a base64 string and breaking any later `bash <script>` step. A narrow heuristic now detects this case (canonical base64 charset, length >= 16, multiple of 4, decodes to valid UTF-8) and raises `InvalidParameterValue` with a hint pointing operators at the `encoding` field. Tests cover plain scripts, short strings, and multi-line content to confirm no false positives. (F3 — `fill_missing_labels_from_template` — was originally bundled here but has been split out into its own PR (#1016) against `main` so it can land independently of this PR's larger scope.) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: ricolin <rlin@vexxhost.com>
|
recheck |
Test report (test env redacted)Scope: Env: Atmosphere AIO with 3-node BM emulator (libvirt + OVS PXE bridge), kube image
Overall: 5/5 executed phases PASS. Round-1 report archived locally with the full reproducer commands, libvirt domain XML, and ovs-vsctl wiring. Caveat: |
|
recheck |
When a cluster_template advertises `server_type=bm` (this PR), the
operator's intent is to provision both the master and worker nodes on
Ironic via Nova's ironic compute driver. However, Magnum does not
enforce that `master_flavor_id` / `flavor_id` actually resolve to an
Ironic-backed Nova flavor: a flavor with empty `extra_specs` (e.g.
the stock `m1.small`) is silently accepted and Nova schedules the
server on a libvirt/KVM hypervisor instead of an Ironic node. The
cluster reaches CREATE_COMPLETE and looks healthy, but every
server_type=bm code path added by this PR (boot_volume_size auto-zero,
network-interface=flat downstream, etc.) is bypassed. Operators
believe they are testing the BM control plane when they are not.
Reproduced on a baremetal AIO with this branch deployed:
$ openstack coe cluster create --cluster-template k8s-bm-gpu \\
--master-flavor m1.small --master-count 1 --node-count 0 \\
gap-master-vm
Request to create cluster ... accepted
$ openstack server show kube-...-master -c flavor -c hypervisor_hostname
flavor {'name': 'm1.small', ..., 'extra_specs': {}}
OS-EXT-SRV-ATTR:hypervisor_hostname instance # ← AIO host KVM, not Ironic
This commit adds `validate_baremetal_flavors()` and a small
`_is_baremetal_flavor()` helper. A flavor is considered Ironic-backed
when its extra-specs contain a `resources:CUSTOM_<NAME>=1` entry,
which is the convention Nova's Ironic driver requires for the
scheduler to route the request to the Ironic compute service.
`validate_cluster()` now calls the new helper, raising
`InvalidParameterValue` with an actionable message when either
flavor is missing the required extra-spec. The validator is a no-op
for `server_type=vm` templates, so the existing virtual-cluster path
is unaffected.
Unit tests cover:
* detection of the baremetal extra-spec (positive, negative, and
fallback to the cached `extra_specs` attribute when
`get_keys()` is not eagerly resolvable);
* no-op for vm templates;
* pass when both flavors are baremetal-backed;
* reject for a virtual master_flavor_id;
* reject for a virtual flavor_id.
$ pytest magnum_cluster_api/tests/unit/test_utils.py::TestValidateBaremetalFlavors -q
8 passed, 8 warnings in 0.03s
Signed-off-by: Rico Lin <rlin@vexxhost.com>
Follow-up: validate
|
Candidate upstream home: #1014. PR #1014 owns the server_type=bm path. Bare-metal tenant networks can have port security disabled so Ironic can use pre-created ports, but CAPO managed security groups cause generated OpenStackServer ports to include a security group. Neutron rejects that combination before Nova/Ironic can create the server. Add a ClusterClass variable that removes managedSecurityGroups only for server_type=bm clusters, keep VM clusters on the existing managed security group behavior, and cover the remove-patch path in the Rust test helper. Validation: - cargo test managed_security_groups - cargo test test_convert_values_to_cluster_topology_variables - uv run pytest magnum_cluster_api/tests/unit/test_sync.py magnum_cluster_api/tests/unit/test_utils.py magnum_cluster_api/tests/unit/test_resources.py magnum_cluster_api/tests/unit/test_driver.py -q Assisted-By: ChatGPT <noreply@openai.com> (cherry picked from commit 93e3406ff794f6e1bdd5cbee42f0ec61112691d5) (cherry picked from commit 86b807549c8a845c5e037e89ada46e1bd99e325c) Signed-off-by: Rico Lin <rlin@vexxhost.com>
Candidate upstream home: #1014. PR #1014 owns the server_type=bm path. Bare-metal tenant networks can have port security disabled so Ironic can use pre-created ports, but CAPO managed security groups cause generated OpenStackServer ports to include a security group. Neutron rejects that combination before Nova/Ironic can create the server. Add a ClusterClass variable that removes managedSecurityGroups only for server_type=bm clusters, keep VM clusters on the existing managed security group behavior, and cover the remove-patch path in the Rust test helper. Validation: - cargo test managed_security_groups - cargo test test_convert_values_to_cluster_topology_variables - uv run pytest magnum_cluster_api/tests/unit/test_sync.py magnum_cluster_api/tests/unit/test_utils.py magnum_cluster_api/tests/unit/test_resources.py magnum_cluster_api/tests/unit/test_driver.py -q Assisted-By: ChatGPT <noreply@openai.com> (cherry picked from commit 93e3406ff794f6e1bdd5cbee42f0ec61112691d5) Signed-off-by: ricolin <rlin@vexxhost.com> (cherry picked from commit 86b807549c8a845c5e037e89ada46e1bd99e325c)
2b44663 to
91abc99
Compare
Problem
Magnum's cluster_api drivers only advertised
server_type=vm. Baremetal templates using Ironic through Nova failed driver lookup, and baremetal boot also needed safer defaults around root volumes and networking.Solution
This PR now carries the
server_type=bmbehavior as three focused pieces:feat(driver): advertise server_type=bm alongside vmserver_type=bmentries for the shipped cluster_api drivers.feat(resources): default boot_volume_size=0 for server_type=bmfix(pr1014): disable CAPO managed security groups for bmdisableManagedSecurityGroupsand removes CAPOmanagedSecurityGroupsonly forserver_type=bmClusterClasses.PortSecurityAndIPRequiredForSecurityGroupson the disabled-port-security baremetal VLAN network, before Nova/Ironic server create.VM clusters keep the existing CAPO managed-security-group behavior.
Validation
Latest scoped validation after moving Issue 41 into this PR:
cargo test managed_security_groups-> 2 passedcargo test test_convert_values_to_cluster_topology_variables-> 1 passeduv run pytest magnum_cluster_api/tests/unit/test_resources.py::test_cluster_variables_disable_managed_security_groups_for_bm -q-> 1 passed, 8 warningsuv run ruff check magnum_cluster_api/resources.py magnum_cluster_api/tests/unit/test_resources.py-> passedrustfmt --check src/features/managed_security_groups.rs-> passedReview Notes
The managed-security-groups change is deliberately scoped to BM. The live failure came from a BM VLAN network with
port_security_enabled=False; VM tenant-network behavior should remain unchanged.