feat: Automated UFM Certificate Rotation Support in NCX Infra Controller (carbide-api) #830

@hasayesh

Is this a new feature, an enhancement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request?

High

Please provide a clear description of the problem this feature solves

The NCX Infra Controller site API (carbide-api) supports two authentication
methods for UFM communication: token-based and mTLS. mTLS is the preferred
method and is often required by site security policies in production
deployments. However, the UFM server certificates are issued by Vault PKI
(forgeca/roles/forge-cluster), and there is currently no mechanism for
automated certificate rotation on the UFM side.

Today, generating and delivering UFM certificates is an entirely manual process:

  1. Operator runs carbide-admin-cli credential generate-ufm-cert --fabric=default
  2. carbide-api calls Vault PKI and writes the cert files to its own pod's
    ephemeral filesystem (/var/run/secrets/)
  3. Operator must kubectl cp the files out, transfer them to the UFM host, and
    restart UFM

These files are lost on pod restart and are not accessible to external
systems, and there is no HTTP interface to retrieve them programmatically.

This means any external system that needs fresh certificates from the Forge CA
(such as UFM) has no way to obtain them.

Feature Description

As a fabric manager integration, carbide-api shall expose HTTP endpoints that
serve Vault PKI-issued certificates on demand, enabling external systems to
programmatically retrieve fresh certificates without manual intervention.

Specifically, add two GET endpoints to carbide-api:

Endpoint                                Returns
GET /api/v1/ufm/{fabric}/certs/ca       CA/intermediate certificate (PEM)
GET /api/v1/ufm/{fabric}/certs/server   Server certificate + private key (PEM)
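
A minimal sketch of how the two paths above could be matched, using only the Rust standard library. `CertKind` and `parse_ufm_cert_path` are hypothetical names for illustration and are not existing carbide-api code; a real implementation would more likely register these routes with whatever router carbide-api already uses.

```rust
/// Which certificate artifact a request asks for (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum CertKind {
    Ca,
    Server,
}

/// Parse a path of the form `/api/v1/ufm/{fabric}/certs/{ca|server}`.
/// Returns the fabric name and the requested artifact, or `None` when the
/// path does not match either proposed endpoint.
pub fn parse_ufm_cert_path(path: &str) -> Option<(&str, CertKind)> {
    let rest = path.strip_prefix("/api/v1/ufm/")?;
    let mut parts = rest.split('/');

    // The fabric segment must be present and non-empty.
    let fabric = parts.next().filter(|f| !f.is_empty())?;

    if parts.next()? != "certs" {
        return None;
    }

    let kind = match parts.next()? {
        "ca" => CertKind::Ca,
        "server" => CertKind::Server,
        _ => return None,
    };

    // Reject trailing segments such as `/certs/ca/extra`.
    if parts.next().is_some() {
        return None;
    }

    Some((fabric, kind))
}
```

For example, `parse_ufm_cert_path("/api/v1/ufm/default/certs/server")` yields the fabric `default` with `CertKind::Server`, while any path outside the two proposed endpoints yields `None`.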

These endpoints:

  • Call the existing CertificateProvider::get_certificate() (Vault PKI) —
    the same code path that write_ufm_certs() already uses.
  • Return PEM data directly in the HTTP response instead of writing to local disk.
  • Are authenticated via the existing mTLS layer, so no new authentication
    code is needed. The caller must present a valid client cert signed by the
    Forge CA.
  • Are mounted at the top-level router (outside /admin) so they bypass the web
    UI session auth, but are still protected by the TLS handshake.
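
The behavior in the list above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual carbide-api code: the `CertIssuer` trait and `IssuedCert` struct are hypothetical stand-ins for `CertificateProvider::get_certificate()` and its return type (whose real signatures are not shown in this issue), and `StaticIssuer` is a placeholder used only to exercise the functions without Vault.

```rust
/// Hypothetical shape of the issued material; the real type may differ.
pub struct IssuedCert {
    pub cert_pem: String,
    pub key_pem: String,
    pub ca_pem: String,
}

/// Hypothetical stand-in for the Vault-backed certificate provider.
pub trait CertIssuer {
    fn issue(&self, fabric: &str) -> Result<IssuedCert, String>;
}

/// Concatenate PEM blocks, ensuring each ends with exactly one newline.
fn join_pem(blocks: &[&str]) -> String {
    let mut out = String::new();
    for block in blocks {
        out.push_str(block.trim_end());
        out.push('\n');
    }
    out
}

/// Body for GET .../certs/server: server certificate followed by its key,
/// returned in the response instead of being written to local disk.
pub fn server_pem_body(issuer: &dyn CertIssuer, fabric: &str) -> Result<String, String> {
    let issued = issuer.issue(fabric)?;
    Ok(join_pem(&[issued.cert_pem.as_str(), issued.key_pem.as_str()]))
}

/// Body for GET .../certs/ca: CA/intermediate certificate only, no key.
pub fn ca_pem_body(issuer: &dyn CertIssuer, fabric: &str) -> Result<String, String> {
    let issued = issuer.issue(fabric)?;
    Ok(join_pem(&[issued.ca_pem.as_str()]))
}

/// Placeholder issuer returning fixed dummy PEM text (not real certificates).
pub struct StaticIssuer;

impl CertIssuer for StaticIssuer {
    fn issue(&self, fabric: &str) -> Result<IssuedCert, String> {
        if fabric.is_empty() {
            return Err("fabric name must not be empty".to_string());
        }
        Ok(IssuedCert {
            cert_pem: "-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----\n".to_string(),
            key_pem: "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n".to_string(),
            ca_pem: "-----BEGIN CERTIFICATE-----\n(placeholder CA)\n-----END CERTIFICATE-----\n".to_string(),
        })
    }
}
```

Keeping the body construction separate from the issuer keeps the HTTP layer thin: authentication stays in the existing mTLS handshake, and the handler only maps a fabric name to PEM text.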

Describe your ideal solution

Provide a supported interface that allows UFM (and similar systems) to programmatically obtain and periodically rotate their mTLS certificates from Vault PKI (Forge CA), eliminating the need for manual kubectl cp operations or host-level file transfer.

Describe any alternatives you have considered

  1. Increase Vault max_ttl to reduce rotation frequency.
    Issues: Does not eliminate manual rotation, just makes it less frequent. Does not scale across many sites.
  2. K8s CronJob to generate and push certs via SSH/SCP.
    Issues: Requires SSH access from K8s to the UFM host. Adds operational complexity and security surface.
  3. carbide-api pushes certs to UFM via REST API.
    Issues: Carbide doesn't know when UFM's cert is expiring. Requires storing UFM admin credentials. Overloads existing code path.

Additional context

This feature enables automated certificate rotation for UFM, making the
transition from token-based to mTLS authentication practical at scale.
Short-lived certificates (TTL 30 days) with automatic renewal are more secure than
long-lived ones, and this approach eliminates the manual rotation burden
that currently makes adoption impractical. In addition, the same cert-serving
pattern can be reused for other fabric manager integrations as they adopt
certificate-based authentication.
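
To illustrate the client side of automatic renewal, here is a small sketch of how a consumer such as UFM might decide when to fetch a fresh certificate. The 30-day TTL comes from this issue; the rule of renewing once one third or less of the lifetime remains is an assumption chosen for illustration, and `should_renew`, `days`, and `CERT_TTL` are hypothetical names.

```rust
use std::time::Duration;

/// Certificate lifetime mentioned in this issue (30 days).
pub const CERT_TTL: Duration = Duration::from_secs(30 * 24 * 60 * 60);

/// Convenience helper: a duration of `n` whole days.
pub fn days(n: u64) -> Duration {
    Duration::from_secs(n * 24 * 60 * 60)
}

/// Returns true when the certificate should be renewed.
///
/// Assumed policy (not specified by the issue): renew once the remaining
/// lifetime is one third of the TTL or less. An already expired certificate
/// has zero remaining lifetime and therefore always triggers renewal.
pub fn should_renew(ttl: Duration, age: Duration) -> bool {
    let remaining = ttl.saturating_sub(age);
    remaining <= ttl / 3
}
```

With the 30-day TTL, this policy starts renewing at day 20, which leaves ten days of retries before the old certificate expires.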

Code of Conduct

  • I agree to follow NCX Infra Controller's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Labels

feature: Feature (deprecated - use issue type, but it's needed for reporting now)

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
