
fix(mlbs): migrate AffinityTags to use pod labels when inbound tags are disabled #16030

Open
mail2sudheerobbu-oss wants to merge 32 commits into kumahq:master from mail2sudheerobbu-oss:fix/15995-affinity-tags-use-labels

Conversation

@mail2sudheerobbu-oss

Motivation

When KUMA_EXPERIMENTAL_INBOUND_TAGS_DISABLED=true is set, Kuma strips inbound tags from Dataplane resources to reduce memory overhead. However, MeshLoadBalancingStrategy.LocalityAwareness.LocalZone.AffinityTags relies on inbound tags to group and route traffic. This means that when inbound tags are disabled:

  1. The local proxy cannot determine its own affinity group (tags are absent from its own Dataplane resource).
  2. Remote endpoints cannot be matched for affinity routing because their filter metadata is empty (no tags were propagated).

This results in locality-aware load balancing via AffinityTags silently failing when KUMA_EXPERIMENTAL_INBOUND_TAGS_DISABLED is enabled.

Pod labels, however, are always available from Kubernetes pod metadata regardless of this flag. This PR migrates AffinityTags handling to fall back to pod labels when inbound tags are absent.

Closes #15995

Implementation information

The fix spans the full data pipeline from endpoint construction through to locality group resolution:

  1. pkg/core/xds/types.go — Added a Labels map[string]string field to the Endpoint struct to carry pod/workload labels alongside inbound tags throughout the routing pipeline.

  2. pkg/xds/envoy/metadata/v3/metadata.go — Added a new LbLabelsKey constant ("io.kuma.labels"), an EndpointMetadataWithLabels function that stores both inbound tags (under envoy.lb) and pod labels (under io.kuma.labels) in Envoy filter metadata, and an ExtractLbLabels function to retrieve pod labels from metadata.

  3. pkg/xds/envoy/endpoints/v3/endpoints.go — Updated ToLocalityLbEndpoints to call EndpointMetadataWithLabels instead of EndpointMetadata, so pod labels are embedded in each endpoint's Envoy filter metadata.

  4. pkg/xds/topology/outbound.go — Populated the new Labels field in fillDataplaneOutbounds and fillLocalMeshServices from GetMeta().GetLabels() on the respective Dataplane/DPP resource.

  5. pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/locality_aware.go — Extracted pod labels from endpoint filter metadata in createEndpoint, and added a label-based fallback in configureLocalZoneEndpointLocality when the tag key is not found in endpoint.Tags.

  6. pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/priority.go — Added a pod labels parameter to GetLocalityGroups and getLocalLbGroups, with a fallback to podLabels[tag.Key] when inboundTags.Values(tag.Key) is empty.

  7. pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/plugin.go — Threaded podLabels (from proxy.Dataplane.GetMeta().GetLabels()) through all call sites of claConfigurer and staticCLAConfigurer for DPP and gateway resources. Egress passes nil since no local pod labels are available in that context.

The approach is backward-compatible: when inbound tags are present, they continue to be used as before. The label fallback only activates when the tag key is absent.
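
The "prefer tag, fall back to label" rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `resolveAffinityValue` here is a hypothetical helper, and both inbound tags and pod labels are modelled as plain single-valued `map[string]string` for brevity, whereas Kuma's inbound tags can hold several values per key.

```go
package main

import "fmt"

// resolveAffinityValue is a hypothetical helper for illustration only.
// It returns the inbound tag value when the key is present and non-empty,
// and otherwise falls back to the pod label with the same key.
func resolveAffinityValue(inboundTags, podLabels map[string]string, key string) (string, bool) {
	if v, ok := inboundTags[key]; ok && v != "" {
		return v, true
	}
	if v, ok := podLabels[key]; ok && v != "" {
		return v, true
	}
	return "", false
}

func main() {
	labels := map[string]string{"k8s.io/node": "node-a"}

	// Inbound tags present: the tag wins, matching the pre-change behaviour.
	tags := map[string]string{"k8s.io/node": "node-b"}
	v, _ := resolveAffinityValue(tags, labels, "k8s.io/node")
	fmt.Println(v) // prints "node-b"

	// Inbound tags stripped (nil map): the pod label supplies the value.
	v, _ = resolveAffinityValue(nil, labels, "k8s.io/node")
	fmt.Println(v) // prints "node-a"
}
```

Reading from a nil map is safe in Go, so the stripped-tags case needs no special handling in this sketch.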

Supporting documentation

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…r pod label support

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
… in lb metadata

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…to labels in affinity matching

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…o labels for AffinityTags

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…urer to NewEndpoints

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
@github-actions
Contributor

Reviewer Checklist

🔍 Each of these sections needs to be checked by the reviewer of the PR 🔍:
If something doesn't apply, please check the box and add a justification if the reason is non-obvious.

  • Is the PR title satisfactory? Is this part of a larger feature and should be grouped using > Changelog?
  • PR description is clear and complete. It links to the relevant issue as well as docs and UI issues
  • This will not break child repos: it doesn't hardcode values (e.g. "kumahq" as an image registry)
  • IPv6 is taken into account (e.g. no string concatenation of host and port)
  • Tests (Unit test, E2E tests, manual test on universal and k8s)
    • Don't forget ci/ labels to run additional/fewer tests
  • Does this contain a change that needs to be notified to users? In this case, UPGRADE.md should be updated.
  • Does it need to be backported according to the backporting policy? (this GH action will add "backport" label based on these file globs, if you want to prevent it from adding the "backport" label use no-backport-autolabel label)

Contributor

Copilot AI left a comment


Pull request overview

This PR fixes MeshLoadBalancingStrategy.LocalityAwareness.LocalZone.AffinityTags when KUMA_EXPERIMENTAL_INBOUND_TAGS_DISABLED=true by propagating pod/workload labels through the endpoint pipeline and using them as a fallback signal when inbound tags are missing.

Changes:

  • Extend core_xds.Endpoint to carry Labels, populate them from resource metadata, and serialize them into Envoy endpoint filter metadata.
  • Update locality-aware logic to extract labels from endpoint metadata and fall back to labels when inbound tags are absent.
  • Thread local pod labels through the MLBS plugin’s CLA configurers so local locality-group resolution can fall back to labels.
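
The serialization step in the first bullet can be pictured with a simplified model. In the sketch below, `filterMetadata` is a plain nested map standing in for Envoy's endpoint filter metadata, and the two lowercase functions are hypothetical analogues of the helpers named in the PR description, not their actual implementations. Only the two metadata keys (`envoy.lb` and `io.kuma.labels`) are taken from the PR text.

```go
package main

import "fmt"

const (
	lbTagsKey   = "envoy.lb"       // namespace the PR description names for inbound tags
	lbLabelsKey = "io.kuma.labels" // namespace the PR description names for pod labels
)

// filterMetadata is a simplified stand-in for Envoy filter metadata:
// namespace -> key -> value. The real type is a protobuf message.
type filterMetadata map[string]map[string]string

// endpointMetadataWithLabels (hypothetical) stores tags and labels under
// separate namespaces and omits a namespace entirely when it would be empty.
func endpointMetadataWithLabels(tags, labels map[string]string) filterMetadata {
	md := filterMetadata{}
	if len(tags) > 0 {
		md[lbTagsKey] = tags
	}
	if len(labels) > 0 {
		md[lbLabelsKey] = labels
	}
	return md
}

// extractLbLabels (hypothetical) returns nil, without allocating, when the
// metadata is nil or carries no labels namespace.
func extractLbLabels(md filterMetadata) map[string]string {
	return md[lbLabelsKey]
}

func main() {
	// Inbound tags disabled: only the labels namespace is written.
	md := endpointMetadataWithLabels(nil, map[string]string{"k8s.io/node": "node-a"})
	fmt.Println(extractLbLabels(md)["k8s.io/node"]) // prints "node-a"
	fmt.Println(extractLbLabels(nil) == nil)        // prints "true"
}
```

Keeping labels in their own namespace means consumers that only read `envoy.lb` see the same metadata as before.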

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/core/xds/types.go | Adds Endpoint.Labels field to carry workload labels through xDS routing structures. |
| pkg/xds/topology/outbound.go | Populates Endpoint.Labels from dataplane/DPP resource metadata labels when building outbound endpoints. |
| pkg/xds/envoy/metadata/v3/metadata.go | Introduces io.kuma.labels metadata key, plus helpers to write/read labels in endpoint filter metadata. |
| pkg/xds/envoy/endpoints/v3/endpoints.go | Switches endpoint metadata generation to include labels (via EndpointMetadataWithLabels). |
| pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/locality_aware.go | Extracts labels from endpoint metadata and uses them as a fallback for local-zone affinity matching. |
| pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/priority.go | Adds local pod-label fallback when deriving locality groups from affinity tag keys. |
| pkg/plugins/policies/meshloadbalancingstrategy/plugin/v1alpha1/plugin.go | Threads podLabels through configurer call sites (DPP/gateway; nil for egress). |

@mail2sudheerobbu-oss
Author

Hey @Automaat @bartsmykla — could you take a look at this PR when you get a chance? The branch is up to date with master and ready for review. Thanks! 🙏

@mail2sudheerobbu-oss
Author

Thanks @Automaat for the thorough review! All five points are well-taken. Here's how I'll address each:

1. Filter pod labels to AffinityTag-referenced keys only — Agree, serializing all pod labels is wasteful. I'll thread the active AffinityTag keys down into EndpointMetadataWithLabels and filter labels to only those keys before encoding into xDS metadata. This bounds the metadata growth to O(endpoints × affinityTagCount) instead of O(endpoints × allLabels).

2. Add debug logging on label fallback — Will add a log.Debug (or equivalent Kuma logger) at the point where inboundTags.Values(key) is empty and we fall back to podLabels[key], so operators can observe when and for which endpoints the fallback is active.

3. Fix ExtractLbLabels to return nil when key absent — Will change the early return to return nil instead of return tags.Tags{}, matching ExtractLbTags behavior and eliminating the allocation on the hot path.

4. Consolidate fallback logic into resolveAffinityValue — Will extract a shared helper func resolveAffinityValue(inboundTags tags.Tags, podLabels map[string]string, key string) string and replace both the priority.go and locality_aware.go call sites with it.

5. Golden-file integration test for the full pipeline — Will add a golden-file test that wires InboundTagsDisabled=true through the full chain: endpoint with labels → xDS metadata serialization → MLBS locality grouping. This validates the entire contract in one place.

I'll push a follow-up commit addressing items 1–4 (code changes) and then item 5 (golden test) separately for easier review.
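
For item 1, the filtering step might look roughly like the sketch below. `filterLabels` is a hypothetical name used only for illustration (a later comment in this thread refers to an `affinityTagPodLabels` helper, whose signature is not shown here), and the affinity tag keys are assumed to arrive as a plain string slice.

```go
package main

import "fmt"

// filterLabels (hypothetical) keeps only the labels whose key is referenced
// by an affinity tag, so each endpoint carries at most len(affinityKeys)
// labels. It returns nil, without allocating, when nothing matches.
func filterLabels(labels map[string]string, affinityKeys []string) map[string]string {
	var out map[string]string
	for _, key := range affinityKeys {
		v, ok := labels[key]
		if !ok {
			continue
		}
		if out == nil {
			out = make(map[string]string, len(affinityKeys))
		}
		out[key] = v
	}
	return out
}

func main() {
	podLabels := map[string]string{
		"k8s.io/node":       "node-a",
		"app":               "backend",
		"pod-template-hash": "5d4f8c7b9",
	}
	filtered := filterLabels(podLabels, []string{"k8s.io/node"})
	fmt.Println(len(filtered), filtered["k8s.io/node"]) // prints "1 node-a"
}
```

Iterating over the affinity keys rather than over all labels keeps the cost per endpoint proportional to the number of configured affinity tags.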

… absent

Refactor ExtractLbLabels to return nil for nil metadata and structVal.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…tag/label fallback in priority.go

Refactor affinity value resolution to use a dedicated function.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…stent tag/label lookup

Refactor locality-aware endpoint configuration to use a helper function for resolving affinity values from tags and labels.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…in priority.go

Add logging for missing affinity tags fallback to pod labels.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…ates in locality_aware.go

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…priority.go

Add affinityTagPodLabels function to filter pod labels based on AffinityTags configuration.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
…n locality_aware.go

Add pod labels based on configuration for locality-aware balancing.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Added a new test case for locality-aware inbound tags disabled scenario, including multiple backend and payment resources with specific configurations and policies.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Add listeners for backend and payments services with appropriate configurations.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
Add comments to clarify label filtering process.

Signed-off-by: mail2sudheerobbu-oss <mail2sudheerobbu@gmail.com>
@mail2sudheerobbu-oss
Author

Hi @Automaat @bartsmykla — all 5 of your review comments have now been fully addressed:

  1. Label inflation (metadata.go): Filtered endpoint labels to only AffinityTag keys in NewEndpoints before calling ToLocalityLbEndpoints, reusing the existing affinityTagPodLabels helper.
  2. Debug logging (priority.go): Added log.V(1).Info(...) at both fallback sites in resolveAffinityValues so silent label-fallback is now observable.
  3. Golden-file test (priority_test.go): Added locality_aware_inbound_tags_disabled table-driven entry with three golden files covering the full pipeline with KUMA_EXPERIMENTAL_INBOUND_TAGS_DISABLED=true.
  4. ExtractLbLabels nil return (metadata.go): Refactored to return nil (consistent with ExtractLbTags) instead of allocating an empty tags.Tags{} on the hot path.
  5. Duplicated fallback logic (priority.go): Extracted resolveAffinityValues shared helper — single canonical "prefer tag, fall back to label" implementation used in both priority.go and locality_aware.go.

The branch is up to date with master, all inline replies posted, and CI is green. Would love an approval or any further feedback when you get a chance! 🙏


Development

Successfully merging this pull request may close these issues.

MeshLoadBalancingStrategy migrate AffinityTags to use labels instead of inbound tags

3 participants