Weight-class configuration for LeastWeighted target allocation strategy#4704
Weight-class configuration for LeastWeighted target allocation strategy#4704suprjinx wants to merge 24 commits intoopen-telemetry:mainfrom
Conversation
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
| tItem.JobName, | ||
| tItem.TargetURL, | ||
| tItem.Labels, | ||
| builder.Labels(), |
There was a problem hiding this comment.
this is a significant change -- the filtered labels were not persisted to the item previously
There was a problem hiding this comment.
I understand the goal of this PR, but this change could indeed be decently dangerous. I have to spend some time thinking through the consequences of this. cc @swiatekm
There was a problem hiding this comment.
There's no way we can do this, it's a major breaking change. My original thought was that we were going to use a Pod annotation, because we get those straight from K8s service discovery, and don't need relabeling.
There was a problem hiding this comment.
There's no way we can do this, it's a major breaking change. My original thought was that we were going to use a Pod annotation, because we get those straight from K8s service discovery, and don't need relabeling.
Thanks @swiatekm @jaronoff97 -- I will remove the relabeling. I can use pod annotation, but i think it will also require some new config option for TA (in order to supply a weight config for Pods we don't control)
There was a problem hiding this comment.
Ready for another look, with TA CR changes for WeightOverrides
E2E Test Results 34 files ±0 247 suites ±0 3h 0m 45s ⏱️ + 43m 44s For more details on these failures, see this check. Results for commit d13f23b. ± Comparison against base commit ebb6ab9. ♻️ This comment has been updated with latest results. |
defaults. Also now uses "__target_allocation_weight". Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
…uprjinx/opentelemetry-operator into target-allocation-weight-classes
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
|
The changes in this PR look broadly correct, but do more than I'm comfortable with without further discussion about the API and how this feature should work in the long term. Originally, the problem we wanted solved was that jobs of a particular class should be evenly spread amongst collectors. The simple solution I envisioned for this was to set the class as a Pod annotation or label and let the strategy use that information, the same way it currently does with the job name. Here, we're now adding CR-level overrides and tracking weight as if it's a meaningful number. I'm not completely against doing that, but it requires a more careful design, especially if we're going to make it configurable at the CR level. @suprjinx do you think we can cut this down to a more basic version while still keeping it useful for you? |
@swiatekm so you'd suggest consuming an annotation, configured in podSpec, but no ability to configure weight for |
If what you're actually trying to do is balance based on a statically assigned numeric load, so whichever collector gets kube-state-metrics doesn't get a lot of other targets, then I'm not sure we want to do that. It's a fairly specific use case and something of an anti-pattern for Prometheus in general - you should try to avoid having massive targets. I don't think an unreasonable suggestion would be to either run a separate collector instance just for kube-state-metrics or shard it, either as a StatefulSet or Daemonset. Is that not an option for you? |
just expect a weight int in Pod annotation Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
@swiatekm I removed all the API changes so this is now just an optional annotation. Also, now expecting a numeric weight (1-100) rather than a class name. |
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
…enly Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
… "weight" Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
…uprjinx/opentelemetry-operator into target-allocation-weight-classes
…uprjinx/opentelemetry-operator into target-allocation-weight-classes
Signed-off-by: Geoff Wilson <geoff@gr-oss.io>
Description:
This PR implements weight-based target allocation for the
least-weightedstrategy, so that heavier targets are spread across collectors rather than piling up on one. In the absence of weight information, the strategy falls back to existingNumTargetsbalancing.How to assign weights to targets
Add the
opentelemetry.io/target-allocation-weightannotation to the target pod with an integer value from 1 to 100:For pods you don't control (e.g. kube-state-metrics, nginx-ingress), you can patch the deployment's pod template:
kubectl patch deployment nginx-ingress -p '{"spec":{"template":{"metadata":{"annotations":{"opentelemetry.io/target-allocation-weight":"10"}}}}}'Kubernetes SD automatically surfaces this as the meta label
__meta_kubernetes_pod_annotation_opentelemetry_io_target_allocation_weight, which the allocator reads directly.Weight resolution
Values outside the 1-100 range, non-numeric, or missing annotations all fall back to the default weight of 1.
How it works
The
least-weightedstrategy tracksWeightedLoad(sum of weights) per collector instead of just target count. When assigning a new target, it picks the collector with the lowestWeightedLoad. This ensures heavy targets get spread across collectors.Link to tracking Issue(s):
Testing:
Unit tests covering: weighted load balancing, unlabeled targets defaulting to weight 1, invalid annotations (non-numeric, zero, negative, >100), and balanced distribution of heavy targets.
Documentation:
Needed
PR Assisted by Claude Code