
Update cpu resource metrics to handle resize #3559

Open
jmmcorreia wants to merge 9 commits into open-telemetry:main from jmmcorreia:k8s_container_cpu_metrics

Conversation

@jmmcorreia

Related to #3558

Changes

Modifies the k8s container limit and request metrics to align with in-place resize of CPU resources, as documented in https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/

It fixes the CPU part of issue #3558. Memory metrics can be fixed in a follow-up PR based on the comments provided for the CPU section.

For more details check issue #3558.

Merge requirement checklist

  • CONTRIBUTING.md guidelines followed.
  • Change log entry added, according to the guidelines in When to add a changelog entry.
    • If your PR does not need a change log, start the PR title with [chore]
  • Links to the prototypes or existing instrumentations (when adding or changing conventions)

@ChrsMark ChrsMark moved this to In Review in K8s SemConv SIG Mar 19, 2026
@ChrsMark ChrsMark moved this from Untriaged to Awaiting codeowners approval in Semantic Conventions Triage Mar 19, 2026
Member

@ChrsMark ChrsMark left a comment


Thanks for digging into this @jmmcorreia! I left some comments, but overall I think it goes in the right direction!

@open-telemetry/semconv-k8s-approvers PTAL

- id: metric.k8s.container.cpu.limit.actual
type: metric
metric_name: k8s.container.cpu.request
metric_name: k8s.container.cpu.limit.actual
Member


No strong preference here but maybe an alternative could be k8s.container.cpu.limit.current. @open-telemetry/semconv-k8s-approvers @dashpole thoughts?

Author


Just for reference, I did not really have a strong preference, so I decided to align the naming scheme with the k8s docs, i.e. the Key Concepts section under https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/

Contributor


I don't see current or actual used much in other conventions, so I think we are fine to stick with what is closest to the k8s docs.

Contributor


Thinking about this more, I think I do prefer current. desired and actual make sense when comparing the two (e.g. writing an alert if they differ from each other for a long time), but don't make as much sense when viewing actual in isolation.

E.g. if k8s allowed updating the container image in-place, it would look odd to me to see k8s.container.image.actual, but k8s.container.image.current makes sense.

If we did decide to make the desired request/limit opt-in, I think this would be especially important.

Author


I think your example makes sense. I also took some time to check what existing metrics are doing, and current seems to be the standard. For example, k8s.hpa.pod.current and k8s.hpa.pod.desired are already defined and follow the proposal here. Finally, as per the k8s definitions, actual is what is currently configured.

I think you make a valid point. Will make the change from actual to current.

- k8s.container
note: |
The value range is [0.0,1.0]. A value of 1.0 means the container is using 100% of its actual CPU request.
If the CPU request is not set, this metric SHOULD NOT be emitted for that container.
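The emission rule in that note can be sketched as a small helper (a minimal, hypothetical illustration; the function name and signature are not part of the convention):

```python
def request_utilization(usage_cores, request_cores):
    """Ratio of CPU usage to the configured CPU request.

    Returns None when the request is unset, mirroring the note that the
    metric SHOULD NOT be emitted for containers without a CPU request.
    """
    if not request_cores:  # request unset (None or 0) -> do not emit
        return None
    return usage_cores / request_cores
```

Here a return value of None stands in for "do not emit the data point at all", which is different from emitting 0.0.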
Member


A question for my own understanding: if the request is not set at the container level by the user but is set at the Pod level, will k8s automatically apply it down to all containers, or otherwise how are container-level resources affected by Pod-level defined resources?

Author


For a detailed view, you can check the tables under https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2837-pod-level-resource-spec/README.md#comprehensive-tabular-view as those cover all the possible cases.

So if your question is whether, by defining pod limits/requests, k8s will also set the container limits/requests in the pod spec, the answer is no. In that case, the container values will remain empty. I also checked the API response to see if container actual/current limits would be returned, but that does not seem to be the case.

Pod resource limits can be set as the container cgroup max value when no container limit is set, mostly to ensure runtimes such as the JVM can read that value and adapt accordingly.

For all other cases, if a value is missing, the container cgroup value will use the default available. My assumption is that containers are basically allowed to use the pod's available resources freely until they are exhausted; once that happens, enforcement kicks in at the pod cgroup, which then impacts the child cgroups.
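The fallback behavior described above (pod-level limit becoming the container cgroup max when no container-level limit is set) can be sketched as follows. This is a minimal illustration with hypothetical dict shapes loosely mirroring the Pod spec, not an official implementation:

```python
def effective_container_cpu_limit(pod_spec, container_name):
    """Resolve the CPU limit a container's cgroup would see.

    Prefers the container-level limit; falls back to the pod-level limit
    when the container limit is unset; returns None (unbounded) otherwise.
    Dict shapes are illustrative only.
    """
    for container in pod_spec.get("containers", []):
        if container.get("name") != container_name:
            continue
        limit = container.get("resources", {}).get("limits", {}).get("cpu")
        if limit is not None:
            return limit
        # No container limit: the pod-level limit (if any) becomes the cgroup max.
        return pod_spec.get("resources", {}).get("limits", {}).get("cpu")
    return None
```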

jmmcorreia and others added 2 commits March 19, 2026 17:35
Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
@dashpole
Contributor

dashpole commented Mar 24, 2026

I have a few hesitations with this:

  • As a user, it isn't nearly as obvious what I should be comparing my CPU and memory usage to. The desired or the actual requests/limits? (answer: actual)
  • Desired requests/limits are largely going to be a niche metric to diagnose in-place update issues. We could even consider making it opt-in, IMO.

Did we consider something more like k8s.container.cpu.limit vs k8s.container.cpu.desired_limit?

@jmmcorreia
Author

I have a few hesitations with this:

  • As a user, it isn't nearly as obvious what I should be comparing my CPU and memory usage to. The desired or the actual requests/limits? (answer: actual)

Agreed. The utilization metrics do attempt to make that easier on users by doing the comparison on the collector side and presenting the result as a metric.

  • Desired requests/limits are largely going to be a niche metric to diagnose in-place update issues. We could even consider making it opt-in, IMO.

That is a valid point; I think it makes sense. A lot of users probably will not attempt to patch container resources while they are running.

Did we consider something more like k8s.container.cpu.limit vs k8s.container.cpu.desired_limit?

Before raising the PR I was stuck deciding between the implemented approach (i.e. limit.actual and limit.desired) and going with what you proposed. After a short discussion with @ChrsMark, limit.actual and limit.desired seemed to fit the semantic conventions approach of creating a namespace for values that fall under a similar category (i.e. going with . instead of _).

What I like about your proposal is that it makes things somewhat simpler for users: we give a clear definition of limit, and then only the desired limit (which we are making opt-in anyway) uses the _ separator in the name. However, the current approach was picked because it groups the values a little better, as it also includes utilization under the limit namespace. Most likely there will be no need to add more values related to the limit, but if there were, the current approach also makes that more flexible.

IMO both could work. I slightly prefer the current approach given the better grouping of the values, and I believe it should help avoid changing the metrics in the future in case new ones have to be added (even if that is not very likely).

@dashpole happy to hear your thoughts on this. @ChrsMark if you have any other comments or anything to add, that could be helpful.

@dashpole
Contributor

The point about utilization is a good one. I think maybe switching to current from actual will mostly address my concerns. I reopened the comment thread above.

| `k8s.container.ready` (type: `gauge`) | `k8s.container.ready` (type: `updowncounter`) |
| `k8s.container.cpu_limit_utilization` (type: `gauge`) | `k8s.container.cpu.limit_utilization` (type: `gauge`) |
| `k8s.container.cpu_request_utilization` (type: `gauge`) | `k8s.container.cpu.request_utilization` (type: `gauge`) |
| `k8s.container.cpu_limit` (type: `gauge`) | `k8s.container.cpu.limit.desired`, `k8s.container.cpu.limit.current` (type: `updowncounter`) |
Contributor


Another option to consider: we keep the existing k8s.container.cpu.limit (it maps to spec.containers[*].resources) and introduce an additive metric like k8s.container.cpu.effective_limit sourced from status.containerStatuses[*].resources. When resize isn't in use or the cluster doesn't support it, implementors can simply not emit the effective metric - there is no migration burden for existing users, and the new metric is only relevant when there's actually a difference to observe.

Author


@dashpole had a very similar suggestion earlier in the thread and I also had discussed this with @ChrsMark.

Going with the current approach allows us to neatly group the desired, actual, and utilization metrics under limit and request. Moreover, it seems more aligned with the more recent semantic conventions, where _ is less used.

Also, IMO, having limit and request as namespaces could make it easier to add more values in the future, if needed, without impacting other metrics (unlike what is happening now).

Finally, I have marked this item as an enhancement instead of a breaking change, mostly because none of these metrics seems to be used anywhere, at least not in the latest format. For example, the k8scluster receiver still uses k8s.container.cpu_limit and the kubeletstats receiver uses k8s.container.cpu_limit_utilization, for instance. Hence, if there is a moment to make a more drastic but potentially future-proof change, I would say now is the time, since these have not been adopted yet.

Personally I prefer to keep the metrics in the currently proposed format, but happy to hear your thoughts on this take. The same thing will have to be done for container memory and then pod CPU and memory, so it would be nice if we can close on a design we all agree on.

Contributor

@jinja2 jinja2 Apr 2, 2026


Thanks for the additional context, the desired/current split works for me.

One last concern - I believe the new status field used for the current metric is gated on InPlacePodVerticalScaling and only populated when the feature gate is enabled (beta/on-by-default in 1.33, GA in 1.35). On older clusters the "recommended" current metric can't be emitted while the "opt-in" desired can. I guess the implementation can fall back to spec, but it might be worth adding a note about this and the K8s version requirement in the metric description. I don't think we've had a discussion about how to handle K8s version requirements in semconv before.
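That spec fallback could look roughly like this. A hypothetical sketch over dicts shaped like the k8s API objects; the field names follow the Pod spec/status, everything else is illustrative:

```python
def current_cpu_limit(pod, container_name):
    """Return the current CPU limit for a container.

    Prefers status.containerStatuses[*].resources (only populated when
    InPlacePodVerticalScaling is enabled) and falls back to the value in
    spec.containers[*].resources on older clusters.
    """
    for cs in pod.get("status", {}).get("containerStatuses", []):
        if cs.get("name") == container_name:
            cpu = cs.get("resources", {}).get("limits", {}).get("cpu")
            if cpu is not None:
                return cpu  # actual/current value reported by the kubelet
    for c in pod.get("spec", {}).get("containers", []):
        if c.get("name") == container_name:
            # Fallback: the desired value from the spec.
            return c.get("resources", {}).get("limits", {}).get("cpu")
    return None
```

On clusters without the feature gate the two values coincide anyway, so the fallback only loses information while a resize is actually in flight.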

Also, just an observation, K8s 1.33+ exposes PodResizePending and PodResizeInProgress as pod conditions. Implementing this open issue for a k8s.pod.condition.status metric would round off the resize observability usecase nicely, letting users correlate desired/current divergence with the actual resize state.

Author


That is a great point. I aligned this proposal with the oldest active version, which is 1.33 (see eol). If that is deemed enough (i.e. only supporting currently active versions), then no special behavior needs to be accounted for at the container level (but it might have to be for pods, since this feature is more recent there).

We could also document that, for older k8s releases, the only workaround is to enable the "opt-in" desired metric. It puts more burden on the customer side, but since they would be running a non-active k8s release, I wonder if that could be an acceptable trade-off.

Member

@ChrsMark ChrsMark left a comment


Thanks @jmmcorreia!

@lmolkova lmolkova moved this from Awaiting codeowners approval to Needs More Approval in Semantic Conventions Triage Apr 6, 2026

Labels

area:k8s enhancement New feature or request
