Feature/add slo metrics to flowcontrol #2685

Open
loicmarchal wants to merge 9 commits into kubernetes-sigs:main from loicmarchal:feature/add-slo-metrics-to-flowcontrol

@loicmarchal
Contributor

@loicmarchal loicmarchal commented Mar 24, 2026

What type of PR is this?
/kind feature

What this PR does / why we need it:
Adds the following metrics and labels:

  • request duration per TTFT SLO class
  • the TTFT SLO class as a label on the queue size and queue bytes metrics
  • a counter of incoming requests per SLO class

The TTFT SLO is passed with the request in the headers (x-slo-ttft-ms).

These metrics will allow tracking the percentage of requests with a duration below or above their associated SLO class as well as the number of incoming and queued requests per SLO class.
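As a rough illustration of the attainment calculation described above, the sketch below computes the fraction of requests at or below an SLO from cumulative histogram buckets. The `bucket` type, the `attainment` function, and the sample counts are hypothetical and are not taken from this PR. The sketch assumes the SLO value coincides with a histogram bucket boundary, which is the case where the answer is exact.

```go
package main

import "fmt"

// bucket mirrors one cumulative histogram bucket: the number of
// observations with a duration at or below UpperBoundMs.
// Hypothetical type, for illustration only.
type bucket struct {
	UpperBoundMs float64
	Cumulative   uint64
}

// attainment returns the fraction of requests whose duration was at or
// below sloMs. It reports false when there are no requests or when sloMs
// does not match any bucket boundary, because the histogram cannot give
// an exact answer in that case.
func attainment(buckets []bucket, total uint64, sloMs float64) (float64, bool) {
	if total == 0 {
		return 0, false
	}
	for _, b := range buckets {
		if b.UpperBoundMs == sloMs {
			return float64(b.Cumulative) / float64(total), true
		}
	}
	return 0, false
}

func main() {
	// Made-up counts for one SLO class.
	buckets := []bucket{{200, 40}, {400, 90}, {600, 100}}
	if frac, ok := attainment(buckets, 100, 400); ok {
		fmt.Printf("attainment at 400 ms: %.2f\n", frac)
	}
}
```

One consequence worth noting: attainment is only exact when the duration histogram has a bucket boundary at each SLO value of interest.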

The SLO classes might need smaller intervals (the current interval is 200 ms), but that would increase the cardinality of the Prometheus histogram.
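To make the cardinality trade-off concrete, here is a minimal sketch of mapping the `x-slo-ttft-ms` header value to a bounded class label. It is not the PR's `ClassifySLO` implementation, which is not shown here. The function name `classifyTTFTSLO`, the label format, and the 2000 ms cap are assumptions; only the 200 ms interval comes from the description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Assumed values for illustration; the PR's actual constants are not shown.
const (
	sloIntervalMs = 200  // bucket width mentioned in the PR description
	sloMaxMs      = 2000 // hypothetical cap that keeps the label set bounded
)

// classifyTTFTSLO maps the raw header value to a bounded label.
// Missing, non-numeric, or non-positive values map to "none";
// values above sloMaxMs share a single overflow label.
func classifyTTFTSLO(raw string) string {
	ms, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil || ms <= 0 {
		return "none"
	}
	if ms > sloMaxMs {
		return fmt.Sprintf("gt_%dms", sloMaxMs)
	}
	// Round up to the next multiple of the interval.
	upper := ((ms + sloIntervalMs - 1) / sloIntervalMs) * sloIntervalMs
	return fmt.Sprintf("le_%dms", upper)
}

func main() {
	for _, v := range []string{"350", "200", "5000", "", "abc"} {
		fmt.Printf("%q -> %s\n", v, classifyTTFTSLO(v))
	}
}
```

With these assumed values the label can take at most 12 distinct values (ten interval classes, one overflow class, and "none"). Halving the interval would roughly double that number for every metric carrying the label.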

Which issue(s) this PR fixes:
Fixes #2532

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot added the do-not-merge/work-in-progress and kind/feature labels Mar 24, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: loicmarchal
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the needs-ok-to-test label Mar 24, 2026
@k8s-ci-robot
Contributor

Hi @loicmarchal. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the size/L and cncf-cla: yes labels Mar 24, 2026
@netlify

netlify bot commented Mar 24, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit bf548d6
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69cd9a2178803d0008ac1b4a
😎 Deploy Preview https://deploy-preview-2685--gateway-api-inference-extension.netlify.app

@loicmarchal loicmarchal marked this pull request as ready for review March 26, 2026 03:50
@k8s-ci-robot removed the do-not-merge/work-in-progress label Mar 26, 2026
}

// SLO class constants label for flow control SLO metrics (bounded buckets for the TTFT SLO header in ms).
const (
Collaborator

I don't know if we can/should hardcode these buckets. TTFT depends heavily on what the prompt looks like, and a customer could want higher-range buckets.

flowKey := req.FlowKey()
priority := strconv.Itoa(flowKey.Priority)
reqBytes := req.ByteSize()
sloClass := metrics.ClassifySLO(extractHeader(req, fwkrequest.TTFTSLOMsHeaderKey))
Collaborator

I'm feeling hesitant about adding this use-case specific bucketing/labeling (see other comment).

We want to let the user define the buckets and associate them that way. A more robust solution would be adding this to the InferenceObjective object and making the InferenceObjective string the label.

Which would probably be worth a longer discussion.
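One way to picture the user-defined buckets idea is a classifier built from configurable bounds instead of hardcoded constants. The sketch below is hypothetical: `sloClassifier`, its methods, the label format, and the sample bounds are invented for illustration. It does not model the InferenceObjective API or show how such bounds would actually be configured.

```go
package main

import (
	"fmt"
	"sort"
)

// sloClassifier holds user-supplied upper bounds in milliseconds.
// Hypothetical type, for illustration only.
type sloClassifier struct {
	boundsMs []int // sorted ascending
}

// newSLOClassifier copies and sorts the bounds so the caller's slice is untouched.
func newSLOClassifier(boundsMs []int) *sloClassifier {
	sorted := append([]int(nil), boundsMs...)
	sort.Ints(sorted)
	return &sloClassifier{boundsMs: sorted}
}

// classify returns the label of the smallest bound that is >= ms,
// "overflow" when ms exceeds every bound, and "none" for non-positive input.
func (c *sloClassifier) classify(ms int) string {
	if ms <= 0 {
		return "none"
	}
	i := sort.SearchInts(c.boundsMs, ms) // first index with bound >= ms
	if i == len(c.boundsMs) {
		return "overflow"
	}
	return fmt.Sprintf("le_%dms", c.boundsMs[i])
}

func main() {
	// Example bounds a user with long prompts might choose.
	c := newSLOClassifier([]int{10000, 500, 2000})
	for _, ms := range []int{300, 1500, 20000} {
		fmt.Printf("%d ms -> %s\n", ms, c.classify(ms))
	}
}
```

The number of label values is then set by the configuration rather than by the code, so the cardinality cost becomes the operator's explicit choice.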

Contributor Author

Thank you @kfswain! I agree.

I was uncomfortable with the choice of interval (as mentioned in the PR description) because I felt it restricts the possibilities. That is a good indication it should be defined by the user.

It's definitely better to add it to the InferenceObjective (as SLO attainment is already a planned expansion). How do you think we should add it? One label per InferenceObjective, with as many InferenceObjectives as we need to represent all the buckets? Would it be correlated with the priority? Thank you!

@k8s-ci-robot added the needs-rebase label Apr 1, 2026
@k8s-ci-robot removed the needs-rebase label Apr 1, 2026
@k8s-ci-robot added the needs-rebase label Apr 3, 2026
@k8s-ci-robot
Contributor

PR needs rebase.


Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Flow Control] Add SLO values to flow control metrics

3 participants