feat(datalayer): add metricsPort parameter to metrics-data-source plugin #2762
janghyukjin wants to merge 1 commit into kubernetes-sigs:main
Conversation
Adds `MetricsPort` to `metricsDatasourceParams` so the `metrics-data-source` plugin can scrape a port different from the inference port. When `metricsPort` is non-zero in `EndpointPickerConfig` plugin parameters, it overrides the port derived from the endpoint's `MetricsHost`; zero (default) preserves existing behavior. This restores functionality that was available via `--model-server-metrics-port` but lost when that flag was removed in kubernetes-sigs#2441 without an equivalent plugin config parameter.
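As a rough sketch, the plugin parameter described above might be set in the EPP configuration like this. Everything other than the `metricsPort` parameter itself (the `apiVersion`, plugin `type` string, and surrounding field names) is an illustrative assumption based on the PR description, not confirmed against the repository:

```yaml
# Hypothetical EndpointPickerConfig fragment; field names other than
# metricsPort are illustrative assumptions.
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: EndpointPickerConfig
plugins:
- type: metrics-data-source
  parameters:
    metricsPort: 9090   # scrape metrics on :9090; omit or set 0 to keep the port from MetricsHost
```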
@janghyukjin thanks for the PR. The metrics port was intentionally dropped from the CLI and configuration. See #1886.
@elevran Thanks for the context! There are practical scenarios where separating the inference and metrics ports is necessary. A common case is model servers that expose metrics on a dedicated port, where metrics are intentionally decoupled from inference. In Istio mTLS STRICT mode, EPP's direct pod-IP scraping bypasses Service-level configuration and results in passthrough traffic without the mTLS upgrade, leading to failed scrapes. Using a separate metrics port allows operators to control interception or policy independently without affecting inference.

This PR does not change default behavior; it only restores the ability to configure a separate metrics port, which was previously supported via `--model-server-metrics-port` but is currently not expressible via `EndpointPickerConfig`. I kept the configuration scoped to the `metrics-data-source` plugin rather than `InferencePool.spec`, as this is an EPP scraping concern rather than a pool-level attribute. Open to feedback if the preferred direction differs.
The main concern I have is how this would work for a multi-port inference pool. I presume the use case and reasoning would cover those as well. How should that be handled?
@elevran Thanks for raising this. My assumption in this PR is that the metrics endpoint is pod-level rather than serving-port-level. Even with a multi-port InferencePool, the common case is that a backend pod exposes a single metrics endpoint.

If `metricsPort` is not set, the existing behavior remains unchanged and the port derived from `MetricsHost` is used as before. If there are use cases where metrics need to be differentiated per serving port, that would likely be a separate discussion at a different scope, potentially in `InferencePool.spec`. I was aiming to keep this PR scoped to the EPP scraping configuration, where a single process-level metrics port is the common case. Thanks!
I don't think this is true of vLLM with data parallel > 1. I understand (and agree with) the use cases, but I'd request that we address this across the board and not as a special case.
@elevran Thanks for the detailed explanation; that makes sense. :) You're right that EPP already treats each serving port as a distinct endpoint and makes scheduling decisions based on per-port metrics. In that context, a single metrics port at the pod level is not always sufficient, especially for vLLM with data parallelism.

The use case I had in mind (Istio mTLS STRICT, where EPP's pod-IP scraping bypasses mTLS) still exists, but I agree it's better addressed as part of the InferencePool port group design rather than a plugin-level workaround. Happy to contribute toward that direction if useful, whether that's drafting a design proposal or working on an incremental PR. Thanks again!
@janghyukjin perhaps resurrect #1396? There was no interest at the time, so it was closed due to lifecycle. Maybe your use case will revive the issue/discussion?
@elevran That makes sense, thanks for the suggestion! Happy to revive #1396 and continue the discussion there. I can summarize the use case (separate metrics port, e.g. for environments like Istio mTLS STRICT) and how it relates to the current gaps in InferencePool port role specification. I'll go ahead and add context to #1396 to restart the discussion, unless you'd prefer a different approach. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds `metricsPort` to the `metrics-data-source` plugin parameters so that EPP can scrape metrics from a port different from the inference port.

The `metrics-data-source` plugin currently derives the scrape target from `MetricsHost`, which encodes the inference port. This breaks in deployments where the model server exposes metrics on a dedicated port (separate from inference), or where network/mesh policy makes the inference port unsuitable for direct scraping. `--model-server-metrics-port` previously covered this use case but was removed in #2441 without an equivalent `EndpointPickerConfig` parameter, leaving a gap. This PR fills that gap.

When `metricsPort` is `0` (default), behavior is unchanged — the port from `MetricsHost` is used as-is. Invalid `metricsPort` values (outside 1–65535) are rejected at construction time.
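The override and validation rules described above can be sketched in Go. This is a minimal illustration of the stated behavior, not the actual plugin code; the function and parameter names (`resolveScrapeTarget`, `metricsHost`, `metricsPort`) are hypothetical:

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// resolveScrapeTarget sketches the described behavior: a zero metricsPort
// keeps the endpoint's metrics host:port unchanged, a port in 1-65535
// overrides the port portion, and anything else is rejected up front.
// Names are illustrative, not taken from the repository.
func resolveScrapeTarget(metricsHost string, metricsPort int) (string, error) {
	if metricsPort == 0 {
		// Default: preserve existing behavior, use the port from MetricsHost.
		return metricsHost, nil
	}
	if metricsPort < 1 || metricsPort > 65535 {
		return "", fmt.Errorf("invalid metricsPort %d: must be in 1-65535", metricsPort)
	}
	host, _, err := net.SplitHostPort(metricsHost)
	if err != nil {
		return "", err
	}
	return net.JoinHostPort(host, strconv.Itoa(metricsPort)), nil
}

func main() {
	overridden, _ := resolveScrapeTarget("10.0.0.5:8000", 9090)
	fmt.Println(overridden) // 10.0.0.5:9090

	unchanged, _ := resolveScrapeTarget("10.0.0.5:8000", 0)
	fmt.Println(unchanged) // 10.0.0.5:8000
}
```

Rejecting out-of-range values at construction time (rather than at scrape time) matches the intent stated above: a misconfigured port fails fast when the plugin is built from `EndpointPickerConfig` instead of producing silent scrape failures later.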
Which issue(s) this PR fixes:
Related to #1396
Does this PR introduce a user-facing change?: