vpa/admission-controller: limit request payload size to 5MB by sophieliu15 · Pull Request #9690 · kubernetes/autoscaler

sophieliu15 · 2026-05-25T18:26:30Z

Add a defensive 5MB payload size cap on the HTTP request body in the VPA Admission Controller webhook.

This prevents arbitrary/maliciously large payload requests from exhausting memory resources and triggering Out-of-Memory (OOM) crashes (Denial of Service). The webhook continues to fail open on reading or unmarshaling errors to protect cluster scheduling availability.

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR adds a defensive 5MB payload size limit to the Vertical Pod Autoscaler (VPA) admission controller webhook server.

Prior to this change, the webhook handler used io.ReadAll(r.Body) to read the incoming HTTP request body into memory without bounds checking. A maliciously large payload (e.g. an endless stream of data) sent to the webhook endpoint could consume all memory resources and trigger an Out-of-Memory (OOM) crash. Since the admission controller is a critical component for scheduling, its crash could lead to a Denial of Service (DoS) blocking all new pod creation in the cluster if configured to fail closed.

By wrapping r.Body with io.LimitReader(r.Body, 5MB), we safely cap the maximum memory allocation during the read step. If the payload is truncated or fails parsing, the webhook gracefully respects its permissive "fail-open" contract to allow scheduling to proceed, while protecting its own server process from OOM.
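To make the bounded read concrete, here is a minimal, self-contained sketch of the io.LimitReader pattern described above. The helper name cappedLen and the tiny 16-byte limit are hypothetical and chosen only to keep the demo short; this is not the code in server.go. Note that an oversized body is truncated without any error, so the caller only finds out later, when parsing fails.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// cappedLen is a hypothetical helper for this sketch. It reads at most
// limit bytes of body and returns how many bytes were actually read.
// io.LimitReader reports a plain EOF at the cap, so io.ReadAll returns
// a nil error even when the input was cut short.
func cappedLen(body string, limit int64) (int, error) {
	data, err := io.ReadAll(io.LimitReader(strings.NewReader(body), limit))
	return len(data), err
}

func main() {
	const limit = 16 // stand-in for the 5MB cap; small so the demo is readable

	small, errSmall := cappedLen("tiny", limit)
	big, errBig := cappedLen(strings.Repeat("x", 1000), limit)

	fmt.Println(small, errSmall) // 4 <nil>
	fmt.Println(big, errBig)     // 16 <nil>
}
```

Memory use stays bounded by the limit no matter how much data the client streams, which is the property this change relies on.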

Special notes for your reviewer:

The 5MB limit was selected to comfortably accommodate valid massive Kubernetes update requests (which contain both the object and oldObject payloads, each up to etcd's default 1.5MB limit plus metadata overhead) while remaining well below standard container memory limits to eliminate OOM risk.
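The sizing argument above can be checked with a few constants. This is only a sketch: the name maxAdmissionPayloadSize comes from the diff, but its definition is not shown in this thread, so the value below (5 MiB rather than 5,000,000 bytes) is an assumption, as is treating etcd's 1.5MB default as 1.5 MiB. etcdObjectLimit and headroom are hypothetical names.

```go
package main

import "fmt"

const (
	mib = 1 << 20

	// etcdObjectLimit approximates etcd's default 1.5MB request limit
	// (assumed here to mean 1.5 MiB).
	etcdObjectLimit = 3 * mib / 2

	// maxAdmissionPayloadSize mirrors the constant name used in the PR;
	// the exact value is an assumption for this sketch.
	maxAdmissionPayloadSize = 5 * mib
)

// headroom returns the bytes left for AdmissionReview metadata after a
// worst-case update carrying both object and oldObject at the etcd limit.
func headroom() int {
	return maxAdmissionPayloadSize - 2*etcdObjectLimit
}

func main() {
	fmt.Println(2 * etcdObjectLimit) // 3145728
	fmt.Println(headroom())          // 2097152
}
```

Under these assumptions a worst-case update uses about 3 MiB, leaving roughly 2 MiB for the surrounding request envelope.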

Unit tests have been added to server_test.go to cover both normal and oversized payload scenarios.

Does this PR introduce a user-facing change?

NONE

k8s-ci-robot · 2026-05-25T18:26:32Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2026-05-25T18:26:38Z

This issue is currently awaiting triage.

If SIG Autoscaling contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.


k8s-ci-robot · 2026-05-25T18:26:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sophieliu15
Once this PR has been reviewed and has the lgtm label, please assign kwiesmueller for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

vertical-pod-autoscaler/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2026-05-25T18:26:40Z

Hi @sophieliu15. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


adrianmoisey

Heya,
Thanks for the PR.
I've got two small comments for you.

adrianmoisey · 2026-05-25T19:05:45Z

+		{
+			name:           "Oversized payload exceeding 5MB limit",
+			isEndless:      true,
+			expectedStatus: http.StatusOK, // Fails open, returning 200 OK with unmarshal error rather than crashing


Should we rather be expecting a 413 from the server here?

sophieliu15 · 2026-05-25 (edited)

Good question! I chose to let the read error "fail open" (proceeding with a 200 OK rather than a 413) for two reasons:

Preserving Webhook Behavior: This design represents a minimal change that strictly preserves the VPA webhook's existing, permissive "fail-open" philosophy for JSON parsing/unmarshaling errors (which defaults to Allowed: true). It guarantees the webhook is protected from OOM crashes without changing VPA's business logic or breaking backward compatibility.

Preventing Cluster Scheduling Outages Under DoS: Mutating webhooks are inline to all Pod creations/updates. If the webhook is under a heavy DoS flood and we return a 413 HTTP error:

The API Server treats non-200 responses as webhook infrastructure failures.

If the cluster operator has VPA configured with failurePolicy: Fail, the API Server will immediately block and reject all legitimate user Pod scheduling requests.

By using http.MaxBytesReader combined with 200 OK, we discard oversized payloads with minimal resource consumption and do not report this as infrastructure failure to the API Server, ensuring minimal impact on cluster-wide scheduling under DoS.

adrianmoisey · 2026-05-26

I'm not sure that's true. This changes the logic to respond with a 200 but without the expected payload; see https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#response
The API server will treat this as a failure, no matter the HTTP code.
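For context, the linked documentation requires a webhook's 200 response to carry an AdmissionReview whose response stanza includes at least uid (copied from the request) and allowed. The sketch below builds that minimal body. The struct types and the allowResponse helper are local, hypothetical stand-ins so the example is self-contained; real code would normally use the types from k8s.io/api/admission/v1.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// admissionResponse and admissionReview are minimal local stand-ins for
// the admission.k8s.io/v1 types, declared here only for illustration.
type admissionResponse struct {
	UID     string `json:"uid"`
	Allowed bool   `json:"allowed"`
}

type admissionReview struct {
	APIVersion string             `json:"apiVersion"`
	Kind       string             `json:"kind"`
	Response   *admissionResponse `json:"response,omitempty"`
}

// allowResponse builds the smallest "allowed" response body for a request
// with the given uid. The uid must be echoed back from the request.
func allowResponse(uid string) (string, error) {
	b, err := json.Marshal(admissionReview{
		APIVersion: "admission.k8s.io/v1",
		Kind:       "AdmissionReview",
		Response:   &admissionResponse{UID: uid, Allowed: true},
	})
	return string(b), err
}

func main() {
	body, err := allowResponse("abc")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
	// {"apiVersion":"admission.k8s.io/v1","kind":"AdmissionReview","response":{"uid":"abc","allowed":true}}
}
```

This illustrates the concern raised here: if the request body was truncated and cannot be parsed, the handler does not know the request uid to echo back.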

adrianmoisey · 2026-05-25T19:08:09Z

 	var body []byte
 	if r.Body != nil {
-		if data, err := io.ReadAll(r.Body); err == nil {
+		if data, err := io.ReadAll(io.LimitReader(r.Body, maxAdmissionPayloadSize)); err == nil {


Would it make sense to use http.MaxBytesReader instead? It seems to be purpose-built for HTTP.

sophieliu15 · 2026-05-25 (edited)

Great catch! I have updated the PR to use http.MaxBytesReader instead of io.LimitReader.
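The practical difference between the two readers can be sketched as follows. io.LimitReader stops quietly at the cap, while http.MaxBytesReader returns a typed *http.MaxBytesError (available since Go 1.19) once the body exceeds the limit. The helper names and the 8-byte limit are hypothetical, and httptest.NewRecorder stands in for the real ResponseWriter.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// readWithLimitReader caps the read with io.LimitReader. Hitting the cap
// looks like a normal EOF, so err is nil even when data was dropped.
func readWithLimitReader(body string, limit int64) (int, error) {
	data, err := io.ReadAll(io.LimitReader(strings.NewReader(body), limit))
	return len(data), err
}

// readWithMaxBytesReader caps the read with http.MaxBytesReader, which
// returns a non-nil error once the body exceeds the limit.
func readWithMaxBytesReader(body string, limit int64) (int, error) {
	w := httptest.NewRecorder() // stand-in ResponseWriter for this sketch
	rc := io.NopCloser(strings.NewReader(body))
	data, err := io.ReadAll(http.MaxBytesReader(w, rc, limit))
	return len(data), err
}

// isTooLarge reports whether err is the typed size-limit error.
func isTooLarge(err error) bool {
	var mbe *http.MaxBytesError
	return errors.As(err, &mbe)
}

func main() {
	const limit = 8 // tiny hypothetical limit for the demo
	body := strings.Repeat("x", 32)

	n1, err1 := readWithLimitReader(body, limit)
	n2, err2 := readWithMaxBytesReader(body, limit)

	fmt.Println(n1, err1 == nil)      // 8 true
	fmt.Println(n2, isTooLarge(err2)) // 8 true
}
```

Both readers bound memory use to the limit; only http.MaxBytesReader lets the handler tell an oversized body apart from a complete one.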


omerap12 · 2026-05-25T21:17:31Z

+		if data, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxAdmissionPayloadSize)); err == nil {
 			body = data
+		} else {
+			klog.ErrorS(err, "Failed to read admission request body (payload may exceed 5MB limit)")


How can we be sure that the error here is actually caused by the size limit?
Also, hardcoding "5MB" in the log message instead of referencing the constant means the message can become wrong if the limit changes.
I would expect something like this:

diff --git a/vertical-pod-autoscaler/pkg/admission-controller/logic/server.go b/vertical-pod-autoscaler/pkg/admission-controller/logic/server.go
index d4ee79e64..47c74d431 100644
--- a/vertical-pod-autoscaler/pkg/admission-controller/logic/server.go
+++ b/vertical-pod-autoscaler/pkg/admission-controller/logic/server.go
@@ -19,6 +19,7 @@ package logic
 import (
 	"context"
 	"encoding/json"
+	"errors"
 	"fmt"
 	"io"
 	"net/http"
@@ -200,7 +201,12 @@ func (s *AdmissionServer) Serve(w http.ResponseWriter, r *http.Request) {
 		if data, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxAdmissionPayloadSize)); err == nil {
 			body = data
 		} else {
-			klog.ErrorS(err, "Failed to read admission request body (payload may exceed 5MB limit)")
+			var maxBytesErr *http.MaxBytesError
+			if errors.As(err, &maxBytesErr) {
+				klog.ErrorS(err, "Admission request body exceeds size limit", "limit", maxAdmissionPayloadSize)
+			} else {
+				klog.ErrorS(err, "Failed to read admission request body")
+			}
 		}
 	}

omerap12 · 2026-05-25T21:17:50Z

+}
+
+func TestServePayloadLimit(t *testing.T) {
+	tests := []struct {


Can we add more tests here for my above comment?
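One way such cases could be laid out is sketched below as a runnable program rather than an actual addition to server_test.go. The helper names (classify, readOutcome, outcomes), the case names, and the 8-byte limit are all hypothetical. iotest.ErrReader simulates a read failure that is unrelated to the size limit, which is the branch the suggested errors.As check is meant to separate.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing/iotest"
)

// classify names the outcome of a body read, mirroring the branches in
// the suggested errors.As check.
func classify(err error) string {
	var mbe *http.MaxBytesError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &mbe):
		return "too large"
	default:
		return "read failed"
	}
}

// readOutcome reads body through http.MaxBytesReader and classifies the result.
func readOutcome(body io.Reader, limit int64) string {
	w := httptest.NewRecorder() // stand-in ResponseWriter for this sketch
	_, err := io.ReadAll(http.MaxBytesReader(w, io.NopCloser(body), limit))
	return classify(err)
}

// outcomes runs three table cases and returns "name: outcome" lines in order.
func outcomes() []string {
	const limit = 8 // tiny hypothetical limit for the demo
	cases := []struct {
		name string
		body io.Reader
	}{
		{"within limit", strings.NewReader("small")},
		{"over limit", strings.NewReader(strings.Repeat("x", 64))},
		{"broken stream", iotest.ErrReader(errors.New("connection reset"))},
	}
	var out []string
	for _, c := range cases {
		out = append(out, c.name+": "+readOutcome(c.body, limit))
	}
	return out
}

func main() {
	for _, line := range outcomes() {
		fmt.Println(line)
	}
	// within limit: ok
	// over limit: too large
	// broken stream: read failed
}
```

In a real test file the same table would drive t.Run subtests and assert on the handler's response instead of printing.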

k8s-ci-robot requested review from kwiesmueller and omerap12 May 25, 2026 18:26

k8s-ci-robot added the size/M (denotes a PR that changes 30-99 lines, ignoring generated files) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels May 25, 2026

sophieliu15 force-pushed the fix-511328366 branch from daf6643 to d8dd4c0 Compare May 25, 2026 19:01

adrianmoisey reviewed May 25, 2026


sophieliu15 force-pushed the fix-511328366 branch from d8dd4c0 to 273dd83 Compare May 25, 2026 19:25

omerap12 reviewed May 25, 2026


sophieliu15 wants to merge 1 commit into kubernetes:master from sophieliu15:fix-511328366
