feat: middleware JSON→form-data conversion for /v1/videos + async end… by Yadan-Wei · Pull Request #5890 · aws/deep-learning-containers

Yadan-Wei · 2026-04-07T16:47:58Z

feat: SageMaker middleware JSON→form-data conversion for video generation

Summary

Adds automatic JSON-to-multipart/form-data conversion in the SageMaker routing middleware for endpoints that require form data (e.g., /v1/videos). This enables video
generation models (Wan2.1-T2V) to work through SageMaker async inference, where payloads are always sent as JSON.

Also adds an async SageMaker endpoint integration test for video generation.

Problem

SageMaker async inference sends payloads as JSON to the container's /invocations endpoint. However, vllm-omni's /v1/videos API only accepts multipart/form-data. Without
conversion, the video endpoint returns 400 Bad Request: Field required because it can't parse the JSON body.

Changes

Middleware (scripts/vllm/omni_sagemaker_serve.py)

Added FORM_DATA_ROUTES set: {"/v1/videos", "/v1/videos/sync"}
When a request targets a form-data route with Content-Type: application/json, the middleware:
1. Reads the JSON body
2. Builds a multipart/form-data body with a random boundary
3. Replaces the Content-Type header and request body
Non-JSON requests (e.g., direct curl with -F) pass through unchanged
Non-video routes (TTS, image, chat) pass through unchanged
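
The three conversion steps above can be sketched roughly as follows. This is a minimal illustration, not the code in omni_sagemaker_serve.py: the helper name json_to_multipart, the use of uuid for the boundary, and the choice to re-serialize non-string values with json.dumps are all assumptions.

```python
import json
import uuid


def json_to_multipart(json_body: bytes) -> tuple[bytes, str]:
    """Convert a flat JSON object into a multipart/form-data body.

    Returns (body, content_type). Hypothetical helper for illustration only.
    """
    # Step 1: read the JSON body.
    fields = json.loads(json_body)
    if not isinstance(fields, dict):
        raise ValueError("expected a JSON object at the top level")

    # Step 2: build a multipart body with a random boundary.
    boundary = uuid.uuid4().hex
    lines: list[bytes] = []
    for name, value in fields.items():
        # Strings are sent verbatim; other JSON values are re-serialized
        # (an assumption about how non-string values should be encoded).
        text = value if isinstance(value, str) else json.dumps(value)
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")
        lines.append(text.encode("utf-8"))
    lines.append(f"--{boundary}--".encode())
    lines.append(b"")
    body = b"\r\n".join(lines)

    # Step 3: the caller replaces the Content-Type header and the request body.
    return body, f"multipart/form-data; boundary={boundary}"
```

The caller would also need to update Content-Length to match the new body, since the multipart encoding is longer than the original JSON.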

Unit Tests (test/vllm-omni/sagemaker/test_sagemaker_middleware.py)

test_json_to_formdata_for_video_route — verifies JSON→form-data conversion on /v1/videos
test_json_passthrough_for_non_video_route — verifies JSON is NOT converted on /v1/audio/speech
test_formdata_passthrough_for_video_route — verifies form-data input passes through unchanged
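
The three cases these tests cover boil down to one routing decision. The sketch below expresses that decision as a small predicate with plain assertions; the function name needs_form_conversion and the handling of Content-Type parameters are assumptions, and the real tests in test_sagemaker_middleware.py exercise the middleware itself rather than a helper like this.

```python
FORM_DATA_ROUTES = {"/v1/videos", "/v1/videos/sync"}


def needs_form_conversion(route: str, content_type: str) -> bool:
    """Return True when a JSON request targets a form-data-only route."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return route in FORM_DATA_ROUTES and media_type == "application/json"


# JSON on a video route is converted.
assert needs_form_conversion("/v1/videos", "application/json")
# JSON on a non-video route passes through.
assert not needs_form_conversion("/v1/audio/speech", "application/json")
# Form-data on a video route passes through.
assert not needs_form_conversion("/v1/videos", "multipart/form-data; boundary=abc")
```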

Endpoint Test (test/vllm-omni/sagemaker/test_sm_omni_endpoint.py)

test_vllm_omni_video_async_endpoint — deploys Wan2.1-T2V-1.3B on ml.g5.12xlarge (TP=2) as async endpoint
Sends JSON payload with CustomAttributes: route=/v1/videos
Validates the response contains a video job id (proves middleware routing + JSON→form-data conversion work end-to-end)
Model loaded from S3 tarball (s3://dlc-cicd-models/omni-models/wan2.1-t2v-1.3b.tar.gz)
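
For readers unfamiliar with async invocation, a request like the one this test sends might look roughly like the sketch below. Async inference reads the request body from S3, so the JSON payload is uploaded first and referenced through InputLocation. The endpoint name, S3 URI, region, and the "prompt" field are placeholders and assumptions, not values taken from test_sm_omni_endpoint.py.

```python
import json


def video_request_payload(prompt: str) -> bytes:
    """JSON body to upload to S3; the "prompt" field name is an assumption."""
    return json.dumps({"prompt": prompt}).encode("utf-8")


def build_async_invoke_kwargs(endpoint_name: str, input_s3_uri: str) -> dict:
    """Arguments for the sagemaker-runtime invoke_endpoint_async call."""
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,  # S3 object holding the JSON payload
        "ContentType": "application/json",
        "CustomAttributes": "route=/v1/videos",  # read by the routing middleware
    }


def invoke_video_async(endpoint_name: str, input_s3_uri: str, region: str) -> str:
    """Submit the request and return the S3 URI where the response will land."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("sagemaker-runtime", region_name=region)
    response = client.invoke_endpoint_async(
        **build_async_invoke_kwargs(endpoint_name, input_s3_uri)
    )
    return response["OutputLocation"]
```

With vllm-omni v0.18.0, the object written to OutputLocation would contain the job JSON (including the job id), not the video itself.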

Current Limitations: SageMaker Video Model Support

Async job ID only: vllm-omni v0.18.0's /v1/videos endpoint is async by design. It returns a job ID immediately (status: "queued") and generates the video in the background.
SageMaker async inference writes this JSON response to S3, not the actual MP4 file. To retrieve the video, you would need to poll GET /v1/videos/{id} and download from
GET /v1/videos/{id}/content, which isn't possible through SageMaker's request/response model.
No sync video endpoint in v0.18.0: POST /v1/videos/sync (which blocks and returns raw MP4 bytes) is not available in vllm-omni v0.18.0 (returns 405). It has been added in
newer commits (c9dbc09). Once we upgrade vllm-omni, switching the route to /v1/videos/sync will return actual MP4 bytes through SageMaker async; the middleware already
includes /v1/videos/sync in FORM_DATA_ROUTES.
GPU requirements: Wan2.1-T2V-1.3B requires ~24GB VRAM. On ml.g5.12xlarge (4× A10G 24GB each), tensor_parallel_size=2 is needed. On instances with a single ≥48GB GPU (e.g.,
ml.g6e.xlarge with L40S), TP=1 works.
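
To make the first limitation concrete, the poll-then-download flow would look something like the sketch below when talking to the container directly (for example during local testing), which is exactly what SageMaker async cannot do on the caller's behalf. Only the "queued" status comes from the description above; the "completed" and "failed" status names, the base URL, and the helper names are assumptions.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_video(
    base_url: str,
    job_id: str,
    fetch: Callable[[str], dict] = fetch_json,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> str:
    """Poll GET /v1/videos/{id} until the job finishes; return the content URL."""
    status_url = f"{base_url}/v1/videos/{job_id}"
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(status_url)
        status = job.get("status")
        if status == "completed":  # assumed terminal status name
            return f"{status_url}/content"
        if status == "failed":  # assumed terminal status name
            raise RuntimeError(f"video job {job_id} failed: {job}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video job {job_id} still {status!r} after {timeout_s}s")
        time.sleep(interval_s)
```

The returned URL can then be fetched to download the MP4 bytes. The fetch parameter is injectable so the loop can be exercised without a running server.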

Testing

15 unit tests passing (12 existing + 3 new)
Pre-commit clean

Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

Using dlc_developer_config.toml
Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)

How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

Run default builds and tests for a particular buildspec - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

Enable specific tests for a buildspec or set of buildspecs - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

sagemaker_remote_tests = true
sagemaker_efa_tests = true
sagemaker_rc_tests = true
sagemaker_local_tests = true

How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:

# /buildspec <buildspec_path>
- e.g.: # /buildspec pytorch/training/buildspec.yml
- If this line is commented out, dlc_developer_config.toml will be used.
# /tests <test_list>
- e.g.: # /tests sanity security ec2
- If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.

# /buildspec <buildspec_path>
# /tests <test_list>

Toggle if you are merging into main Branch

PR Checklist

[] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details).

…point test
- Middleware auto-converts JSON to multipart/form-data for FORM_DATA_ROUTES (SageMaker async only supports JSON payloads, but /v1/videos needs form-data)
- 3 new unit tests: conversion, passthrough for non-video, passthrough for form-data
- Video async endpoint test: Wan2.1-T2V on g5.12xlarge (TP=2), validates job id

aws-deep-learning-containers-ci bot added the authorized and Size:XL labels Apr 7, 2026

Yadan Wei and others added 10 commits April 7, 2026 09:51

style: ruff format

4013850

fix: add /v1/videos/sync to FORM_DATA_ROUTES for future vllm-omni upg…

492bb61

…rade

fix: remove video async endpoint test (g5.12xl capacity unavailable)

e444947

Merge branch 'main' into omni-video

57a58bf

style: ruff format

4cbe91f

Merge branch 'main' into omni-video

c754554

Merge branch 'main' into omni-video

ab1252d

Merge branch 'main' into omni-video

97753cb

Merge branch 'main' into omni-video

362a232

Merge branch 'main' into omni-video

8c685c3

Yadan-Wei enabled auto-merge (squash) April 8, 2026 01:17

jinyan-li1 approved these changes Apr 8, 2026

View reviewed changes

Yadan-Wei merged commit d1d9b7e into main Apr 8, 2026
95 of 97 checks passed

Yadan-Wei merged 11 commits into main from omni-video
