
ci: backport LLM container pipeline to release 1.3.1 #1823

Open
kcywinski0 wants to merge 7 commits into ai-dynamo:release/1.3.1 from kcywinski0:kcywinski0/release-1.3.1-pr-1664


@kcywinski0

Contributor

Summary

  • Backports the LLM container build/test pipeline changes from #1664 ("ci: add LLM container build pipeline for vllm/sglang images") onto release/1.3.1.
  • Adds the missing build/test matrix files and vLLM/SGLang container test scripts needed by the Jenkins LLM container jobs.
  • Includes the related Dockerfile and JJB updates from the original PR.

Test plan

isdrk and others added 4 commits June 24, 2026 12:56
Adds a new Jenkins matrix pipeline that builds the three NIXL inference
container variants (vllm-nixl, sglang-nixl, sglang-cu13-nixl) from a
published NIXL wheel set, for x86_64 and aarch64, and publishes
multi-arch manifests. Mirrors the manual procedure documented at
README.md so that release-candidate inference images can be cut without
running the build steps by hand.

Uses native podman for build/push/manifest operations (no docker on the
build pod). Cross-arch aarch64 builds run on x86_64 hosts via QEMU.

Signed-off-by: Iaroslav Sydoruk <isydoruk@nvidia.com>
Signed-off-by: Iaroslav Sydoruk <isydoruk@nvidia.com>
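
For reviewers unfamiliar with the podman flow described in the commit message above, here is a rough sketch of the command sequence for one image variant: a per-arch build and push, followed by a multi-arch manifest. It is not taken from the PR. The function name, the `<tag>-<arch>` tagging scheme, the registry path, and the Dockerfile name are all hypothetical, and the actual Jenkins pipeline may order or parameterize these steps differently.

```python
from typing import Dict, List

# Assumption: the pipeline's arch names map to these OCI platform strings.
PLATFORMS: Dict[str, str] = {"x86_64": "linux/amd64", "aarch64": "linux/arm64"}


def podman_commands(image: str, tag: str, dockerfile: str, context: str = ".") -> List[List[str]]:
    """Return podman argv lists for per-arch builds plus one multi-arch manifest.

    Nothing is executed here; the caller decides whether to print or run them.
    """
    manifest = f"{image}:{tag}"
    commands: List[List[str]] = []
    arch_images: List[str] = []

    for arch, platform in PLATFORMS.items():
        # Hypothetical tagging scheme: <image>:<tag>-<arch>
        arch_image = f"{image}:{tag}-{arch}"
        arch_images.append(arch_image)
        # The aarch64 build relies on QEMU emulation when run on an x86_64 host.
        commands.append(
            ["podman", "build", "--platform", platform, "-f", dockerfile, "-t", arch_image, context]
        )
        commands.append(["podman", "push", arch_image])

    # Combine the pushed per-arch images under a single manifest list.
    commands.append(["podman", "manifest", "create", manifest])
    for arch_image in arch_images:
        commands.append(["podman", "manifest", "add", manifest, f"docker://{arch_image}"])
    commands.append(["podman", "manifest", "push", "--all", manifest, f"docker://{manifest}"])
    return commands
```

A dry run could print each entry with `shlex.join(cmd)` before handing it to `subprocess.run(cmd, check=True)`.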
TinyLlama/TinyLlama-1.1B-Chat-v1.0 (~550MB) — smoke/perf for both vllm and sglang
Qwen/Qwen3-8B (~16GB) — accuracy tests for sglang only (vllm accuracy is a no-op)

Signed-off-by: Iaroslav Sydoruk <isydoruk@nvidia.com>
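
The model assignment in the commit message above can be read as a small lookup. The sketch below only illustrates that reading. The function name and the `smoke`/`perf`/`accuracy` labels are assumptions, and the real matrix files in the PR may encode this differently.

```python
from typing import Optional

SMOKE_PERF_MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # ~550MB
ACCURACY_MODEL = "Qwen/Qwen3-8B"  # ~16GB


def model_for(framework: str, test_kind: str) -> Optional[str]:
    """Return the model a test needs, or None when the test is a no-op."""
    if framework not in ("vllm", "sglang"):
        raise ValueError(f"unknown framework: {framework}")
    if test_kind in ("smoke", "perf"):
        return SMOKE_PERF_MODEL
    if test_kind == "accuracy":
        # vllm accuracy is a no-op, so it needs no model.
        return ACCURACY_MODEL if framework == "sglang" else None
    raise ValueError(f"unknown test kind: {test_kind}")
```

Under this reading, only the sglang images need the larger Qwen model prefetched.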
@kcywinski0 kcywinski0 requested review from a team as code owners June 24, 2026 10:56
@copy-pr-bot

copy-pr-bot Bot commented Jun 24, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions


👋 Hi kcywinski0! Thank you for contributing to ai-dynamo/nixl.

Your PR reviewers will review your contribution then trigger the CI to test your changes.

🚀

Replace the deprecated huggingface-cli invocation so LLM container builds can prefetch models with current Hugging Face images.
Use huggingface_hub.snapshot_download directly so LLM container builds do not depend on deprecated or missing Hugging Face CLI entrypoints.
Install or upgrade huggingface_hub before prefetching LLM test models so container builds do not depend on base image contents.
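
The three commits above describe prefetching with `huggingface_hub.snapshot_download` instead of the CLI, after installing or upgrading the package. A minimal sketch of that approach follows. The helper names, the injectable `download` parameter (added here so the logic can be exercised without network access), and the cache path are assumptions; this is not the script from the PR.

```python
import subprocess
import sys
from typing import Callable, Iterable, List, Optional


def ensure_hub_installed() -> None:
    """Install or upgrade huggingface_hub so the base image contents do not matter."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "--upgrade", "huggingface_hub"]
    )


def prefetch_models(
    repo_ids: Iterable[str],
    cache_dir: Optional[str] = None,
    download: Optional[Callable[..., str]] = None,
) -> List[str]:
    """Download each model snapshot and return the local snapshot paths."""
    if download is None:
        # Imported lazily so it resolves after ensure_hub_installed() has run.
        from huggingface_hub import snapshot_download

        download = snapshot_download
    return [download(repo_id=repo_id, cache_dir=cache_dir) for repo_id in repo_ids]
```

In a container build step this might be called as `ensure_hub_installed()` followed by `prefetch_models(["TinyLlama/TinyLlama-1.1B-Chat-v1.0"], cache_dir="/models")`, where `/models` is a placeholder path.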

2 participants