Setup Guide

System Requirements
Installation
Downloading Checkpoints

System Requirements

NVIDIA GPUs with Ampere architecture (RTX 30 Series, A100) or newer
NVIDIA driver >=570.124.06 compatible with CUDA 12.8.1
Linux x86-64
glibc>=2.35 (e.g. Ubuntu >=22.04)
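
As a quick sanity check before installing, the driver and glibc minimums above can be compared against the local machine. The following is a minimal sketch, not part of the repository: `version_ge` is a hypothetical helper name, and the script assumes GNU `sort -V` and `getconf` are available, as they are on Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch: compare installed versions against the minimums listed above.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check the NVIDIA driver version if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "570.124.06"; then
        echo "driver $driver: OK"
    else
        echo "driver '$driver': older than 570.124.06 or not readable"
    fi
else
    echo "nvidia-smi not found; cannot check the driver version"
fi

# Check the glibc version (getconf prints e.g. "glibc 2.35").
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
if [ -n "$glibc" ] && version_ge "$glibc" "2.35"; then
    echo "glibc $glibc: OK"
else
    echo "glibc '${glibc:-unknown}': older than 2.35 or not detected"
fi
```

The script only reports what it finds; it does not check the GPU architecture, which still needs to be Ampere or newer.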

Installation

Install Git LFS:

sudo apt install git-lfs
git lfs install

Clone the repository:

git clone git@github.com:NVIDIA-Medtech/Cosmos-H-Surgical.git
cd Cosmos-H-Surgical
git lfs pull

Install one of the following environments:

Virtual Environment

Install system dependencies:

sudo apt update && sudo apt -y install curl ffmpeg libx11-dev tree wget

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

Install the package into a new environment:

cd transfer
uv python install
uv sync --extra=cu128
source .venv/bin/activate

Or, install the package into the active environment (e.g. conda):

uv sync --extra=cu128 --active --inexact

CUDA Variants:

CUDA Version	Arguments	Notes
CUDA 12.8	`--extra cu128`	NVIDIA Driver
CUDA 13.0	`--extra cu130`	NVIDIA Driver

For DGX Spark and Jetson AGX, you must use CUDA 13.0.
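
For scripted installs, the table above can be turned into a small lookup. This is only a sketch: `cuda_extra` is a hypothetical helper, not something the repository provides, and it covers just the two variants listed in the table.

```shell
# Hypothetical helper: map a CUDA version to the extra name from the table above.
cuda_extra() {
    case "$1" in
        12.8*) echo "cu128" ;;
        13.0*) echo "cu130" ;;
        *) echo "unsupported CUDA version: $1" >&2; return 1 ;;
    esac
}

# Print the sync command for CUDA 13.0 (the variant required on DGX Spark and Jetson AGX).
echo "uv sync --extra=$(cuda_extra 13.0)"
```

Printing the command rather than running it keeps the sketch safe to try outside the repository; run the printed command from the package directory to perform the install.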

Docker Container

Make sure Docker is available on your machine and the NVIDIA Container Toolkit is installed.

Build the container, running only the command that matches your GPU architecture:

# Ampere - Hopper
image_tag=$(docker build -f Dockerfile -q .)
# Blackwell
image_tag=$(docker build -f docker/nightly.Dockerfile -q .)

Run the container:

docker run -it --runtime=nvidia --ipc=host --rm -v .:/workspace -v /workspace/.venv -v /root/.cache:/root/.cache -e HF_TOKEN="$HF_TOKEN" $image_tag

Optional arguments:

--ipc=host: Use the host system's shared memory, since parallel torchrun consumes a large amount of it. If your security policy does not allow this, increase --shm-size instead (see the Docker documentation).
-v /root/.cache:/root/.cache: Mount host cache to avoid re-downloading cache entries.
-e HF_TOKEN="$HF_TOKEN": Set Hugging Face token to avoid re-authenticating.
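
If `--ipc=host` is not permitted, the run command can be adapted to use `--shm-size` as described in the first item. The sketch below assumes a size of `16g`, which is a placeholder rather than a recommendation from this guide, and `print_run_cmd` is a hypothetical helper that only prints the command for review.

```shell
# Hypothetical value; size it to the workload and the memory available on the host.
shm_size="16g"

# Print the run command with --ipc=host swapped for --shm-size.
# Assumes $image_tag and $HF_TOKEN are set as in the steps above.
print_run_cmd() {
    echo docker run -it --runtime=nvidia --shm-size="$shm_size" --rm \
        -v .:/workspace -v /workspace/.venv -v /root/.cache:/root/.cache \
        -e HF_TOKEN="$HF_TOKEN" "$image_tag"
}

print_run_cmd
```

To launch the container, run the printed command directly once the values look right.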

If you get "docker: Error response from daemon: unknown or invalid runtime name: nvidia", configure Docker to use the NVIDIA runtime:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
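
After restarting Docker, one way to confirm that the runtime was registered is to look for `nvidia` in the `Runtimes:` line of `docker info`. This is a minimal sketch; `has_nvidia_runtime` is a hypothetical helper, and the check assumes the plain-text `docker info` layout.

```shell
# has_nvidia_runtime: read `docker info` output on stdin and succeed
# when the "Runtimes:" line lists nvidia.
has_nvidia_runtime() {
    grep -i 'runtimes:' | grep -qw 'nvidia'
}

if command -v docker >/dev/null 2>&1; then
    if docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime is not registered (or the Docker daemon is unreachable)"
    fi
else
    echo "docker not found"
fi
```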

Downloading Checkpoints

1. Get a Hugging Face Access Token with Read permission.
2. Install the Hugging Face CLI: uv tool install -U "huggingface_hub[cli]"
3. Log in: hf auth login
4. Accept the model license agreements on Hugging Face:
- Cosmos-Predict2.5-2B — base model used by Cosmos-H-Surgical-Predict
- Cosmos-Transfer2.5-2B — base model used by Cosmos-H-Surgical-Transfer
- Cosmos-Guardrail1 — guardrail model

Checkpoints are automatically downloaded during inference and post-training. To modify the checkpoint cache location, set the HF_HOME environment variable.
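
As an illustration of relocating the cache, the sketch below points `HF_HOME` at a custom directory before running inference. The path `$HOME/hf-cache` is a hypothetical example; any writable location with enough free space should work.

```shell
# Hypothetical cache location; replace with a path on a disk with enough free space.
export HF_HOME="$HOME/hf-cache"
mkdir -p "$HF_HOME"
echo "Hugging Face cache: $HF_HOME"
```

The variable only applies to the current shell session unless it is added to a shell profile. When using the Docker container, a cache directory outside the mounted /root/.cache would also need its own volume mount to persist between runs.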
