- NVIDIA GPUs with Ampere architecture (RTX 30 Series, A100) or newer
- NVIDIA driver >=570.124.06 compatible with CUDA 12.8.1
- Linux x86-64
- glibc>=2.35 (e.g. Ubuntu >=22.04)
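The requirements above can be checked before installing. The following is a minimal sketch, not part of the repository: `version_ge` is a hypothetical helper, it assumes GNU `sort -V` is available, and the thresholds are copied from the list above. It only reports what it finds and does not verify the GPU architecture.

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Architecture: the requirements list names Linux x86-64.
echo "arch: $(uname -m)"

# glibc: on glibc systems `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.35".
glibc_version="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
if [ -n "$glibc_version" ] && version_ge "$glibc_version" 2.35; then
  echo "glibc $glibc_version: ok"
else
  echo "glibc ${glibc_version:-unknown}: need >= 2.35"
fi

# Driver: query only if nvidia-smi is on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
  driver_version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  if version_ge "$driver_version" 570.124.06; then
    echo "driver $driver_version: ok"
  else
    echo "driver ${driver_version:-unknown}: need >= 570.124.06"
  fi
else
  echo "nvidia-smi not found; cannot check the driver version"
fi
```
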
Install git lfs:
sudo apt install git-lfs
git lfs install
Clone the repository:
git clone git@github.com:NVIDIA-Medtech/Cosmos-H-Surgical.git
cd Cosmos-H-Surgical
git lfs pull
Install one of the following environments:
Virtual Environment
Install system dependencies:
sudo apt update && sudo apt -y install curl ffmpeg libx11-dev tree wget
Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
Install the package into a new environment:
cd transfer
uv python install
uv sync --extra=cu128
source .venv/bin/activate
Or, install the package into the active environment (e.g. conda):
uv sync --extra=cu128 --active --inexact
CUDA Variants:
| CUDA Version | Arguments | Notes |
|---|---|---|
| CUDA 12.8 | `--extra cu128` | NVIDIA Driver |
| CUDA 13.0 | `--extra cu130` | NVIDIA Driver |
For DGX Spark and Jetson AGX, you must use CUDA 13.0.
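One way to keep the CUDA variant configurable in a setup script is sketched below. The helper `cuda_extra_flag` and the `CUDA_EXTRA` variable are hypothetical names introduced for illustration; only the `cu128` and `cu130` extras come from the table above. The last line prints the resulting command rather than running it.

```shell
# Hypothetical helper: maps a CUDA extra name to the matching `uv sync` flag.
cuda_extra_flag() {
  case "$1" in
    cu128|cu130) printf '%s\n' "--extra=$1" ;;
    *) echo "unsupported CUDA extra: $1 (expected cu128 or cu130)" >&2; return 1 ;;
  esac
}

# CUDA_EXTRA is an assumed variable name; it defaults to cu128 here.
# Set CUDA_EXTRA=cu130 on DGX Spark and Jetson AGX.
flag="$(cuda_extra_flag "${CUDA_EXTRA:-cu128}")" && echo "uv sync $flag"
```

Rejecting unknown values up front avoids passing a mistyped extra name to `uv sync`.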
Docker Container
Please make sure you have access to Docker on your machine and the NVIDIA Container Toolkit is installed.
Build the container:
# Ampere - Hopper
image_tag=$(docker build -f Dockerfile -q .)
# Blackwell
image_tag=$(docker build -f docker/nightly.Dockerfile -q .)
Run the container:
docker run -it --runtime=nvidia --ipc=host --rm -v .:/workspace -v /workspace/.venv -v /root/.cache:/root/.cache -e HF_TOKEN="$HF_TOKEN" $image_tag
Optional arguments:
- `--ipc=host`: Use the host system's shared memory, since parallel torchrun consumes a large amount of shared memory. If not allowed by security policy, increase `--shm-size` (see the Docker documentation).
- `-v /root/.cache:/root/.cache`: Mount the host cache to avoid re-downloading cache entries.
- `-e HF_TOKEN="$HF_TOKEN"`: Set the Hugging Face token to avoid re-authenticating.
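If `--ipc=host` is not permitted, the run command could be adapted along these lines. This is a sketch under assumptions: `SHM_SIZE` and `print_docker_run` are hypothetical names, and `16g` is an illustrative value rather than a project recommendation. The function only prints the command; remove the leading `echo` to start the container.

```shell
# SHM_SIZE is an assumed variable; 16g is an illustrative size, tune it to your workload.
shm_size="${SHM_SIZE:-16g}"

# Hypothetical helper: prints the run command with --ipc=host swapped for --shm-size.
# `-e HF_TOKEN` without a value passes the variable through from the host environment,
# so the token itself is never printed.
print_docker_run() {
  echo docker run -it --runtime=nvidia --shm-size="$shm_size" --rm \
    -v .:/workspace -v /workspace/.venv -v /root/.cache:/root/.cache \
    -e HF_TOKEN "$image_tag"
}

print_docker_run
```
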
If you get docker: Error response from daemon: unknown or invalid runtime name: nvidia, you need to configure docker:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
- Get a Hugging Face Access Token with Read permission
- Install Hugging Face CLI: uv tool install -U "huggingface_hub[cli]"
- Login: hf auth login
- Accept the model license agreements on Hugging Face:
- Cosmos-Predict2.5-2B — base model used by Cosmos-H-Surgical-Predict
- Cosmos-Transfer2.5-2B — base model used by Cosmos-H-Surgical-Transfer
- Cosmos-Guardrail1 — guardrail model
Checkpoints are automatically downloaded during inference and post-training. To modify the checkpoint cache location, set the HF_HOME environment variable.
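For example, the cache could be redirected before running inference as sketched here. The directory `$HOME/hf-cache` is a hypothetical location chosen for illustration; substitute a path on a disk with enough free space for the checkpoints.

```shell
# Hypothetical cache location; keeps any HF_HOME that is already set.
export HF_HOME="${HF_HOME:-$HOME/hf-cache}"

# Create the directory up front so a typo in the path surfaces immediately.
mkdir -p "$HF_HOME"
echo "Hugging Face cache: $HF_HOME"
```

Add the `export` line to your shell profile if the setting should persist across sessions.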