Add MLCube implementation for LLaMA 3.1 #792

Merged
ShriyaRishab merged 4 commits into mlcommons:master from davidjurado:feature/mlcube_llama3.1
Aug 8, 2025

@davidjurado davidjurado commented May 8, 2025

MLCube for LLaMA 3.1

MLCube™ GitHub repository. MLCube™ wiki.

Project setup

Docker must be installed on the host machine.
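Before running the setup commands, it can help to confirm that the tools they rely on are available. The following is a minimal sketch, assuming a POSIX shell; `have_cmd` is a hypothetical helper name, not part of MLCube.

```shell
# Hypothetical helper: succeeds if the given command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Tools used by the setup commands in this section.
for tool in docker git virtualenv; do
  if have_cmd "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```

This only checks that the executables exist; it does not verify that the Docker daemon is running or that your user may access it.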

# Create Python environment and install MLCube Docker runner 
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker
# Fetch the implementation from GitHub
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/792/head:feature/mlcube_llama3.1 && git checkout feature/mlcube_llama3.1
cd ./large_language_model_pretraining/nemo/mlcube

Inside the mlcube directory, run the following command to list the implemented tasks.

mlcube describe

Extra requirements

Nvidia Driver

The base Docker image requires the host machine to have the NVIDIA driver version 560.35.03 installed.
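One way to compare the host driver against the version stated above is sketched below. It assumes `nvidia-smi` is installed alongside the driver and treats the requirement as an exact match, which is an assumption on my part; the text does not say whether newer drivers are also acceptable. `installed_driver` and `check_driver` are hypothetical helper names.

```shell
# Driver version stated in this section.
REQUIRED_DRIVER="560.35.03"

# Print the driver version reported for the first GPU.
# Prints nothing if nvidia-smi is not available.
installed_driver() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
  fi
}

# Compare two version strings for exact equality (assumption: exact match).
# $1: installed version, $2: required version
check_driver() {
  if [ "$1" = "$2" ]; then
    echo "ok"
  else
    echo "mismatch"
  fi
}

check_driver "$(installed_driver)" "$REQUIRED_DRIVER"
```

A result of `mismatch` does not necessarily mean the image will fail to run; it only flags that the host differs from the version named here.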

Rclone

Install Rclone on your system by following these instructions.

MLCommons hosts the model for download exclusively by MLCommons Members. You must first agree to the confidentiality notice. If you cannot access the form, follow these instructions.

After you submit the form, you will receive an email granting access to a Drive folder that contains a file called Llama 3.1 CLI Download Instructions. Follow the instructions in that file through step 3, Authenticate Rclone with Google Drive.

Once this step is complete, the Rclone configuration file contains the data needed to download the dataset and models. To find where this file is located, run:

rclone config file

Default: ~/.config/rclone/rclone.conf

Finally, copy that file into the workspace folder located in the same directory as this README. The copy must be named rclone.conf.
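The copy step can be sketched as follows. This assumes you are in the mlcube directory (so the workspace folder is `./workspace`) and that `rclone config file` prints the path on its last output line; `conf_path_from_output` is a hypothetical helper name.

```shell
# Hypothetical helper: read `rclone config file` output on stdin and
# print its last line, which is assumed to hold the configuration path.
conf_path_from_output() {
  tail -n 1
}

if command -v rclone >/dev/null 2>&1; then
  src="$(rclone config file | conf_path_from_output)"
  if [ -f "$src" ]; then
    # Assumption: the current directory is the mlcube directory.
    mkdir -p ./workspace
    cp "$src" ./workspace/rclone.conf
    echo "copied $src to ./workspace/rclone.conf"
  else
    echo "no configuration file found at: $src"
  fi
else
  echo "rclone is not installed"
fi
```

The configuration file holds access tokens, so avoid committing `workspace/rclone.conf` to version control.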

MLCube tasks

  • Demo tasks:

Download demo dataset.

mlcube run --task=download_demo -Pdocker.build_strategy=always

Train demo.

mlcube run --task=demo -Pdocker.build_strategy=always

Execute the complete pipeline

You can run the complete pipeline with a single command.

  • Demo pipeline:
mlcube run --task=download_demo,demo -Pdocker.build_strategy=always

@davidjurado davidjurado requested a review from a team as a code owner May 8, 2025 15:20

github-actions Bot commented May 8, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@ShriyaRishab ShriyaRishab merged commit 903b3fe into mlcommons:master Aug 8, 2025
@github-actions github-actions Bot locked and limited conversation to collaborators Aug 8, 2025