included llama3.1 8b small llm training scripts #799
Conversation
@ZixianWangAMD may I ask for a Dockerfile that I can use to test this on H100? Or at least a hint on how to modify the existing Dockerfile? |
|
Based on Training WG feedback, can you please rename the folder from small_language_model_pretraining to small_llm_pretraining, since that is the agreed-upon long name for this benchmark? |
| # warmup_steps = math.ceil(57600 * 8192 / 8192 / gbs * 0.1) | ||
| # # 230k samples | ||
| max_steps = math.ceil(230000 * 8192 / 8192 / gbs) | ||
| warmup_steps = math.ceil(230000 * 8192 / 8192 / gbs * 0.1) |
max_steps should be fixed at 1.2M
warmup_steps should be parametrizable from the config
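A minimal sketch of what this request could look like, not the PR's actual code. It takes "1.2M" literally as a step count, and the `cfg` dictionary, its `warmup_steps` key, and the `resolve_schedule` helper are hypothetical names. When the config does not set a value, the sketch falls back to the 10% warmup formula from the diff above.

```python
import math

# Assumption: "1.2M" in the review comment is read literally as a step count.
MAX_STEPS = 1_200_000


def resolve_schedule(cfg, gbs):
    """Return (max_steps, warmup_steps) for a global batch size `gbs`.

    `cfg` is a hypothetical dict-like config; "warmup_steps" is an assumed key.
    """
    warmup_steps = cfg.get("warmup_steps")
    if warmup_steps is None:
        # Fallback: the formula from the diff (10% of 230k samples, in steps).
        warmup_steps = math.ceil(230000 * 8192 / 8192 / gbs * 0.1)
    warmup_steps = int(warmup_steps)
    if not 0 <= warmup_steps <= MAX_STEPS:
        raise ValueError(f"warmup_steps out of range: {warmup_steps}")
    # max_steps is fixed and no longer depends on the global batch size.
    return MAX_STEPS, warmup_steps


if __name__ == "__main__":
    print(resolve_schedule({"warmup_steps": 500}, gbs=32))
    print(resolve_schedule({}, gbs=32))
```

With this shape, changing the global batch size only affects the fallback warmup value, and a submitter can override warmup from the config without touching the script.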
| Once Rclone is installed, run the following command to authenticate with the bucket: | ||
| ``` | ||
| rclone config create mlc-training s3 provider=Cloudflare access_key_id=76ea42eadb867e854061a1806220ee1e secret_access_key=a53625c4d45e3ca8ac0df8a353ea3a41ffc3292aa25259addd8b7dc5a6ce2936 endpoint=https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com |
Will there be a new pre-tokenized dataset to download? This still points to the dataset for 405B.
Yes, it is being uploaded now; once that finishes, we will update the instructions.
|
@ZixianWangAMD - can you please sign the CLA? |
Zixian has signed the CLA with user ZixianWangAMD, but the check is failing due to an incorrect Git config. Commits were made locally with the author name "Zixian Wang", and GitHub usernames cannot contain spaces. I also noticed that Zixian appears to have multiple GitHub accounts. To fix this, set the correct configuration either globally or locally for the specific repo. Once the config is fixed, the local repo will need to be rebased to correct the author on the existing commits. |
|
Close as duplicate of #814 |
No description provided.