Commit c68ada2 (1 parent: 1237cc5)

Update benchmark dataset reference in checklist

Updated benchmark dataset reference from GPQA to HLE in the model release checklist.

1 file changed: 1 addition, 1 deletion

docs/hub/model-release-checklist.md
@@ -242,7 +242,7 @@ To add evaluation results, create YAML files in the `.eval_results/` folder of y
   user: your-username
 ```

-The `task_id` must match a task defined in the benchmark dataset's `eval.yaml`. You can find available benchmarks and their task IDs by checking the `eval.yaml` file in benchmark dataset repos like [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa/blob/main/eval.yaml).
+The `task_id` must match a task defined in the benchmark dataset's `eval.yaml`. You can find available benchmarks and their task IDs by checking the `eval.yaml` file in benchmark dataset repos like [HLE](https://huggingface.co/datasets/cais/hle/blob/main/eval.yaml).

 Anyone in the community can also submit evaluation results to any model by opening a Pull Request. Community-submitted scores display a "community" badge on the model page. To streamline this process, you can use the [community-evals](https://github.com/huggingface/community-evals) repository, which provides scripts and an agent skill for extracting scores from model cards and creating PRs automatically.
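For context, a `.eval_results/` file referencing the newly linked benchmark might look like the sketch below. Only `task_id` and `user` appear in the diff; every other field name and value here is an illustrative assumption, not the Hub's confirmed schema.

```yaml
# Hypothetical .eval_results/ entry — only `task_id` and `user` are
# attested by the diff above; the remaining keys are assumptions.
task_id: hle            # must match a task defined in the benchmark's eval.yaml
value: 21.6             # assumed: the reported score for that task
user: your-username
```

Since `task_id` must match a task in the benchmark dataset's `eval.yaml`, the safest workflow is to open that file (e.g. in the cais/hle repo linked above) and copy the task ID verbatim rather than typing it from memory.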
