Pull requests: scitix/sieval
Pull requests list
feat(c-eval): add dataset and few-shot base-model task
#15 opened Jun 24, 2026 by jack-scitix-ai (Contributor) · 14 tasks done
feat(livecodebench): add dataset and few-shot base-model task
#14 opened Jun 23, 2026 by jack-scitix-ai (Contributor) · 14 tasks done
feat(ifbench): add dataset and few-shot base-model task
#13 opened Jun 23, 2026 by jack-scitix-ai (Contributor) · 18 tasks done
feat(mbpp): add dataset and few-shot base-model task
#12 opened Jun 22, 2026 by jack-scitix-ai (Contributor) · 14 tasks done
feat(ruler): add RULER long-context benchmark (datasets + tasks)
#11 opened Jun 22, 2026 by Mea1Ma · 21 tasks done
feat(cmmlu): add CMMLU few-shot base-model task and dataset
#10 opened Jun 22, 2026 by jack-scitix-ai (Contributor) · 13 tasks done
feat(datasets): add local: source handler and fix url: gzip truncation check
#9 opened Jun 22, 2026 by Mea1Ma · 8 tasks done
feat(datasets): pin HF revisions + checksum URL datasets, enforced in preflight
#8 opened Jun 20, 2026 by ethan-scitix (Collaborator) · 12 tasks done
feat(datasets): add stratified_select operation for group-balanced sampling
#7 opened Jun 20, 2026 by ethan-scitix (Collaborator) · 9 tasks done
feat(humaneval): add HumanEval 0-shot base model task
#6 opened Jun 18, 2026 by jack-scitix-ai (Contributor) · 14 tasks done
feat(session): tolerate throughput-only config diffs on --resume
#4 opened Jun 17, 2026 by ethan-scitix (Collaborator) · 9 tasks done
feat(theoremqa): add dataset and k-shot base-model task
#3 opened Jun 17, 2026 by jack-scitix-ai (Contributor) · 15 tasks done