Pull requests: huggingface/trl

Pull requests list

[docs] Add hardware requirements note to quickstart
#5472 opened Apr 7, 2026 by pqbas
5 of 8 tasks
Add Qwen3-VL tool calling support
#5469 opened Apr 7, 2026 by qgallouedec
Add GPT-OSS tool calling support
#5464 opened Apr 6, 2026 by qgallouedec
Add GLM-4-MoE tool calling support
#5463 opened Apr 6, 2026 by qgallouedec
GOLDTrainer VLM support
#5461 opened Apr 6, 2026 by Strongich
4 of 8 tasks
[docs] Clarify dtype defaults between trf v5 and TRL
#5457 opened Apr 4, 2026 by casinca
2 of 4 tasks
Gemma 4 support
#5453 opened Apr 4, 2026 by qgallouedec
[AsyncGRPO] Support async tool calls in AsyncRolloutWorker
#5446 opened Apr 3, 2026 by PoilZero
5 of 8 tasks
FIPO loss
#5434 opened Apr 2, 2026 by kdubovikov
4 of 8 tasks
feat(async-grpo): add sampling parameter parity
#5418 opened Mar 31, 2026 by kdubovikov
4 of 8 tasks
Delta weight sync using Xet buckets
#5417 opened Mar 31, 2026 by AmineDiro Draft
8 tasks
fix(async-grpo): honor model init dtype
#5416 opened Mar 31, 2026 by kdubovikov
3 of 8 tasks
Skip redundant forward pass for on-policy vLLM importance sampling
#5413 opened Mar 31, 2026 by GJ98
3 of 8 tasks
add JEPO trainer
#5411 opened Mar 31, 2026 by zbills
3 of 7 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407 opened Mar 30, 2026 by cmpatino
3 of 5 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406 opened Mar 30, 2026 by BrownianNotion
5 of 8 tasks
Add per-sample tool filtering to GRPOTrainer via tools column
#5398 opened Mar 27, 2026 by lailanelkoussy
3 tasks done
Fix DAPO token-level loss to use prompt-level aggregation
#5381 opened Mar 26, 2026 by matdou
2 of 5 tasks