Pull requests: huggingface/trl
[docs] Add hardware requirements note to quickstart
#5472 opened Apr 7, 2026 by pqbas
5 of 8 tasks
Add {% generation %} support to training chat templates
#5470 opened Apr 7, 2026 by qgallouedec
Add supports_tool_calling utility and validate tool support at init
#5462 opened Apr 6, 2026 by qgallouedec
Move chat templates from inline strings to .jinja files
#5459 opened Apr 5, 2026 by qgallouedec
[docs] Clarify dtype defaults between trf v5 and TRL
#5457 opened Apr 4, 2026 by casinca
2 of 4 tasks
[AsyncGRPO] Support async tool calls in AsyncRolloutWorker
#5446 opened Apr 3, 2026 by PoilZero
5 of 8 tasks
feat(async-grpo): add sampling parameter parity
#5418 opened Mar 31, 2026 by kdubovikov
4 of 8 tasks
fix(async-grpo): honor model init dtype
#5416 opened Mar 31, 2026 by kdubovikov
3 of 8 tasks
Skip redundant forward pass for on-policy vLLM importance sampling
#5413 opened Mar 31, 2026 by GJ98
3 of 8 tasks
Add log_multimodal param to GRPOConfig and RLOOConfig to control image logging
#5408 opened Mar 30, 2026 by apardyl
3 of 8 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407 opened Mar 30, 2026 by cmpatino
3 of 5 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406 opened Mar 30, 2026 by BrownianNotion
5 of 8 tasks
Add per-sample tool filtering to GRPOTrainer via tools column
#5398 opened Mar 27, 2026 by lailanelkoussy
3 tasks done
feat(grpo): add stop_tool_names for immediate agent loop termination
#5390 opened Mar 27, 2026 by lailanelkoussy
Fix DAPO token-level loss to use prompt-level aggregation
#5381 opened Mar 26, 2026 by matdou
2 of 5 tasks
[vllm-serve] Add extra_llm_kwargs for passing additional arguments to vllm.LLM()
#5367 opened Mar 25, 2026 by jonahsamost
1 of 5 tasks