generated from fastai/nbdev_template
Pull requests: huggingface/trl
Fix reward_processing_classes validation in GRPOTrainer (#2839)
#3876
opened Aug 9, 2025 by
chi2liu
4 tasks done
Optimize truncate_with_protected_tokens to use vectorized operations
#3875
opened Aug 9, 2025 by
chi2liu
2 of 5 tasks
Optimize completion_ids list conversion in GRPO trainer
#3874
opened Aug 9, 2025 by
chi2liu
2 of 5 tasks
Improve quickstart documentation with updated API examples
#3873
opened Aug 9, 2025 by
behroozazarkhalili
vLLM rollout numerical differences causing off-policy RL.
#3867
opened Aug 7, 2025 by
LeonEricsson
•
Draft
5 tasks
Replaced unittest.TestCase with TrlTestCase that handles tmp dir
#3863
opened Aug 7, 2025 by
qgallouedec
[#3647] Fix: Assign default values in the GKDTrainer's constructor only when …
#3851
opened Aug 5, 2025 by
seungduk-yanolja
2 of 5 tasks
Update profiling.py: fix scoping problems for wandb and mlflow
#3845
opened Aug 4, 2025 by
markshinyounglee
5 tasks done
Optimize RLOO Trainer memory usage with string-level processing
#3837
opened Aug 2, 2025 by
luckyvickyricky
2 of 5 tasks
Fix SFTTrainer token accuracy computation with PromptEncoder
#3821
opened Jul 31, 2025 by
zk-quantum
5 tasks done
GSPO docs - Sequence importance ratio and differences in relation to GRPO
#3816
opened Jul 31, 2025 by
almeidava93
2 of 5 tasks
Adding support for different losses which are now supported by Liger
#3815
opened Jul 31, 2025 by
Manan17
1 of 5 tasks
💇 Add soft overlong punishment reward function and update documentation
#3804
opened Jul 30, 2025 by
qgallouedec