Enable PPO on Intel XPU using a tiny model #2446
base: main
Conversation
🔗 Helpful links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2446.
Note: links to docs will display an error until the docs builds have completed.
❌ 1 new failure as of commit 09dd2a6 with merge base 0e8f840.
Codecov report: all modified and coverable lines are covered by tests ✅

Coverage diff against main:

|          | main   | #2446  | +/-     |
|----------|--------|--------|---------|
| Coverage | 65.38% | 23.13% | -42.26% |
| Files    | 374    | 379    | +5      |
| Lines    | 22172  | 22793  | +621    |
| Hits     | 14498  | 5273   | -9225   |
| Misses   | 7674   | 17520  | +9846   |

Full report available in Codecov by Sentry.
Hi @songhappy. Thanks for opening this. It's great to see support in torchtune for different hardware and smaller models, but I'm hesitant to land a config for an
Hi @songhappy, thanks for the PR! For this case I think it would make sense to host this config outside of torchtune core. If I understand correctly, the main differences are that this runs on XPU and uses a tiny Llama model, right? We don't really have any configs with tiny Llama (as far as I know), so I think it would be a bit strange to add one for just this case. Let me know if this makes sense to you, thanks!
Thanks for reviewing it. Actually, TinyLlama is not really tiny: it's a 1B model, and there are already a couple of 1B or 0.5B configurations in the repo, for example https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3_2/1B_full_single_device.yaml and https://github.com/pytorch/torchtune/blob/main/recipes/configs/qwen2/0.5B_full_single_device.yaml. PPO uses far more memory than most other finetuning algorithms, so given the limited resources on a single device, please consider adding one PPO configuration that uses a 1B model. As for CUDA being the default device, I can test this on CUDA and update the PR.
@songhappy thanks, that makes sense. Given it's already 1B, can we use the Llama 3.2 1B model instead? I think this should give better results anyway.
We don't have a classifier builder for 3.2. We can land #2356 soonish to enable this. |
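For reference, here is a minimal sketch of the kind of device resolution an XPU-aware run relies on (illustrative only: the helper name and fallback order are assumptions, not torchtune's actual API):

```python
# Minimal sketch: prefer Intel XPU when available, otherwise fall back to CUDA/CPU.
# Not torchtune's API; the helper name and fallback order are illustrative.
from typing import Optional

import torch


def resolve_device(requested: Optional[str] = None) -> torch.device:
    """Resolve a config-supplied device string, preferring XPU if present."""
    if requested is not None:
        # e.g. `device: xpu` or `device: cuda` coming from the YAML config
        return torch.device(requested)
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


print(resolve_device())  # auto-detects xpu/cuda/cpu on the current machine
```

With something along these lines, the same recipe can honor `device: xpu` from the new config while keeping CUDA as the default elsewhere.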
Context
What is the purpose of this PR? It adds a new feature: a configuration that enables PPO finetuning on Intel XPU with a small model.
Please link to any issues this PR addresses.
https://jira.devtools.intel.com/browse/IPB-2914
Changelog
Added a configuration file for running PPO on an Intel PVC (48 GB) GPU.
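For context on why a ~1B model is paired with a 48 GB card, a rough back-of-envelope sketch follows (my own illustration: it uses torchtune's existing llama3_2_1b builder and assumes bf16 weights with AdamW, none of which is taken from this PR's config):

```python
# Back-of-envelope memory estimate for PPO with a ~1B policy model.
# Numbers are rough illustrations, not measurements from this PR.
from torchtune.models.llama3_2 import llama3_2_1b

policy = llama3_2_1b()  # instantiated on CPU with random weights
n_params = sum(p.numel() for p in policy.parameters())

weights_gb = n_params * 2 / 1e9  # bf16 weights, ~2 bytes per parameter
optim_gb = n_params * 8 / 1e9    # AdamW fp32 moments, ~8 bytes per parameter

# PPO keeps several models resident (policy, reference policy, reward/value
# model), so the per-copy weight cost is paid multiple times, before
# activations and generation buffers are counted.
print(f"params: {n_params / 1e9:.2f}B")
print(f"per-copy weights: ~{weights_gb:.1f} GB, optimizer state: ~{optim_gb:.1f} GB")
```

This is only a ballpark; actual usage depends on sequence length, batch size, and how many model copies the PPO recipe keeps on the device.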
Test plan