
Enable PPO on Intel XPU using a tiny model #2446

Open · wants to merge 2 commits into main
Conversation

songhappy
Contributor

Context

What is the purpose of this PR? Is it to

  • add a new feature

Please link to any issues this PR addresses.
https://jira.devtools.intel.com/browse/IPB-2914

Changelog

Added a configuration file for running PPO on an Intel PVC 48G GPU.
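For context, a minimal sketch of what such a config might contain. This is a hypothetical excerpt: the field names follow common torchtune config conventions (`device`, `dtype`, `_component_`), but the exact contents of the file in this PR may differ.

```yaml
# Hypothetical excerpt, not the actual file from this PR.
device: xpu     # run the recipe on an Intel XPU instead of the default cuda
dtype: bf16     # reduced precision helps fit PPO's multiple models in 48G
batch_size: 1

policy_model:
  # Assumed builder for a small Llama-2-architecture model such as TinyLlama;
  # the real config may use a different component path.
  _component_: torchtune.models.llama2.llama2
```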

Test plan

  • manually run any new or modified recipes with sufficient proof of correctness
  • I did not change any public API


pytorch-bot bot commented Feb 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2446

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 09dd2a6 with merge base 0e8f840:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 27, 2025
@SalmanMohammadi SalmanMohammadi self-requested a review March 2, 2025 13:21
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 23.13%. Comparing base (cf0142b) to head (09dd2a6).
Report is 13 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2446       +/-   ##
===========================================
- Coverage   65.38%   23.13%   -42.26%     
===========================================
  Files         374      379        +5     
  Lines       22172    22793      +621     
===========================================
- Hits        14498     5273     -9225     
- Misses       7674    17520     +9846     

☔ View full report in Codecov by Sentry.

@SalmanMohammadi
Collaborator

Hi @songhappy. Thanks for opening this. It's great to see support in torchtune for different hardware and smaller models, but I'm hesitant to land a config for an xpu device as we won't be able to test and maintain it going forward. We generally test and ship our configs on cuda and allow users to configure it on their own.
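As a small illustration of what "configure it on their own" can look like in practice: torchtune configs expose the device as a plain field, so a user-side fallback is just ordinary selection logic. The helper below is hypothetical, not a torchtune API; the `available` set stands in for runtime checks like `torch.cuda.is_available()` or `torch.xpu.is_available()`.

```python
def resolve_device(requested: str, available: frozenset) -> str:
    """Return the requested device string if it is available,
    otherwise fall back through cuda -> xpu -> cpu.

    `available` is a stand-in for real runtime availability checks
    such as torch.cuda.is_available() / torch.xpu.is_available().
    """
    fallback_order = [requested, "cuda", "xpu", "cpu"]
    for dev in fallback_order:
        # cpu is always assumed present as the final fallback
        if dev == "cpu" or dev in available:
            return dev
    return "cpu"

print(resolve_device("xpu", frozenset({"xpu"})))   # xpu
print(resolve_device("xpu", frozenset({"cuda"})))  # cuda
```

A user on unsupported hardware could then pass the resolved string into the config's `device` field without torchtune shipping per-backend configs.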

Contributor

@ebsmothers ebsmothers left a comment


Hi @songhappy thanks for the PR! For this case I think it would make sense to host this config outside of torchtune core. If I understand correctly the main difference is that this will run on XPU + the usage of a tiny Llama model, right? We don't really have any configs with tiny llama (as far as I know) so I think it would be a bit strange to add for just this one case. Let me know if this makes sense to you, thanks!

@songhappy
Contributor Author

Thanks for reviewing it. Actually, TinyLlama is not really tiny: it is a 1B model, and we already have a couple of 1B or 0.5B model configurations here, for example https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3_2/1B_full_single_device.yaml and https://github.com/pytorch/torchtune/blob/main/recipes/configs/qwen2/0.5B_full_single_device.yaml.

PPO uses far more memory than many other fine-tuning algorithms. Given the limited resources on a single device, please consider adding one configuration that uses a 1B model for PPO.
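To make the memory point concrete, here is a rough back-of-the-envelope for weights alone. The factor of four is an assumption that the recipe keeps the policy, reference policy, reward, and value models resident at once; it ignores optimizer state, gradients, and activations, which add substantially more.

```python
def ppo_weight_memory_gib(params_billion: float,
                          bytes_per_param: int = 2,
                          n_resident_models: int = 4) -> float:
    """Rough weight-only memory estimate for PPO, in GiB.

    bytes_per_param=2 assumes bf16 weights; n_resident_models=4 is an
    assumption that policy, reference, reward, and value models are
    all loaded simultaneously.
    """
    total_bytes = params_billion * 1e9 * bytes_per_param * n_resident_models
    return total_bytes / 1024**3

print(round(ppo_weight_memory_gib(1.0), 2))  # ~7.45 GiB for a 1B model
```

Even under these optimistic assumptions, a 7B model would need roughly 52 GiB for weights alone, which is why a 1B-class model is a natural fit for single-device PPO.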

As for CUDA being the default device, I can test this on CUDA and update the PR.
Many thanks again. @SalmanMohammadi @ebsmothers

@ebsmothers
Contributor

@songhappy thanks, that makes sense. Given it's already 1B, can we use the Llama 3.2 1B model instead? I think this should give better results anyway.

@SalmanMohammadi
Collaborator

> @songhappy thanks, that makes sense. Given it's already 1B, can we use the Llama 3.2 1B model instead? I think this should give better results anyway.

We don't have a classifier builder for 3.2. We can land #2356 soonish to enable this.
