Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Training VLM #74

Open
faithfulnguyen opened this issue Feb 18, 2025 · 1 comment
Open

Training VLM #74

faithfulnguyen opened this issue Feb 18, 2025 · 1 comment

Comments

@faithfulnguyen
Copy link

Hi thanks for sharing your work, I have a small dataset including images and descriptions, Can I use this code for training on my dataset?

@SumanthRH
Copy link
Collaborator

Hi!

Could you highlight which recipe you're trying to extend with image data? We've summarized all the recipes for the different models here:

Currently, the training code is made up of two forks :

  • LlamaFactory - Used for Sky-T1-32B-Preview and Sky-T1-32B-Flash. Llamafactory supports using image data.
  • VERL - Used for Sky-T1-mini. AFAIK VeRL is text-only, so you might have to customize the code for working with VLMs.

We're also actively working on cleaning up our training code.

Hope that helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants