What is the recommended torch_dtype? #103

Open
CoCoNuTeK opened this issue Jun 9, 2024 · 4 comments
Labels: FAQ (Frequently asked question)

Comments

CoCoNuTeK commented Jun 9, 2024

Hello there,
What would you recommend as the best torch_dtype parameter, given the tradeoffs? Or was the model trained only using bfloat16?
Thanks for the answer.

abdulfatir (Contributor) commented

@CoCoNuTeK The models were trained with tf32 (a 19-bit CUDA floating point format that serves as a replacement for fp32). We recommend bf16 for inference, especially if your machine supports it. It should require less memory and be much faster than fp32. Please note that we are talking about the model's parameters (torch_dtype in the pipeline) here. DO NOT cast your time series into bf16, as that may result in loss of information.
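For concreteness, this is roughly what that looks like with the ChronosPipeline (a minimal sketch; the checkpoint name and the toy context values here are placeholders):

```python
import torch
from chronos import ChronosPipeline

# Load the model with bf16 *parameters* -- this is what torch_dtype controls.
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",   # any Chronos checkpoint
    device_map="cuda",           # or "cpu" if no GPU is available
    torch_dtype=torch.bfloat16,
)

# The context time series stays in its original precision (e.g. float32);
# do NOT cast it to bf16 -- the pipeline handles scaling/tokenization internally.
context = torch.tensor([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])
forecast = pipeline.predict(context, prediction_length=12)
# forecast has shape [num_series, num_samples, prediction_length]
```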

abdulfatir changed the title from "What is the recommended torch dtype?" to "What is the recommended torch_dtype?" on Jun 9, 2024
abdulfatir added the FAQ label on Jun 9, 2024
CoCoNuTeK (Author) commented Jun 9, 2024

Ah, okay, so I just keep my datapoints in the format they are in; if it's stock data, I feed it in as is. Thanks for the info.
And for the fine-tuning part, should I use bf16 as well?

abdulfatir (Contributor) commented

For fine-tuning, the recommended settings are in the training script, which uses tf32 for training. Of course, you're free to experiment with other dtypes and hyperparameters.

P.S.: I don't want to constrain your creativity but please be mindful when applying a univariate pretrained model such as Chronos to stock data, which is often heavily influenced by external factors. :)
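For reference, enabling tf32 in plain PyTorch comes down to a couple of backend flags (a minimal sketch; the training script's own configuration may set this differently, e.g. via the Hugging Face Trainer's tf32 option):

```python
import torch

# TF32 is not a torch_dtype: parameters stay in fp32, but on Ampere (and newer)
# GPUs, CUDA matmuls and cuDNN convolutions run on TF32 tensor cores.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# With the Hugging Face Trainer, the rough equivalent is TrainingArguments(tf32=True).
```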

CoCoNuTeK (Author) commented

I mean, for long-term predictions, sure, but some day-trading stuff could work if I try, say, 1 tick = 5 minutes; it could hopefully find interesting stuff. I will let you know, if you want.

lostella reopened this on Jul 16, 2024