Add an event related to forward in the TrainerCallback #36496

Open
wants to merge 1 commit into
base: main

@yanadrdr yanadrdr commented Mar 2, 2025

Added an "on_compute_loss" event to TrainerCallback and CallbackHandler. The event is fired in the compute_loss method of Trainer and receives the inputs and the loss. The goal is to provide an event tied to the model forward pass (to track activation memory and inspect details of the inputs). I considered firing the event outside compute_loss, at every place where we run a forward pass on the model, but that seemed less generic and messier.
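
To make the wiring concrete, here is a minimal, dependency-free sketch of how such an event could be dispatched, assuming the handler forwards `inputs` and `loss` as keyword arguments. `SketchCallback`, `SketchCallbackHandler`, `SketchTrainer`, `RecordingCallback`, and `toy_model` are hypothetical stand-ins, not the actual `transformers` classes, and the signature in the real diff may differ.

```python
class SketchCallback:
    """Stand-in for TrainerCallback: every event is a no-op by default."""

    def on_compute_loss(self, args, state, control, **kwargs):
        pass


class SketchCallbackHandler:
    """Stand-in for CallbackHandler: fans one event out to all callbacks."""

    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def on_compute_loss(self, args, state, control, inputs, loss):
        # Assumption: inputs and loss are passed through as keyword arguments.
        for callback in self.callbacks:
            callback.on_compute_loss(args, state, control, inputs=inputs, loss=loss)
        return control


class SketchTrainer:
    """Stand-in for Trainer, reduced to the compute_loss path."""

    def __init__(self, model, callbacks):
        self.model = model
        self.handler = SketchCallbackHandler(callbacks)
        self.args, self.state, self.control = {}, {}, {}

    def compute_loss(self, inputs):
        loss = self.model(inputs)  # forward pass
        # Fire the event once, right after the forward pass.
        self.control = self.handler.on_compute_loss(
            self.args, self.state, self.control, inputs=inputs, loss=loss
        )
        return loss


class RecordingCallback(SketchCallback):
    """Example callback that just remembers what it was given."""

    def __init__(self):
        self.seen = []

    def on_compute_loss(self, args, state, control, inputs=None, loss=None, **kwargs):
        self.seen.append((inputs, loss))


def toy_model(inputs):
    # Hypothetical "model": mean squared value of the inputs.
    return sum(x * x for x in inputs["x"]) / len(inputs["x"])
```

Firing the event from inside `compute_loss` keeps a single call site, which is the trade-off described above: every code path that computes the loss triggers the callback without each forward-pass location needing its own hook.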

What does this PR do?

This PR adds an "on_compute_loss" event to TrainerCallback and CallbackHandler. In my case, I use this event to log memory metrics so that I can track activation memory consumption, but it can also be used to inspect statistics of the model inputs, and possibly for other things.
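
As an illustration of the use case, a callback consuming this event might look roughly like the sketch below. It assumes the event receives `inputs` and `loss` as keyword arguments and that `inputs["input_ids"]` is a batch of sequences. `ComputeLossStatsCallback` and its `memory_probe` parameter are hypothetical names introduced here, and the class is deliberately written without inheriting from `TrainerCallback` so it runs without any dependencies.

```python
class ComputeLossStatsCallback:
    """Hypothetical callback that records input stats and memory per forward pass."""

    def __init__(self, memory_probe=None):
        # memory_probe: zero-argument callable returning bytes in use.
        # It is an assumption of this sketch, injected so the example stays
        # framework-agnostic; it defaults to a probe that always reports 0.
        self.memory_probe = memory_probe or (lambda: 0)
        self.records = []

    def on_compute_loss(self, args, state, control, inputs=None, loss=None, **kwargs):
        # Per-sequence lengths in the batch (works for nested lists or 2-D tensors).
        lengths = [len(seq) for seq in inputs["input_ids"]]
        self.records.append(
            {
                "batch_size": len(lengths),
                "max_seq_len": max(lengths, default=0),
                "num_tokens": sum(lengths),
                "memory_bytes": self.memory_probe(),
                "loss": float(loss),
            }
        )
```

With PyTorch on a CUDA device, one plausible probe would be `torch.cuda.max_memory_allocated`, passed as `memory_probe=torch.cuda.max_memory_allocated`; whether peak or current allocation is the better signal for activation memory depends on the setup.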

The motivation is discussed here:
#36012

@SunMarc

@github-actions github-actions bot marked this pull request as draft March 2, 2025 14:11

github-actions bot commented Mar 2, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@yanadrdr yanadrdr marked this pull request as ready for review March 2, 2025 15:36