Activation checkpointing in PyTorch is a useful memory management technique: it reduces GPU memory usage by not storing intermediate activations during the forward pass, recomputing the relevant parts of the forward model on the fly during the backward pass instead. Blog post explaining more about this feature of PyTorch: https://medium.com/pytorch/how-activation-checkpointing-enables-scaling-up-training-deep-learning-models-7a93ae01ff2d
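
For reference, this is roughly what the raw PyTorch API looks like (a minimal sketch; the layer sizes and the `block` name here are just placeholders):

```python
import torch
from torch.utils.checkpoint import checkpoint

# Placeholder segment of a model; in practice this would be any expensive sub-network.
block = torch.nn.Sequential(
    torch.nn.Linear(128, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 128),
)

x = torch.randn(32, 128, requires_grad=True)

# Activations inside `block` are not stored; they are recomputed during backward.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```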
It would be neat to have this as an option in Caskade: each module could carry a checkpoint flag which, when set, makes the entire module's forward call get recomputed on the fly during the backward pass (see the sketch below).
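
A hypothetical sketch of what the flag could look like, assuming a module whose forward logic lives in a single method that can be wrapped. The `checkpoint` flag and `_forward` name are placeholders, not the actual Caskade API:

```python
import torch
from torch.utils.checkpoint import checkpoint as _ckpt


class CheckpointableModule(torch.nn.Module):
    """Placeholder base class illustrating the proposed per-module checkpoint flag."""

    def __init__(self, checkpoint: bool = False):
        super().__init__()
        self.checkpoint = checkpoint  # per-module flag proposed in this issue

    def _forward(self, x):
        # Subclasses implement their actual forward logic here.
        raise NotImplementedError

    def forward(self, x):
        if self.checkpoint and torch.is_grad_enabled():
            # Recompute this module's forward during backward instead of
            # storing its intermediate activations.
            return _ckpt(self._forward, x, use_reentrant=False)
        return self._forward(x)
```

One wrinkle a real implementation would have to handle: `torch.utils.checkpoint` only tracks gradients through tensor arguments, so modules whose forward takes non-tensor or keyword arguments would need some extra plumbing.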