Open
Description
DeepSpeed is a library for training large neural networks in parallel across multiple GPUs. One of its features is pipeline parallelism, which takes a sequential PyTorch model (e.g. a `torch.nn.Sequential`) and splits its layers across different GPUs. Since Caskade is built around nested modules that are evaluated in sequence, I suspect we could modify it to support the same kind of staged evaluation, which would open the door to extremely large models and complex nested module structures.
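To make the idea concrete, here is a rough, dependency-free sketch of what "splitting sequential layers into stages" means. It uses neither DeepSpeed nor Caskade; `partition_layers` and `run_pipeline` are hypothetical names made up for illustration, and the "layers" are plain Python callables standing in for modules. In a real setup each stage would sit on its own GPU and only the intermediate value would be passed between devices.

```python
from typing import Callable, List, Sequence

# A "layer" here is any callable mapping a value to a value (stand-in for a module).
Layer = Callable[[float], float]


def partition_layers(layers: Sequence[Layer], num_stages: int) -> List[List[Layer]]:
    """Split `layers` into `num_stages` contiguous groups of near-equal size.

    Hypothetical helper: a real partitioner might balance by parameter count
    or compute cost rather than by number of layers.
    """
    if num_stages < 1 or num_stages > len(layers):
        raise ValueError("num_stages must be between 1 and len(layers)")
    base, extra = divmod(len(layers), num_stages)
    stages: List[List[Layer]] = []
    start = 0
    for i in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if i < extra else 0)
        stages.append(list(layers[start:start + size]))
        start += size
    return stages


def run_pipeline(stages: List[List[Layer]], x: float) -> float:
    """Evaluate the stages in order, handing each stage's output to the next."""
    for stage in stages:  # in practice: one stage per device
        for layer in stage:
            x = layer(x)
    return x


# Five toy layers split across two stages.
layers: List[Layer] = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
    lambda x: x + 10,
]
stages = partition_layers(layers, num_stages=2)
result = run_pipeline(stages, 1.0)
```

The staged evaluation gives the same result as running the layers one after another; the only change is where each contiguous chunk would be executed. Whether Caskade's nested module graph can be flattened into contiguous stages like this is the open question.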