DeepSpeed pipeline parallelization

DeepSpeed is a parallelization package for training large neural networks across multiple GPUs. One of its features is [pipeline parallelism](https://deepspeed.readthedocs.io/en/latest/pipeline.html), which works by taking a PyTorch `nn.Sequential` object and splitting its layers across different GPUs. Since Caskade is a nested, sequential code, I suspect we could modify it to enable this kind of staged evaluation, which would open the door to extremely large models and complex nested module structures.
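To make the idea concrete, here is a minimal pure-Python sketch of the splitting step: an ordered list of layers is cut into contiguous stages, and a value is passed through the stages in order. This is not DeepSpeed's API and not Caskade code. The names `partition_layers`, `run_stage`, and `run_pipeline` are hypothetical, the "layers" are plain functions standing in for modules, and device placement is only indicated in comments.

```python
from typing import Callable, List, Sequence

# A "layer" here is any callable mapping a value to a value (stand-in for a module).
Layer = Callable[[float], float]


def partition_layers(layers: Sequence[Layer], num_stages: int) -> List[List[Layer]]:
    """Split layers into num_stages contiguous groups of near-equal size."""
    if not 1 <= num_stages <= len(layers):
        raise ValueError("num_stages must be between 1 and len(layers)")
    base, extra = divmod(len(layers), num_stages)
    stages: List[List[Layer]] = []
    start = 0
    for i in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if i < extra else 0)
        stages.append(list(layers[start:start + size]))
        start += size
    return stages


def run_stage(stage: Sequence[Layer], x: float) -> float:
    """Evaluate one stage; in a real pipeline this would run on a single GPU."""
    for layer in stage:
        x = layer(x)
    return x


def run_pipeline(stages: Sequence[Sequence[Layer]], x: float) -> float:
    """Pass x through every stage in order.

    Assumption: in a real pipeline the intermediate value would be transferred
    between devices at each stage boundary; here it is just passed along.
    """
    for stage in stages:
        x = run_stage(stage, x)
    return x


# Toy example: five layers split across two stages.
layers: List[Layer] = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
    lambda x: x / 2,
]
stages = partition_layers(layers, num_stages=2)
result = run_pipeline(stages, 1.0)
```

The staged result matches plain sequential evaluation of the same layers, which is the property any Caskade adaptation would need to preserve. The sketch leaves out micro-batching, which is what lets the stages work concurrently in an actual pipeline.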

DeepSpeed pipeline parallelization #55
