Pull requests: aws-samples/awsome-distributed-training

Pull requests list

Make FSDP example to use preinstalled aws-ofi-nccl
#539 opened Jan 30, 2025 by KeitaW
Update megatron lm testcase
#537 opened Jan 30, 2025 by KeitaW Draft
add tips to force NCCL comm to go through EFA
#531 opened Jan 23, 2025 by KeitaW
add os grafana stack
#526 opened Jan 16, 2025 by KeitaW
ec2 get metadata replacement
#515 opened Dec 10, 2024 by gmgtamz
easy smhp slurm and eks
#514 opened Dec 10, 2024 by gmgtamz
Update pcluster architecture guidance (label: enhancement)
#464 opened Oct 23, 2024 by KeitaW Draft
add GPU accounting for SMHP
#462 opened Oct 21, 2024 by KeitaW
Update bionemo test case + propose to subdirectories per orchastrator (label: documentation)
#396 opened Aug 5, 2024 by KeitaW Draft
Esm2 on Sagemaker Hyperpod
#387 opened Jul 25, 2024 by awsankur
update dependencies of PyTorch base image
#375 opened Jul 15, 2024 by KeitaW
Neuron distributed
#359 opened Jun 13, 2024 by KeitaW
End-to-End LLM Model Development with Torchtitan and Torchtune (label: enhancement)
#341 opened May 20, 2024 by KeitaW
Llama training with FP8
#331 opened May 15, 2024 by pbelevich Draft
Add draft gpu troubles
#290 opened Apr 30, 2024 by mhuguesaws Draft
[WIP] torchtune usecase
#260 opened Apr 12, 2024 by pbelevich Draft
Bump pytorch dockerfile template
#211 opened Mar 12, 2024 by verdimrc
SMHP: slurm exporter to report gpu metrics
#181 opened Mar 6, 2024 by verdimrc
Update organization and tag to V1
#150 opened Feb 22, 2024 by perifaws