Implementation of the decentralized learning algorithm RelaySGD in Bagua, written for my Bachelor's thesis.
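The core of RelaySGD is the RelaySum mechanism: on a spanning tree, each worker forwards to a neighbor the sum of its own model and everything it received from its other neighbors, together with a count, so every worker can reconstruct the exact global average. The NumPy snippet below is a minimal sketch of that averaging step on a chain topology, not the Bagua implementation; it ignores the one-hop-per-step message delays of the real algorithm (both relay passes complete within a single call), and the function name is illustrative.

```python
import numpy as np

def relay_average_chain(models):
    """Sketch: exact averaging on a chain via relayed partial sums.
    Returns one averaged copy per worker."""
    n = len(models)
    # Pass 1 (left to right): the message i -> i+1 carries the sum
    # and count of all models at positions 0..i.
    from_left_sum, from_left_cnt = [None] * n, [0] * n
    acc, cnt = np.zeros_like(models[0]), 0
    for i in range(n):
        acc = acc + models[i]; cnt += 1
        from_left_sum[i], from_left_cnt[i] = acc.copy(), cnt
    # Pass 2 (right to left): the message i -> i-1 carries positions i..n-1.
    from_right_sum, from_right_cnt = [None] * n, [0] * n
    acc, cnt = np.zeros_like(models[0]), 0
    for i in range(n - 1, -1, -1):
        acc = acc + models[i]; cnt += 1
        from_right_sum[i], from_right_cnt[i] = acc.copy(), cnt
    # Each worker combines its own model with the two incoming messages,
    # recovering the exact global average.
    averaged = []
    for i in range(n):
        total, count = models[i].copy(), 1
        if i > 0:
            total = total + from_left_sum[i - 1]; count += from_left_cnt[i - 1]
        if i < n - 1:
            total = total + from_right_sum[i + 1]; count += from_right_cnt[i + 1]
        averaged.append(total / count)
    return averaged

# Quick check: every worker ends up with the global mean.
models = [np.full(3, float(i)) for i in range(4)]
for m in relay_average_chain(models):
    assert np.allclose(m, 1.5)
```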
You can run the benchmark using an installed version of bagua with:

```sh
python3 -m bagua.distributed.launch --nproc_per_node=<number of gpus> benchmark.py --algorithm relay
```
You can also pass additional parameters:

```sh
python3 -m bagua.distributed.launch --nproc_per_node=<number of gpus> benchmark.py --algorithm relay --lr <learning rate> --alpha <data heterogeneity parameter> --topology <relay topology, e.g. chain>
```
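For example, to run on 8 GPUs with a chain topology (the values below are illustrative, not tuned defaults):

```sh
python3 -m bagua.distributed.launch --nproc_per_node=8 benchmark.py --algorithm relay --lr 0.1 --alpha 0.5 --topology chain
```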
The logs folder contains the output of all the runs.
To tune the hyperparameters, modify and run the scripts hpt_relay.sh and hpt_rest.sh. Their output is saved in the logs folder as summary*.txt. The final_run.sh script executes the experiment described below using the best learning rates on 8 GPUs.
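For orientation, here is a hypothetical sketch of the kind of learning-rate sweep such a tuning script might perform; the actual hpt_relay.sh in the repository may differ in its grid, flags, and output handling:

```sh
#!/bin/bash
# Hypothetical sweep sketch, not the repository's actual script.
mkdir -p logs
for lr in 0.01 0.05 0.1 0.5; do
  python3 -m bagua.distributed.launch --nproc_per_node=8 benchmark.py \
    --algorithm relay --lr "$lr" --topology chain \
    >> logs/summary_relay.txt
done
```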
The second experiment evaluates the throughput of the different algorithms; it is launched via synth_benchmark_run.sh.
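Assuming it is a standard shell script, it can be started from the repository root with:

```sh
bash synth_benchmark_run.sh
```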


