This repository contains scripts for network simulation using the CODES discrete-event simulation framework. Specifically, all scripts here run the binary model-net-mpi-replay, which simulates the behaviour of one or more jobs running on an HPC network.
- individual-scripts/ - Scripts to run one experiment at a time
- mpi_replay/ - Python 3.13 script to run a battery of tests
- visualizing_jobs/ - Visualizes the time of each iteration for all jobs in an experiment
Each experiment directory contains:
- conf/ - Configuration files for network topology and simulation parameters
- results/ - Output from simulation runs (logs, statistics, performance data)
- A Python or shell script to run experiments with specific parameters
Feel free to copy individual scripts (or their entire subdirectory) and modify them to run new scenarios. Within mpi_replay/, you should be able to make a copy of run_mpi_surrogacy_experiments.py to run a series of experiments.
- individual-scripts/dfly-1056/ - Dragonfly topology experiments with 1,056 nodes
- individual-scripts/dfly-72/ - Dragonfly topology experiments with 72 nodes
- individual-scripts/dfly-8448/ - Dragonfly topology experiments with 8,448 nodes
- individual-scripts/torus-64/ - Torus topology experiments with 64 nodes
Once you have got some results, you can visualize how long each iteration took. Simply run:
python3 visualizing_jobs/print-iterations.py path-to/results/exp-XXX/experiment-name/iteration-logs/
For command line help, run python3 visualizing_jobs/print-iterations.py --help.
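When a results folder holds several experiments, it can be handy to list every iteration-logs directory and the matching visualization command. The following is a minimal sketch, assuming only the path pattern shown above (a directory named iteration-logs somewhere under a results folder). The helper names find_iteration_log_dirs and build_print_commands are hypothetical and are not part of this repository. The sketch only builds the commands and does not run them.

```python
from pathlib import Path


def find_iteration_log_dirs(results_root):
    """Return every directory named 'iteration-logs' under results_root, sorted."""
    root = Path(results_root)
    return sorted(p for p in root.rglob("iteration-logs") if p.is_dir())


def build_print_commands(results_root, script="visualizing_jobs/print-iterations.py"):
    """Build one argv list per iteration-logs directory (nothing is executed)."""
    return [
        ["python3", script, str(log_dir)]
        for log_dir in find_iteration_log_dirs(results_root)
    ]
```

Each returned list can be printed for inspection or handed to subprocess.run once the paths look right.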
Run experiments using the provided run-experiment.sh script. It assumes that you have compiled CODES using the CODES-compile-instructions.sh script (available at https://github.com/codes-org/codes) and that you have downloaded this repository into the same directory where that script resides:
bash run-experiment.sh path-to-experiment/script.sh
# or in the case of mpi_replay
bash run-experiment.sh mpi_replay/run_mpi_surrogacy_experiments.py
# or in case you want to pass arguments to your experiment script, you can simply
bash run-experiment.sh path-to-experiment/script.sh --argument some-file.txt --other-arg
Results are automatically stored in path-to-experiment/results/.
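To launch several experiments from Python, the invocations above can be assembled programmatically. This is a minimal sketch, not part of the repository: build_run_command and as_shell_line are hypothetical helper names, and the script path and arguments are the placeholders used above.

```python
import shlex


def build_run_command(experiment_script, *script_args, runner="run-experiment.sh"):
    """Return the argv list for launching one experiment through the runner script."""
    return ["bash", runner, experiment_script, *script_args]


def as_shell_line(argv):
    """Quote each argument so the line can be pasted safely into a shell."""
    return shlex.join(argv)


# Placeholder script and arguments, copied from the examples above.
cmd = build_run_command(
    "path-to-experiment/script.sh", "--argument", "some-file.txt", "--other-arg"
)
print(as_shell_line(cmd))
```

The argv list can be passed to subprocess.run(cmd, check=True) from the directory that contains the runner script. Passing runner="run-sbatch.sh" builds the equivalent command for the sbatch wrapper.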
To run an experiment with sbatch, use the script run-sbatch.sh instead of run-experiment.sh. The run-sbatch.sh script runs the experiments in a different folder from the one containing the script. This is because, on systems where sbatch is needed, data is often stored in a different folder from the one where the script is launched. Under this new folder, the script creates a results/ subfolder just as run-experiment.sh does.
Requires the CODES simulation framework to be built and configured. It currently works with commit 73cdbd5 of CODES.
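One way to confirm which CODES commit is checked out is to compare the output of git rev-parse HEAD against the short hash given above. The sketch below assumes that CODES was obtained as a git clone. The function names and the codes_dir parameter are hypothetical and not part of this repository.

```python
import subprocess

EXPECTED_COMMIT = "73cdbd5"  # short hash named in this README


def commit_matches(head_hash, expected=EXPECTED_COMMIT):
    """True if the full hash starts with the expected short hash."""
    return head_hash.strip().lower().startswith(expected.lower())


def current_commit(codes_dir):
    """Ask git for the full HEAD hash of the checkout at codes_dir (assumed path)."""
    result = subprocess.run(
        ["git", "-C", codes_dir, "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, commit_matches(current_commit("path-to/codes")) returns False when a different commit is checked out.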