Commit aa3118a ("dependencies")

1 parent c27bdc6

File tree

2 files changed: +40, -5 lines

README.md

Lines changed: 35 additions & 5 deletions
# Plexus

Plexus is a 3D parallel framework for large-scale distributed GNN training.

## Dependencies

To use Plexus, you'll need the following dependencies:
* **Python 3.11.7:** It's recommended to use a virtual environment to manage your Python dependencies. You can create one using `venv`:

```bash
python -m venv <your_env_name>
source <your_env_name>/bin/activate
```

* **CUDA 12.4:** Required for running Plexus on GPUs. On systems that manage software with environment modules, you can load it with the following command (the Perlmutter run script under the examples directory does this):
```bash
module load cudatoolkit/12.4
```

* **NCCL:** The NVIDIA Collective Communications Library (NCCL) is required for multi-GPU communication. On systems where modules are used (like Perlmutter), you can load it with:

```bash
module load nccl
```

* **Python Dependencies:** Once your virtual environment is set up, you can install the required Python packages using `pip` and the `requirements.txt` file provided in the repository:

```bash
pip install -r requirements.txt
```
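After installing, you may want to confirm that the pinned versions are actually present before launching a run. A minimal sketch (the `check_pins` helper is illustrative, not part of Plexus; the version table mirrors `requirements.txt`):

```python
# Sanity-check pinned dependencies. Packages that are not yet installed
# are reported as "missing" rather than raising, so this can run either
# before or after `pip install -r requirements.txt`.
from importlib import metadata

# Pins copied from requirements.txt
PINNED = {
    "axonn": "0.2.0",
    "numpy": "2.2.3",
    "ogb": "1.3.6",
    "torch": "2.6.0",
    "torch_geometric": "2.6.1",
}

def check_pins(pins):
    """Return {name: status}: 'ok', 'missing', or the mismatched version found."""
    report = {}
    for name, want in pins.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if have == want else have
    return report

if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print(f"{name}: {status}")
```

Anything other than `ok` in the report means the environment diverges from the pinned setup.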
## Directory Structure
* **benchmarking**: Contains a serial implementation using PyTorch Geometric (PyG) for validation and testing. Additionally, it includes utilities for benchmarking Sparse Matrix-Matrix Multiplication (SpMM) operations, a key component in GNN computations.
* **examples**: Offers a practical demonstration of how to leverage Plexus to parallelize a GNN model. This directory includes example scripts for running the parallelized training, as well as utilities for parsing the resulting performance data.
* **performance**: Houses files dedicated to modeling the performance characteristics of parallel GNN training. This includes models for communication overhead, computation costs (specifically SpMM), and memory utilization.
* **plexus**: Contains the core logic of the Plexus framework. This includes the parallel implementation of a Graph Convolutional Network (GCN) layer, along with utility functions for dataset preprocessing, efficient data loading, and other essential components for distributed GNN training.
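As noted above, SpMM is the key kernel in GNN computation: a GCN layer computes H' = σ(Â H W), where Â is the normalized sparse adjacency matrix. A minimal serial sketch with SciPy, in the spirit of the PyG reference implementation in the benchmarking directory (illustrative only; `gcn_layer` is a hypothetical helper, not Plexus's parallel implementation):

```python
import numpy as np
import scipy.sparse as sp

def gcn_layer(adj, h, w):
    """One GCN layer: normalize the adjacency, SpMM with node features,
    then a dense matmul with the weights and a ReLU.
    adj: sparse (N x N) adjacency, h: (N x F) features, w: (F x F') weights."""
    a = adj + sp.identity(adj.shape[0], format="csr")  # add self-loops
    deg = np.asarray(a.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    a_hat = d_inv_sqrt @ a @ d_inv_sqrt                # symmetric normalization
    return np.maximum(a_hat @ h @ w, 0.0)              # SpMM, dense matmul, ReLU

# Tiny 3-node path graph as a demo
adj = sp.csr_matrix(np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float))
h = np.eye(3)            # one-hot features
w = np.ones((3, 2))      # toy weight matrix
out = gcn_layer(adj, h, w)
print(out.shape)  # (3, 2)
```

The `a_hat @ h` product is the SpMM step that Plexus parallelizes across the 3D process grid; the serial version above is only meant to show what is being distributed.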

requirements.txt

Lines changed: 5 additions & 0 deletions

```
axonn==0.2.0
numpy==2.2.3
ogb==1.3.6
torch==2.6.0
torch_geometric==2.6.1
```
