Graph neural networks (GNNs) have achieved remarkable success on learning tasks over graph-structured data, yet challenges such as oversquashing and oversmoothing limit their scalability to large graphs with long-range dependencies. This paper evaluates the Cluster Normalize Activate (CNA) architecture, designed to mitigate these issues, on the Long-Range Graph Benchmark (LRGB) datasets. Building on these results, we propose a novel architecture, Cluster Sample Pass (CSP), which integrates clustering, pooling, and probabilistic sampling to simplify graph structure while preserving critical information. CSP coarsens a graph by grouping nodes into clusters and sampling nodes from each cluster, which retains intra-cluster diversity. Subsequent GNN layers operate on the reduced graph, enabling scalable long-range dependency modeling. Experiments demonstrate that CSP improves computational efficiency and maintains competitive predictive performance, showing promise for addressing the limitations of current GNN architectures on complex graph tasks.
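To make the cluster-then-sample idea concrete, here is a minimal toy sketch. It is not the implementation from this repository. It assumes that cluster assignments are already given, that nodes are sampled uniformly at random within each cluster, and that an edge is kept only when both of its endpoints survive. The function name `coarsen_by_sampling`, the parameter `per_cluster`, and the edge rule are all illustrative assumptions.

```python
import random
from collections import defaultdict


def coarsen_by_sampling(edges, cluster_of, per_cluster, seed=0):
    """Toy cluster-then-sample coarsening (illustrative, not the CSP code).

    edges:       iterable of (u, v) node pairs
    cluster_of:  dict mapping node -> cluster id (assumed precomputed)
    per_cluster: maximum number of nodes kept from each cluster
    """
    rng = random.Random(seed)

    # Group nodes by their cluster id.
    members = defaultdict(list)
    for node, cluster in cluster_of.items():
        members[cluster].append(node)

    # Assumption: uniform sampling without replacement inside each cluster.
    kept = set()
    for cluster in sorted(members):
        nodes = sorted(members[cluster])
        kept.update(rng.sample(nodes, min(per_cluster, len(nodes))))

    # Assumption: keep an edge only if both endpoints were sampled.
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    cluster_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    nodes, reduced = coarsen_by_sampling(edges, cluster_of, per_cluster=2)
    print(len(nodes), reduced)
```

Sampling several nodes per cluster, rather than collapsing each cluster to a single super-node, is one simple way to keep some within-cluster variety in the reduced graph.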
Follow these steps to set up a Conda environment and install the required dependencies for this project:
- Clone the Repository
    git clone https://github.com/yzimmermann/Cluster-Sample-Pass.git
    cd Cluster-Sample-Pass
- Create the Conda Environment
    conda create -n CSP python=3.9 -y
    conda activate CSP
- Install Dependencies
  Use the provided requirements.txt file to install the necessary libraries:
    pip install -r requirements.txt
- Verify Installation
  To ensure all dependencies are correctly installed, run:
    python -c "import torch; import torch_geometric; print('Environment setup successful!')"
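If the one-line check fails, a slightly more verbose script can show which package is missing. The following is a minimal sketch. The helper name `check_modules` is hypothetical, and it only inspects the two packages named in the command above.

```python
import importlib
import importlib.util


def check_modules(names):
    """Return {module name: version string, "unknown", or None if missing}."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None  # not importable in this environment
            continue
        module = importlib.import_module(name)
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_modules(["torch", "torch_geometric"]).items():
        status = f"version {version}" if version else "MISSING"
        print(f"{name}: {status}")
```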
- Run the Code
  You can run any of the experiments mentioned in the paper by running the main.py file and referencing the desired config file.
  - The Config Files
    struct-GCN.yaml and func-GCN.yaml contain the configs for reproducing the state-of-the-art (SOTA) results on the Peptides-struct and Peptides-func datasets. Positional encoding can be added by setting encoding: True in the config file; the default is False.
    struct-coarsening-GCN.yaml and func-coarsening-GCN.yaml contain the configs for our newly proposed architecture. Modify the config file to change the parameters of the model.
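As a small illustration of the encoding switch, the sketch below flips the `encoding` key in a config's text without needing a YAML library. It assumes the key appears on its own line as `encoding: <value>`. The helper name `set_encoding` and the `model: GCN` line in the usage example are hypothetical, and the real config files may be laid out differently.

```python
import re


def set_encoding(config_text, enabled):
    """Return config_text with the `encoding` key set to True or False.

    Assumes the key sits on its own line as `encoding: <value>`.
    Raises ValueError if no such line is found, rather than guessing
    where the key belongs.
    """
    pattern = re.compile(r"^(\s*encoding\s*:\s*)\S+", flags=re.MULTILINE)
    new_text, count = pattern.subn(rf"\g<1>{bool(enabled)}", config_text, count=1)
    if count == 0:
        raise ValueError("no `encoding:` line found in config text")
    return new_text


if __name__ == "__main__":
    example = "model: GCN\nencoding: False\n"  # hypothetical config content
    print(set_encoding(example, True))
```

Editing the file by hand works just as well; a helper like this mainly helps when scripting several runs with and without positional encoding.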
  - Running the Code
      python main.py --cfg configs/<config_file_name>
    Note: You can run only one dataset at a time.
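Because each invocation handles a single dataset, several experiments can be queued by calling main.py once per config. The following is a minimal sketch, assuming it is launched from the repository root with the CSP environment active. The helper names `build_commands` and `run_all` are hypothetical, and the config list simply repeats the file names mentioned above.

```python
import subprocess
import sys

# Config names as listed in this README; adjust to the experiments you need.
CONFIGS = [
    "struct-GCN.yaml",
    "func-GCN.yaml",
    "struct-coarsening-GCN.yaml",
    "func-coarsening-GCN.yaml",
]


def build_commands(config_names, config_dir="configs"):
    """Build one `python main.py --cfg ...` command per config file."""
    return [
        [sys.executable, "main.py", "--cfg", f"{config_dir}/{name}"]
        for name in config_names
    ]


def run_all(config_names):
    """Run the experiments one after another, stopping on the first failure."""
    for command in build_commands(config_names):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(CONFIGS)
```

Using `check=True` stops the queue at the first failing run, so a failure is not buried under the output of later experiments.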
- CNA Comparison
  To run the CNA comparison, see the README: https://github.com/yzimmermann/Cluster-Sample-Pass/blob/main/CNA_comparison/README.md
