This is an implementation of communication in multi-agent reinforcement learning (MARL) using Graph Neural Networks. It has been trained and tested on StarCraft II, and it has shown improved training and performance metrics across all the maps. I have implemented it on top of PyMARL for easier comparative study with other algorithms and implementations such as ePyMARL.
The following algorithms are currently available for training:
- QMIX: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- COMA: Counterfactual Multi-Agent Policy Gradients
- VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
- IQL: Independent Q-Learning
- QTRAN: QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
For communication we have used two different architectures.
More information about the architectures and their execution can be found at MultiAgent GNN; a brief outline is as follows.
Pipeline for communication using a Graph Neural Network
The implementation is written in PyTorch and uses a modified version of SMAC, found in `smac-py`, which includes the adjacency matrix in the observation; more detail on it can be found here.
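As a rough sketch of the idea (illustrative only, not the repo's actual module or API), one round of GNN communication over the adjacency matrix delivered by the modified SMAC could look like this in pytorch_geometric:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv
from torch_geometric.utils import dense_to_sparse

class GNNCommAgent(nn.Module):
    """Hypothetical sketch: each agent encodes its observation, exchanges one
    round of messages with neighbours given by the adjacency matrix, and
    outputs per-agent Q-values. Names and shapes are illustrative."""
    def __init__(self, obs_dim, hidden_dim, n_actions):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.conv = GCNConv(hidden_dim, hidden_dim)  # one message-passing round
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs, adj):
        # obs: (n_agents, obs_dim) tensor; adj: (n_agents, n_agents) 0/1 tensor
        h = torch.relu(self.encoder(obs))
        edge_index, _ = dense_to_sparse(adj)      # dense adjacency -> COO edge list
        h = torch.relu(self.conv(h, edge_index))  # aggregate neighbours' messages
        return self.q_head(h)                     # per-agent Q-values
```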
For a glimpse of the algorithm in action, check out the Output section.
I have used the default installation given in PyMARL, to which I have added a few changes to work with the latest version of PyTorch (1.10.0 at the time of documentation) and added the requirements for pytorch_geometric.
- In the PyMARL repo, the required versions of CUDA and PyTorch are very old.
- The whole codebase has been shifted to the latest torch 1.10.0 and CUDA 11.3; hence a custom installation is recommended.
- Use of the current Dockerfile is deprecated.
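A custom installation along these lines might look like the following; the wheel index URLs are the standard PyTorch and PyG ones for torch 1.10.0 with CUDA 11.3, but treat the exact pins as an assumption and adjust them to your CUDA setup:

```shell
# PyTorch 1.10.0 built against CUDA 11.3
pip install torch==1.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
# PyG compiled dependencies, matched to the same torch/CUDA build
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
pip install torch-geometric
```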
Build the Dockerfile using:

```shell
cd docker
bash build.sh
```
Set up StarCraft II and SMAC:

```shell
bash install_sc2.sh
```
After downloading SC2, install the remaining dependencies:

```shell
pip install -r requirements.txt
pip install -e smac-py
```
The `install_sc2.sh` script will download SC2 into the `3rdparty` folder and copy over the maps necessary for the experiments.
Run an experiment:

```shell
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z
```
The config files act as defaults for an algorithm or environment. They are all located in `src/config`. `--config` refers to the config files in `src/config/algs`, and `--env-config` refers to the config files in `src/config/envs`. All results will be stored in the `Results` folder. The previous config files used for the SMAC Beta have the suffix `_beta`.
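Since PyMARL is built on Sacred, individual config values can also be overridden on the command line after the `with` keyword. For example (`t_max` is a standard PyMARL key, but check `src/config/default.yaml` and the algorithm configs for the exact names):

```shell
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=3m t_max=2005000
```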
An adjacency matrix encodes the edges of the graph, i.e., which pairs of agents (nodes) are connected. For the current problem we have used a few heuristics for joining two nodes with an edge (a sketch of the construction follows the list). They are as below:
- Communication distance: even though there is no hard restriction on communication range, keeping communication local improves cooperation in shared tasks.
- Unit type: many tasks benefit more from similar units performing a given part of the task together than from dissimilar units cooperating with each other.
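As a rough illustration of these heuristics, here is a minimal sketch of how such an adjacency matrix could be built from agent positions and unit types; `build_adjacency`, `comm_range`, and `same_type_only` are hypothetical names, not the repo's API:

```python
import numpy as np

def build_adjacency(positions, unit_types, comm_range=9.0, same_type_only=False):
    """Hypothetical helper: connect agents i and j with an edge when they are
    within comm_range of each other and, optionally, share the same unit type."""
    n = len(positions)
    adj = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = np.linalg.norm(positions[i] - positions[j])
            if dist <= comm_range and (not same_type_only or unit_types[i] == unit_types[j]):
                adj[i, j] = 1.0
    return adj

# Example with illustrative values: two nearby units of the same type plus one far away
positions = np.array([[0.0, 0.0], [3.0, 4.0], [20.0, 20.0]])
unit_types = [0, 0, 1]
print(build_adjacency(positions, unit_types))  # agents 0 and 1 are linked; agent 2 is isolated
```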
Below are the training and test metrics of the presented algorithm with QMIX on the map 2s3z. The study is limited in the number of experiments due to the computational resources at my disposal. The presented algorithm does support parallel environments to speed up training; this will be tested soon.
Train plots: battle win percentage and average return.

Test plots: battle win percentage and average return.
This is a demo output from the policy whose statistics are given above.
Weights and logs can be found here: [Drive].