- This project was built during the 'Deep Learning Jeju Camp' in 2018.
- This project is currently under construction.
The goal of this project is to control multiple agents using graph neural networks. Multi-agent scenarios are usually sparsely rewarded. Graph neural networks have the advantage that each node can be trained robustly, so we hypothesize that each agent in an environment can be controlled individually. Since graph neural networks have been studied extensively, we would like to apply them to reinforcement learning.
For our experiments we use the Pommerman environment, which has relatively strict constraints on its environment settings and is simple to deploy algorithms in.
- The proposed architecture is structured in two stages: generating a graph and executing optimal actions.
- Inspired by the curiosity-driven exploration paper, we use self-supervised prediction to infer the environment. Taking previous states and actions, the first network infers the environment, from which a graph is generated.
- Afterward, each agent executes optimal actions based on the trained graph.
- The network design of the prototype is shown below, followed by a minimal code sketch of the two stages.
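A minimal sketch of the two-stage idea, assuming PyTorch and placeholder module/dimension names that are not taken from the actual prototype: a forward-model encoder predicts the next observation from previous observations and actions (the self-supervised signal), its hidden features are scored pairwise into a soft adjacency matrix over agents, and a per-node policy head then picks an action for each agent.

```python
import torch
import torch.nn as nn

class GraphGenerator(nn.Module):
    """Stage 1: infer the environment from (state, action) history and emit a graph over agents."""
    def __init__(self, obs_dim, act_dim, hidden_dim, n_agents):
        super().__init__()
        self.n_agents = n_agents
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Self-supervised prediction head (curiosity-style forward model).
        self.predictor = nn.Linear(hidden_dim, obs_dim)
        # Pairwise scores between agent features form a soft adjacency matrix.
        self.edge_scorer = nn.Bilinear(hidden_dim, hidden_dim, 1)

    def forward(self, prev_obs, prev_act):
        # prev_obs: (n_agents, obs_dim), prev_act: (n_agents, act_dim) one-hot
        h = self.encoder(torch.cat([prev_obs, prev_act], dim=-1))
        pred_next_obs = self.predictor(h)  # trained with e.g. MSE against the true next observation
        src = h.unsqueeze(1).expand(-1, self.n_agents, -1).reshape(-1, h.size(-1))
        dst = h.unsqueeze(0).expand(self.n_agents, -1, -1).reshape(-1, h.size(-1))
        adj = torch.sigmoid(self.edge_scorer(src, dst).view(self.n_agents, self.n_agents))
        return h, adj, pred_next_obs

class NodePolicy(nn.Module):
    """Stage 2: one message-passing step over the graph, then a per-agent action head."""
    def __init__(self, hidden_dim, n_actions):
        super().__init__()
        self.message = nn.Linear(hidden_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, h, adj):
        msg = adj @ self.message(h)            # aggregate neighbour features over the graph
        return self.head(torch.relu(h + msg))  # (n_agents, n_actions) action logits
```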
- Pommerman is sponsored by NVIDIA, FAIR, and Google AI.
- For each agent: a 372-dimensional observation space (board, bomb_blast_strength, bomb_life, position, blast_strength, can_kick, teammate, ammo, enemies) and 6 discrete actions (stop, up, down, left, right, bomb).
- Free-for-all and team match modes are available; a usage sketch follows below.
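A short sketch of running a free-for-all match with the playground package and flattening one agent's observation dictionary into the 372-dimensional vector described above. The dictionary keys follow the default observation format and may differ across playground versions.

```python
import numpy as np
import pommerman
from pommerman import agents

def flatten_obs(obs):
    """Concatenate one agent's observation dict into a single 372-dimensional vector."""
    return np.concatenate([
        obs['board'].flatten().astype(np.float32),                # 11 x 11 = 121
        obs['bomb_blast_strength'].flatten().astype(np.float32),  # 121
        obs['bomb_life'].flatten().astype(np.float32),            # 121
        np.array(obs['position'], dtype=np.float32),              # 2
        np.array([obs['blast_strength'], float(obs['can_kick']),
                  obs['teammate'].value, obs['ammo']], dtype=np.float32),  # 4
        np.array([e.value for e in obs['enemies']], dtype=np.float32),     # 3
    ])

# Four agents are required; SimpleAgent is the built-in heuristic baseline.
agent_list = [agents.SimpleAgent() for _ in range(4)]
env = pommerman.make('PommeFFACompetition-v0', agent_list)

state = env.reset()
done = False
while not done:
    actions = env.act(state)                        # one of the 6 discrete actions per agent
    state, reward, done, info = env.step(actions)
    features = [flatten_obs(obs) for obs in state]  # (4, 372) inputs for the graph network
env.close()
```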
- Attach a self-attention module to the graph-generation stage (see the sketch after this list)
- Substitute the execution stage with NerveNet
- Compare with random and heuristic agents
- Redraw the network structure
- Prepare an arXiv paper
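As a placeholder for the self-attention item above, the sketch below shows a single-head, scaled dot-product self-attention layer over per-agent node features, in the spirit of Graph Attention Networks; module and dimension names are illustrative only.

```python
import torch
import torch.nn as nn

class NodeSelfAttention(nn.Module):
    """Single-head self-attention over per-agent node features."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.q = nn.Linear(hidden_dim, hidden_dim)
        self.k = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h):
        # h: (n_agents, hidden_dim)
        scores = self.q(h) @ self.k(h).t() / h.size(-1) ** 0.5
        attn = torch.softmax(scores, dim=-1)  # attention weights double as a soft adjacency matrix
        return attn @ self.v(h), attn
```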
- Graph Attention Networks
- Relational Deep Reinforcement Learning
- NerveNet: Learning Structured Policy with Graph Neural Networks
- Curiosity-driven Exploration by Self-supervised Prediction
- Playground: AI Research into Multi-Agent Learning
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
Apache