This project explores the power of teaching an agent through Reinforcement Learning (RL) to navigate a Banana World.
The agent uses a DQN network with the Deep Q-Learning algorithm to learn how to navigate the virtual world efficiently while collecting bananas.
The implementation offers options for: Vanilla DQN, Double DQN, Dueling DQN, and Prioritized Experience Replay (PER) DQN.
Please check the Instructions section on how to activate each of these options.
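As background on these options, the sketch below illustrates how the TD target typically differs between Vanilla DQN and Double DQN. It is illustrative only; the function and tensor names are assumptions, not taken from agent.py.

```python
import torch

# Illustrative sketch (not the repository's exact code): how the TD target
# differs between Vanilla DQN and Double DQN for a sampled replay batch.
# `q_local` and `q_target` are assumed to be two networks with identical
# architecture; `rewards`, `next_states`, `dones` are batch tensors.
def td_targets(q_local, q_target, rewards, next_states, dones, gamma=0.99, double=False):
    with torch.no_grad():
        if double:
            # Double DQN: the local network picks the greedy action,
            # the target network evaluates it.
            best_actions = q_local(next_states).argmax(dim=1, keepdim=True)
            next_q = q_target(next_states).gather(1, best_actions)
        else:
            # Vanilla DQN: the target network both picks and evaluates.
            next_q = q_target(next_states).max(dim=1, keepdim=True)[0]
    return rewards + gamma * next_q * (1 - dones)
```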
- You need to have the requirements installed (especially mlagents==0.4.0). Due to deprecated libraries, I've included a python folder which helps with the installation of the system.
- Clone the repository:
git clone https://github.com/joao-d-oliveira/RL-SmartAgent-BananaGame.git
- Go to the python folder:
cd RL-SmartAgent-BananaGame/python
- Compile and install the needed libraries:
pip install .
- Download the environment from one of the links below. You need only download the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
(For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.
(For AWS or Colab) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the environment.
2.1 In case you prefer to test the Visual Environment (where the states are given by the game's video frames instead of a state vector), please download one of these instead:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
- Place the downloaded file in the RL-SmartAgent-BananaGame folder of the repository, and unzip (or decompress) it.
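Once the file is unzipped, the environment can be loaded from Python. In this version of ML-Agents the package is typically imported as unityagents; the file name below assumes the Linux build, so adjust the path for your operating system:

```python
from unityagents import UnityEnvironment

# Path assumes the Linux build; use the .app / .exe file on macOS / Windows.
env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64")

# The Banana environment exposes a single "brain" that controls the agent.
brain_name = env.brain_names[0]
brain = env.brains[brain_name]
```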
A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions.
The state space of the "Visual" environment consists of a snapshot of the game's video, i.e. an array of shape (84, 84, 3): 84 pixels of width, 84 pixels of height, and 3 color channels (R, G, B).
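Continuing the loading sketch above, resetting the environment returns either the 37-dimensional vector observation or the visual frame, depending on which build you downloaded (attribute names follow the ML-Agents 0.4 API):

```python
# Reset the environment in training mode and read the first observation.
env_info = env.reset(train_mode=True)[brain_name]

# Standard environment: a 37-dimensional vector (velocity + ray perception).
state = env_info.vector_observations[0]       # shape (37,)

# Visual environment: raw frames per agent, each 84x84 with 3 channels.
# frame = env_info.visual_observations[0]     # uncomment when using the Visual build
```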
Four discrete actions are available, corresponding to:
- 0 - move forward
- 1 - move backward
- 2 - turn left
- 3 - turn right
The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
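Putting the pieces together, a training loop with the +13-over-100-episodes solve check looks roughly like the sketch below. The `agent.act` and `agent.step` calls are assumptions about the agent interface; check agent.py for the actual method names.

```python
from collections import deque
import numpy as np

# Rough training-loop sketch; `env` and `brain_name` are loaded as above,
# and `agent` is assumed to expose act(state) and step(...) methods.
scores_window = deque(maxlen=100)            # scores of the last 100 episodes

for episode in range(1, 1001):
    env_info = env.reset(train_mode=True)[brain_name]
    state, score, done = env_info.vector_observations[0], 0.0, False
    while not done:
        action = int(agent.act(state))                       # epsilon-greedy action (assumed API)
        env_info = env.step(action)[brain_name]              # send the action to Unity
        next_state = env_info.vector_observations[0]
        reward, done = env_info.rewards[0], env_info.local_done[0]
        agent.step(state, action, reward, next_state, done)  # learn from the transition (assumed API)
        state, score = next_state, score + reward
    scores_window.append(score)
    if len(scores_window) == 100 and np.mean(scores_window) >= 13.0:
        print(f"Environment solved in {episode} episodes")
        break
```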
- agent.py - Agent class containing the Q-Learning algorithm and all support for Vanilla DQN, Double DQN, Dueling DQN and Prioritized Experience Replay DQN.
- model.py - DQN model class setup (containing the configuration for Dueling DQN; see the sketch below)
- Navigation.ipynb - Jupyter Notebook for running the experiment with simple navigation (getting the state space through a vector)
- agent_vision.py - Agent class containing the Q-Learning algorithm for the Visual environment
- model_vision.py - DQN model class setup for the Visual environment
- Navigation_Pixels.ipynb - Jupyter Notebook for running the experiment with pixel navigation (getting the state space through pixels)
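For orientation, the Dueling DQN option splits the network into a state-value stream and an advantage stream. The PyTorch sketch below only illustrates that idea; the layer sizes and class name are assumptions, not the actual contents of model.py.

```python
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Minimal dueling-architecture sketch:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_size=37, action_size=4, hidden=64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_size, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                 # state-value stream V(s)
        self.advantage = nn.Linear(hidden, action_size)   # advantage stream A(s, a)

    def forward(self, state):
        x = self.feature(state)
        v, a = self.value(x), self.advantage(x)
        return v + a - a.mean(dim=1, keepdim=True)
```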
All models are saved in the models subfolder. For example, checkpoint.pt is saved upon success in achieving the goal, and model.pt is the final model after running all episodes.
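Assuming these checkpoints store the network's state dict (the usual PyTorch convention), a saved model can be reloaded for evaluation roughly like this; the `qnetwork_local` attribute name is an assumption:

```python
import torch

# Load a saved checkpoint into the agent's online network for evaluation.
# `agent.qnetwork_local` is an assumed attribute name; check agent.py.
state_dict = torch.load("models/checkpoint.pt", map_location="cpu")
agent.qnetwork_local.load_state_dict(state_dict)
agent.qnetwork_local.eval()
```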
The notebook is structured as follows:
- Initial Setup: (setup of the experiment parameters; check the report for more details)
- Navigation
2.1 Start the Environment: (load the environment for the game)
2.2 Helper Functions: (functions that support the experiment, such as Optuna, DQNsearch, ...)
2.3 Baseline DQN: (section to train an agent with the standard parameters, without searching for hyper-parameters)
2.4 Vanilla DQN: (section to train an agent with a Vanilla DQN)
2.5 Double DQN: (section to train an agent with a Double DQN)
2.6 Dueling DQN: (section to train an agent with a Dueling DQN)
2.7 Prioritized Experience Replay (PER) DQN: (section to train an agent with a PER DQN)
2.8 Double DQN with PER: (section to train an agent with PER and Double DQN at the same time)
2.9 Double with Dueling and PER DQN: (section to train an agent with PER, Double and Dueling DQN at the same time)
3.0 Plot all results: (section where the results from all the sections above are plotted to compare performance)
Each of the sections [2.3 Baseline DQN, 2.4 Vanilla DQN, 2.5 Double DQN, 2.6 Dueling DQN, 2.7 Prioritized Replay DQN, 2.8 Double DQN with PER, 2.9 Double with Dueling and PER DQN] has the subsections:
2.x.1 Find HyperParameters (Optuna)
2.x.1.1 Plotting Optuna Results
2.x.2 Run (network) DQN
2.x.3 Plot Scores
Each subsection is relevant to the respective DQN.
You can choose whether to use the regular parameters, or try to find them through Optuna.
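If you go the Optuna route, the 2.x.1 subsections follow the usual Optuna pattern of an objective function plus a study. The sketch below is generic: the search space and the `train_dqn` helper are illustrative assumptions, not the notebook's exact code.

```python
import optuna

def objective(trial):
    # Illustrative search space; the notebook's actual ranges may differ.
    lr = trial.suggest_float("lr", 1e-5, 1e-3, log=True)
    gamma = trial.suggest_float("gamma", 0.90, 0.999)
    eps_decay = trial.suggest_float("eps_decay", 0.90, 0.999)
    # `train_dqn` is a hypothetical helper that trains an agent with these
    # parameters and returns the average score over the last 100 episodes.
    return train_dqn(lr=lr, gamma=gamma, eps_decay=eps_decay)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```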
After fulfilling the requirements in the Getting Started section and in requirements.txt:
0. Load the Jupyter notebook Navigation.ipynb
- Adapt the dictionary SETUP = { with the desired parameters (a purely illustrative example is sketched after this list)
- Load the environment by running the sections:
1 Initial Setup
2.1 Start the Environment
2.2 Helper Functions
- Then go to the section of the network you want to run [2.3 Baseline DQN, 2.4 Vanilla DQN, 2.5 Double DQN, 2.6 Dueling DQN, 2.7 Prioritized Replay DQN, 2.8 Double DQN with PER, 2.9 Double with Dueling and PER DQN]. There you will be able to either run Optuna to find the theoretically best parameters, or run the model with the base parameters.
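For orientation only, the SETUP dictionary is where the experiment is configured, including which DQN variant is active. The keys and values below are placeholders to show the kind of entries you would edit; check the notebook for the actual names and defaults.

```python
# Hypothetical example only: key names and values are placeholders,
# not the notebook's actual SETUP contents.
SETUP = {
    "n_episodes": 1000,    # number of training episodes
    "eps_start": 1.0,      # initial epsilon for the epsilon-greedy policy
    "eps_decay": 0.995,    # multiplicative epsilon decay per episode
    "double_dqn": True,    # toggle the Double DQN target
    "dueling_dqn": False,  # toggle the dueling network architecture
    "per": False,          # toggle Prioritized Experience Replay
}
```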