This is a Python project for experimenting with reinforcement learning (RL) and Q-learning.
[o o <-(o)-> o o x] Goal: move the robot <-( )-> left or right until it reaches 'x'
This version fits the standard RL framework and follows the Q-learning pseudocode shown at https://www.eecs.tufts.edu/~mguama01/post/q-learning/qlearning.png
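
As a rough illustration of the task above, here is a minimal sketch of tabular Q-learning on a six-cell track. It is not this project's actual code. The track length, start cell, reward of 1.0 on reaching 'x', the hyperparameters, and the names step, train and greedy_path are all assumptions made for the example. The update used is the standard rule Q(s,a) <- Q(s,a) + alpha * (r + gamma * max Q(s',a') - Q(s,a)).

```python
import random

# All constants below are assumptions for this sketch, read off the diagram.
N_CELLS = 6          # six cells: o o (o) o o x
START = 2            # cell where the robot is drawn
TARGET = 5           # cell marked 'x'
ACTIONS = (-1, +1)   # action 0 moves left, action 1 moves right


def step(state, action):
    """Apply a move, clamped to the track; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    done = next_state == TARGET
    reward = 1.0 if done else 0.0  # assumed reward: 1 on reaching 'x', else 0
    return next_state, reward, done


def train(episodes=200, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run epsilon-greedy tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore with probability epsilon, or when both values are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: no bootstrapping from a terminal state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_path(q, max_steps=20):
    """Follow the greedy policy from START and return the visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        if state == TARGET:
            break
        a = 0 if q[state][0] > q[state][1] else 1
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path
```

Calling greedy_path(train()) returns the cells visited by the learned policy. With these settings the robot should walk straight from the start cell to 'x'.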