RL

A project in 5 parts to learn about Reinforcement Learning:

1. Random agent
2. Q-learning
3. Epsilon-greedy algorithm
4. Hyperparameters and Environment Dynamics
5. Escaping a sub-optimal policy
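
To give a feel for what the Q-learning and epsilon-greedy parts are about, here is a minimal sketch of tabular Q-learning with epsilon-greedy action selection. It is not the project's own code: the 5-state chain environment, the `step`, `epsilon_greedy`, and `train` helpers, and the hyperparameter values (`alpha`, `gamma`, `epsilon`) are all assumptions made up for illustration.

```python
import random

# Hypothetical toy environment (not from the project): a chain of states 0..4,
# where the agent starts at state 0 and the goal is state 4.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic dynamics: reward 1.0 only when the goal state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; the hyperparameter defaults are illustrative assumptions."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

Setting `epsilon=1.0` makes the same loop pick every action uniformly at random, which is one way to compare a purely random agent against the learned policy.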