Reinforcement Learning Basics

python 3.7
gym 0.18.0
tensorflow 1.15.0
torch 1.7.1

Model-free (tabular setting)

1. Find Value Function
   Monte Carlo
   Temporal Difference (TD)
2. Find Optimal Policy
   Monte Carlo Control
   SARSA
   Q Learning

Deep RL (non-tabular setting)

1. Value-Based
   DQN
2. Policy-Based (policy gradient)
   REINFORCE
   PPO
3. Actor-Critic (value-based + policy-based)
   TD Actor-Critic
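
To make the tabular "Q Learning" entry in the outline concrete, here is a minimal sketch of the update rule. It is not taken from this repository. It assumes a hypothetical five-state chain environment written in plain Python (not a gym environment), and the names `step`, `train_q_learning`, and `greedy_policy`, along with the hyperparameter values, are illustrative choices only.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment (an assumption for illustration only)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the last state
    return next_state, reward, done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _episode in range(episodes):
        state = 0
        for _t in range(200):  # step cap so an episode always ends
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties at random
                best = max(Q[state])
                action = rng.choice([a for a in ACTIONS if Q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            target = reward if done else reward + gamma * max(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
            if done:
                break
    return Q


def greedy_policy(Q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
```

Under these assumptions, `greedy_policy(train_q_learning())` should choose action 1 (move right) in every non-terminal state. Replacing `max(Q[next_state])` in the target with the value of the action actually taken next would turn this sketch into SARSA, the on-policy counterpart listed alongside it.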