## Introduction

Reinforcement learning methods fall into two broad families: **tabular methods** and **approximate methods**. The goal of RL is to find an **optimal policy**, which tells you which action *A* to take when you are in state *S*. If the state and action spaces are small enough, value functions can be represented exactly as arrays, or tables. The problem with large state spaces is not just the memory needed for large tables, but the time and data needed to fill them accurately. When the state and action spaces are too large, limitations of time and data mean that value functions must instead be approximated with limited computational resources. In that case, our goal shifts from finding the optimal solution to finding a good enough approximate one.
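The contrast can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular algorithm: the state space size, the polynomial feature map `phi`, and the learning rate are all hypothetical choices made for the example. The tabular representation stores one value per (state, action) pair, while the linear approximation uses a fixed, small weight vector, so a single update also generalizes to nearby states.

```python
import numpy as np

# Tabular: one entry per (state, action) pair -- memory grows with |S| x |A|.
n_states, n_actions = 10, 2
q_table = np.zeros((n_states, n_actions))
q_table[3, 1] = 0.5  # each entry is stored and updated independently

# Approximate: Q(s, a) ~= phi(s) . w[a], with a fixed number of weights
# no matter how many states there are.
def phi(state):
    s = state / n_states                 # normalize the state index
    return np.array([1.0, s, s * s])     # small polynomial feature vector (a hypothetical choice)

w = np.zeros((n_actions, 3))             # 2 x 3 weights instead of 10 x 2 table entries

def q_approx(state, action):
    return phi(state) @ w[action]

# One gradient step toward a target value...
alpha, target, s, a = 0.1, 1.0, 3, 1
w[a] += alpha * (target - q_approx(s, a)) * phi(s)

# ...raises the value at state 3, and also at the neighboring state 4,
# because both states share the same weights.
print(q_approx(3, 1), q_approx(4, 1))
```

The generalization shown in the last lines is exactly the trade-off the text describes: approximation saves memory and data, but gives up the exactness of the table.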