Related to Value Based Methods. Mathematics and ideas taken from Steven.
FUNDAMENTAL IDEA: Consider the value of any state as:
i.e.
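In standard notation (an assumption: the symbols $G_t$ for the discounted return, $R_{t+1}$ for the next reward, and $\gamma$ for the discount factor are not defined in these notes), the value of a state is the expected return from that state, which decomposes into the immediate reward plus the discounted value of the next state:

```latex
v(s) = \mathbb{E}\left[ G_t \mid S_t = s \right]
     = \mathbb{E}\left[ R_{t+1} + \gamma \, v(S_{t+1}) \mid S_t = s \right]
```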
The Bellman equation relates the value of the current state to the values of its successor states. It comes in two forms: the Bellman expectation equation and the Bellman optimality equation.
- The Bellman expectation equation is used in Policy Evaluation: it defines the value of a state under a given policy as an expectation over its successor states.
We can write it recursively:
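In the usual textbook form (assumed notation, not taken from these notes), the recursion for a fixed policy $\pi$ reads $v_\pi(s) = \sum_a \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\,[r + \gamma\, v_\pi(s')]$. A minimal Python sketch of iterative policy evaluation, which applies this backup repeatedly until the values stop changing, might look like the following. The function name, the dictionary layout, and the two-state example are all hypothetical choices made for illustration.

```python
def policy_evaluation(policy, transitions, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Iteratively apply the Bellman expectation backup for a fixed policy.

    policy[s][a]      -> probability of taking action a in state s
    transitions[s][a] -> list of (prob, next_state, reward) tuples
    (Both layouts are illustrative assumptions.)
    """
    values = {s: 0.0 for s in transitions}  # start from v(s) = 0 everywhere
    for _ in range(max_iters):
        new_values = {}
        delta = 0.0
        for s in transitions:
            v = 0.0
            # Expectation over actions (policy) and over next states (dynamics).
            for a, action_prob in policy[s].items():
                for prob, s_next, reward in transitions[s][a]:
                    v += action_prob * prob * (reward + gamma * values[s_next])
            new_values[s] = v
            delta = max(delta, abs(v - values[s]))
        values = new_values  # synchronous update: all states use the old values
        if delta < tol:
            break
    return values


# Hypothetical two-state example: A moves to B with reward 1, B moves to A with reward 0.
transitions = {
    "A": {"go": [(1.0, "B", 1.0)]},
    "B": {"go": [(1.0, "A", 0.0)]},
}
policy = {"A": {"go": 1.0}, "B": {"go": 1.0}}

values = policy_evaluation(policy, transitions, gamma=0.9)
print(values)
```

For this example the recursion can be solved by hand: v(A) = 1 + 0.9 v(B) and v(B) = 0.9 v(A), so v(A) = 1 / (1 - 0.81) and v(B) = 0.9 / (1 - 0.81), which the iteration approaches.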