Week: 27 February - 5 March | Theory and Methods for Reinforcement Learning

Section outline

- Lecture 2: Dynamic Programming
  
  MDPs; value and Q functions; value iteration, policy iteration; operator perspectives. Model-free policy-based and value-based methods; Monte Carlo (MC) method and temporal difference (TD) learning.
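
As a rough illustration of one topic listed above, value iteration, here is a minimal sketch on a tabular MDP. It is not taken from the lecture material: the function name `value_iteration`, the array layout (`P` with shape actions x states x states, `R` with shape states x actions), and the two-state toy MDP are all assumptions made for this example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Sketch of value iteration for a finite MDP (hypothetical helper).

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a).
    R: array of shape (S, A), expected immediate reward for (state, action).
    Returns the approximate optimal value function and a greedy policy.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iters):
        # Bellman optimality backup:
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


# Assumed toy MDP: action 0 stays in place, action 1 switches state.
# Staying in state 1 pays reward 1; everything else pays 0.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1, so the values approach 9 and 10 respectively with a discount factor of 0.9.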