20 March - 26 March
- Policy gradient methods II: NPG, sample-based NPG, TRPO, exploration in policy gradients
- Exercises on Value Iteration, Policy Iteration, Modified Policy Iteration, and Q-Learning
