20 March - 26 March
Section summary
- Policy gradient methods II: NPG, sample-based NPG, TRPO, exploration in policy gradients
- Exercises on Value Iteration, Policy Iteration, Modified Policy Iteration and Q-Learning
