The course EE-568 presents theory and methods for Reinforcement Learning (RL), which revolves around sequential decision making under uncertainty. The course covers classic RL algorithms as well as recent algorithms through the lens of contemporary optimization.

Lecture 1: Introduction to Reinforcement Learning. Definition of Markov Decision Processes, policies, and performance criteria.
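For reference, the objects introduced in this lecture can be written in one standard notation (a sketch, not necessarily the exact conventions used in class):

```latex
% An MDP is a tuple (S, A, P, r, gamma): states, actions, transition
% kernel, reward function, and discount factor. A stationary policy pi
% maps each state to a distribution over actions. A standard performance
% criterion is the expected discounted return:
\[
  V^{\pi}(s) \;=\; \mathbb{E}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)
  \;\middle|\; s_0 = s,\; a_t \sim \pi(\cdot \mid s_t),\; s_{t+1} \sim P(\cdot \mid s_t, a_t)\right],
  \qquad \gamma \in [0, 1).
\]
```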

Lecture 2: Dynamic Programming. Dynamic programming with known and unknown transition dynamics: Value Iteration, Policy Iteration, and Q-Learning.
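A minimal sketch of Value Iteration on a tabular MDP with known dynamics; the array shapes and the stopping tolerance are illustrative assumptions, not course specifics:

```python
import numpy as np

def value_iteration(P, r, gamma=0.9, tol=1e-8):
    """Value Iteration for a tabular MDP with known dynamics.

    P: transition tensor of shape (S, A, S), P[s, a, s'] = Pr(s' | s, a)
    r: reward matrix of shape (S, A)
    Returns the (approximately) optimal values V and a greedy policy.
    """
    S, A = r.shape
    V = np.zeros(S)
    while True:
        # Bellman optimality backup: Q(s, a) = r(s, a) + gamma * E[V(s')]
        Q = r + gamma * (P @ V)        # shape (S, A)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V, Q.argmax(axis=1)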

Lecture 3: Linear Programming. Algorithms based on the primal and dual linear programming formulations of RL: constraint sampling, REPS, and DICE methods.
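To make the primal formulation concrete, here is a hedged sketch that solves the LP for the optimal value function with scipy.optimize.linprog; the uniform objective weights are one illustrative choice (any strictly positive state weights recover the same solution):

```python
import numpy as np
from scipy.optimize import linprog

def lp_value(P, r, gamma=0.9):
    """Primal LP for a tabular MDP:

    min_V  sum_s V(s)
    s.t.   V(s) >= r(s, a) + gamma * sum_{s'} P(s' | s, a) V(s')  for all (s, a).
    """
    S, A = r.shape
    # One inequality row per (s, a): (gamma * P[s, a] - e_s) @ V <= -r[s, a]
    A_ub = gamma * P.reshape(S * A, S)
    for s in range(S):
        A_ub[s * A:(s + 1) * A, s] -= 1.0
    b_ub = -r.reshape(S * A)
    res = linprog(c=np.ones(S), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * S)
    return res.x  # the optimal value function V*
```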

Lecture 4: Policy Gradient 1. Policy parameterization, REINFORCE, and techniques to compute unbiased estimators of the policy gradient.
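A sketch of the single-trajectory REINFORCE estimator for a tabular softmax policy; the gamma**t weighting follows the discounted policy gradient theorem, and all names and shapes are illustrative:

```python
import numpy as np

def reinforce_gradient(theta, trajectory, gamma=0.99):
    """Unbiased single-trajectory REINFORCE estimate of the policy gradient.

    theta: logits of shape (S, A) parameterizing pi(a|s) = softmax(theta[s])
    trajectory: list of (state, action, reward) tuples from one rollout
    """
    grad = np.zeros_like(theta)
    # Discounted returns-to-go G_t, computed backwards over the rollout
    G, returns = 0.0, []
    for (_, _, rew) in reversed(trajectory):
        G = rew + gamma * G
        returns.append(G)
    returns.reverse()
    for t, (s, a, _) in enumerate(trajectory):
        probs = np.exp(theta[s] - theta[s].max())
        probs /= probs.sum()
        # Gradient of log pi(a|s) w.r.t. theta[s] for a softmax policy: e_a - pi(.|s)
        grad_log = -probs
        grad_log[a] += 1.0
        grad[s] += (gamma ** t) * returns[t] * grad_log
    return grad
```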

Lecture 5: Policy Gradient 2. Non-concavity of the policy gradient objective, global convergence of projected gradient descent, global convergence of natural policy gradient, TRPO, and PPO.
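For concreteness, the clipped surrogate objective at the core of PPO can be sketched as below (to be maximized over the policy parameters); the names and the value of eps are illustrative:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective, averaged over a batch of samples.

    ratio: per-sample probability ratios pi_theta(a|s) / pi_theta_old(a|s)
    advantage: per-sample advantage estimates
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Take the pessimistic (lower) of the two surrogates per sample
    return np.mean(np.minimum(unclipped, clipped))
```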

Lecture 6: Imitation Learning. Motivation, setting, maximum causal entropy IRL, GAIL, and LP approaches.
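As a reference point for the GAIL part of the lecture, the objective of Ho and Ermon (2016) is a minimax problem between a policy and a discriminator (one common way of writing it, sketched below):

```latex
% pi_E is the expert policy, D a discriminator that scores state-action
% pairs, and H the causal entropy of the learner's policy:
\[
  \min_{\pi} \max_{D} \;
  \mathbb{E}_{\pi}\big[\log D(s, a)\big]
  + \mathbb{E}_{\pi_E}\big[\log\big(1 - D(s, a)\big)\big]
  - \lambda H(\pi).
\]
```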

Lecture 7: Markov Games. Motivation, setting, different notions of equilibria, and policy gradient algorithms for zero-sum games.
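In the zero-sum case, the central solution concept is the minimax value; for discounted two-player zero-sum Markov games, the classical result of Shapley (1953) guarantees that it is well defined:

```latex
% pi and nu are the policies of the max- and min-player, and V^{pi,nu} is
% the discounted value of the joint policy; the order of optimization can
% be exchanged:
\[
  \max_{\pi} \min_{\nu} V^{\pi, \nu}(s) \;=\; \min_{\nu} \max_{\pi} V^{\pi, \nu}(s).
\]
```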

Lecture 8: Deep and Robust Reinforcement Learning. The importance of robustness in RL, and robust RL as a zero-sum Markov game.
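The zero-sum view mentioned here can be summarized as a max-min problem over an uncertainty set of transition models (the notation below is an illustrative sketch):

```latex
% The policy maximizes worst-case discounted return while an adversary
% picks the dynamics P from an uncertainty set \mathcal{P}:
\[
  \max_{\pi} \min_{P \in \mathcal{P}} \;
  \mathbb{E}_{P,\,\pi}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right].
\]
```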