Theory and Methods for Reinforcement Learning
Weekly outline
Teaching Assistants:
- Luca Viano (Head TA)
- Leello Dadi
- Pedro Abranches
- Yongtao Wu
- Zhenyu Zhu
- Andrej Janchevski
An overview of the course
MDPs; value and Q functions; value iteration, policy iteration; operator perspectives. Model-free policy-based and value-based methods; Monte Carlo (MC) method and temporal difference (TD) learning.
Primal and dual LP, ALP, ALP with constraint sampling, primal-dual methods, REPS, offline LP methods.
Policy gradient methods.
Policy gradient methods II: NPG, sample-based NPG, TRPO, exploration in policy gradients.
Exercises on value iteration, policy iteration, modified policy iteration, and Q-learning.
Imitation learning: behavioural cloning, DAgger, MCE-IRL, GAIL, P2IL, IQ-Learn.
A brief description of the project (2 pages including references), which includes the following:
- the names of the project team members
- the motivation for the project
- a formal description of the problem and the goal
- references
- the software and computational resources you will use
Markov Games
Actor-critic-based deep RL: TRPO, Soft Actor-Critic.
Value-based deep RL: DQN, Double DQN, Rainbow.
Robust RL and IRL.
Dear all,
Please upload your final report by Thursday, June 1st at 11:59 PM. Please double-check the submission instructions that we uploaded on Moodle during the first week: https://moodle.epfl.ch/pluginfile.php/3047530/mod_resource/content/5/syllabus-2023.pdf (page 3).
In particular, we expect between 6 and 8 pages in the NeurIPS template: https://neurips.cc/Conferences/2022/PaperInformation/StyleFiles
The suggested structure is:
- Abstract
- Introduction
- Related Work
- Approach
- Results
- Conclusion
- References
If you ran experiments, please attach your code as supplementary material: upload a single zip file containing the main report in PDF format and a folder named supplementary for the attached files.
You may also include an appendix as a separate PDF in the same zip file.
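One possible way to assemble the archive described above is sketched here. This is only an illustration, not an official submission tool: the function name build_submission and the file names in the usage note are hypothetical, and the only layout taken from the instructions is a top-level report PDF, an optional appendix PDF, and a folder named supplementary.

```python
import zipfile
from pathlib import Path


def build_submission(report_pdf, supplementary_dir, out_zip, appendix_pdf=None):
    """Pack the report, an optional appendix, and the attached files into one zip.

    Hypothetical helper: all argument names are assumptions for illustration.
    """
    report_pdf = Path(report_pdf)
    supplementary_dir = Path(supplementary_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # The main report sits at the top level of the archive.
        zf.write(report_pdf, arcname=report_pdf.name)
        # The optional appendix is stored next to the report.
        if appendix_pdf is not None:
            appendix_pdf = Path(appendix_pdf)
            zf.write(appendix_pdf, arcname=appendix_pdf.name)
        # Code and other attachments go under a folder named "supplementary",
        # keeping their relative paths.
        for path in sorted(supplementary_dir.rglob("*")):
            if path.is_file():
                rel = path.relative_to(supplementary_dir)
                zf.write(path, arcname=(Path("supplementary") / rel).as_posix())
    return out_zip
```

For example, calling build_submission("report.pdf", "code", "submission.zip") would produce submission.zip with report.pdf at the top level and the contents of the local code folder under supplementary/. Keep the output zip outside the folder being packed so that it is not included in itself.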
The final class is on June 1st, when you will give a 15-minute presentation of your project. There is no need to submit the slides you will use at this stage.