Learning theory
Overview of the weeks

This master class on learning theory covers the classical PAC framework for learning, stochastic gradient descent, and tensor methods. We also touch upon topics from the recent literature on mean-field methods for neural networks and the double descent phenomenon.
Teacher: Nicolas Macris (nicolas.macris@epfl.ch), with some lectures by Rodrigo Veiga (rodrigo.veiga@epfl.ch)
Teaching Assistant: Anand Jerry Georges (anand.george@epfl.ch)
Courses: Mondays 8h15-10h, in person, Room INM202. Exercises: Tuesdays 17h15-19h, in person, Room INR219.
We will use this Moodle page to distribute homework and solutions, to collect graded homework, and to host the discussion and questions forum. Don't hesitate to use the forum actively.
Lectures are in person. If you miss a lecture, an old recorded version is accessible here: https://mediaspace.epfl.ch/channel/CS526+Learning+theory/29761. However, the material, instructors, and order of lectures might differ slightly this year.
EXAM: it is open book. You may bring your notes, printed material, and book(s), but no electronic devices!
Textbooks and notes:
 Understanding Machine Learning (UML) by Shalev-Shwartz and Ben-David
 Bayesian Reasoning and Machine Learning by David Barber (Cambridge)
 Pattern Recognition and Machine Learning by Christopher Bishop (Springer)
 Introduction to Tensor Decompositions and their Applications in Machine Learning by Rabanser, Shchur, and Günnemann
 One Lecture on Two-Layer Neural Networks by A. Montanari

If you have a question or want to start a discussion on a topic, post here

PAC learning framework. Finite classes. Uniform convergence.
See chapters 3 and 4 in UML
Homework 1: exercises 1, 3, 7, 8 of Chapter 3.
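As a quick illustration of the finite-class bound from these chapters (in the realizable case, m ≥ (1/ε) ln(|H|/δ) samples suffice for uniform convergence over a finite class), here is a minimal sketch; the function name and the numbers are purely illustrative:

```python
from math import ceil, log

def pac_sample_complexity(h_size, eps, delta):
    """Samples sufficient to (eps, delta)-PAC learn a finite hypothesis
    class of size h_size in the realizable case: m >= (1/eps) ln(|H|/delta)."""
    return ceil(log(h_size / delta) / eps)

# e.g. |H| = 1000 hypotheses, accuracy eps = 0.1, confidence delta = 0.05
m = pac_sample_complexity(1000, 0.1, 0.05)  # -> 100
```

Note the sample size grows only logarithmically in the class size and in 1/δ, but linearly in 1/ε.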

No free lunch theorem.
See Chapter 5 in UML
Homework 2: exercises 1 and 2 of Chapter 4

Learning infinite classes I
Chapter 6 in UML
Graded homework 3, due Monday 18 March at 23h59.

Learning infinite classes II (VC dimension)
Chapter 6 continued
Graded homework 3 continued, due Monday 18 March at 23h59.
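As a small sanity check on the shattering definition, one can enumerate the labelings realizable by 1-D threshold classifiers x ↦ 1[x ≥ t]: any single point is shattered, but no pair of points is (the labeling 1,0 on an ordered pair is unachievable), so the VC dimension of this class is 1. A minimal sketch, with helper names of my own choosing:

```python
def threshold_labelings(points):
    """Distinct labelings of `points` realizable by x -> 1[x >= t]."""
    pts = sorted(points)
    # thresholds below, between, and above the points realize every labeling
    cands = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])] + [pts[-1] + 1.0]
    return {tuple(int(x >= t) for x in points) for t in cands}

def shattered(points):
    return len(threshold_labelings(points)) == 2 ** len(points)
```

For instance, `shattered([5.0])` is true while `shattered([3.0, 7.0])` is false: the pair admits only 3 of the 4 labelings.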

Bias-variance tradeoff and the double descent phenomenon
We will study the double descent of the generalization error based on the paper "Two Models of Double Descent for Weak Features" by Belkin, Hsu, and Xu.

Double descent phenomenon: continuation and derivation for the weak features model
The derivations use the notion of the Moore-Penrose inverse, which is fully reviewed as a problem with solution in the two files attached below.
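Since the weak-features derivation hinges on the minimum-norm least-squares solution, a small numerical sketch of the Moore-Penrose inverse may help (the setup is illustrative, not the one from the paper): in the underdetermined regime, β = X⁺y interpolates the data and has the smallest Euclidean norm among all interpolators.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 10                      # underdetermined: more features than samples
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# beta = X^+ y is the interpolator of minimum Euclidean norm
beta = np.linalg.pinv(X) @ y

# any other interpolator differs from beta by a null-space vector of X,
# and beta is orthogonal to that null space
v = (np.eye(p) - np.linalg.pinv(X) @ X) @ rng.standard_normal(p)
beta_other = beta + v
```

Both `beta` and `beta_other` satisfy Xβ = y exactly, but `beta` has the smaller norm; this minimum-norm choice is what drives the behavior at the interpolation threshold.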

Easter week break

Gradient descent (convexity, Lipschitzness, approach to the optimal solution)
Stochastic gradient descent, application to learning
Chapter 14 in UML
Graded homework 6, due Monday 21st April at 23h59.
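To make the Chapter 14 setting concrete, here is a minimal SGD sketch on a smooth convex loss (least squares). The step size and iteration count are illustrative choices of mine, not tuned constants from the book:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 5
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

w = np.zeros(d)
eta = 0.01                        # constant step size (assumed small enough)
for _ in range(5000):
    i = rng.integers(n)           # pick one example uniformly at random
    g = (X[i] @ w - y[i]) * X[i]  # unbiased stochastic gradient of loss(w)
    w -= eta * g

initial, final = loss(np.zeros(d)), loss(w)
```

Each update uses a single example, so the gradient is an unbiased but noisy estimate of the full gradient; with a constant step size the iterates approach a neighborhood of the optimum whose size scales with eta.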

Mean-field approach for two-layer neural networks
Based on the paper "One Lecture on Two-Layer Neural Networks" by A. Montanari.
Graded homework 6 continued (due Monday 22nd April at 23h59).
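A toy sketch of the mean-field parametrization in Montanari's notes: with the 1/N scaling f(x) = (1/N) Σ_i a_i σ(⟨w_i, x⟩), the network output is an average over the empirical distribution of neurons, so for i.i.d. weights it concentrates as N grows (here around 0, since the a_i have mean zero). The setup below is mine, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 3
x = rng.standard_normal(d)

def mean_field_net(x, a, W):
    """Two-layer net in mean-field scaling: (1/N) sum_i a_i * tanh(w_i . x)."""
    return np.mean(a * np.tanh(W @ x))

outputs = {}
for N in (10, 100, 10_000):
    a = rng.standard_normal(N)        # i.i.d. second-layer weights
    W = rng.standard_normal((N, d))   # i.i.d. first-layer weights
    outputs[N] = mean_field_net(x, a, W)
```

The fluctuations of the output around its population limit are of order 1/√N, which is the starting point for describing SGD training as an evolution of the neuron distribution.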

First hour: I will finish discussing the main ideas of the mean-field analysis of two-layer neural networks.
Second hour: we start the tensor analysis.
Tensors 1. Motivations and examples, multidimensional arrays, tensor product, tensor rank. 
Tensors 2. Tensor decompositions and rank, Jennrich's theorem
Graded homework 8, due 13 May at 23h59.
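Jennrich's theorem is constructive, and the algorithm fits in a few lines: contract the tensor along one mode with two random vectors, and read the factors off an eigendecomposition. A sketch with generic random factors (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 4, 3

# T = sum_k a_k (x) b_k (x) c_k with generic factor matrices A, B, C
A = rng.standard_normal((n, r))
B = rng.standard_normal((n, r))
C = rng.standard_normal((n, r))
T = np.einsum('ik,jk,lk->ijl', A, B, C)

# contract the third mode with two random vectors
x, y = rng.standard_normal(n), rng.standard_normal(n)
Tx = np.einsum('ijl,l->ij', T, x)   # = A diag(C^T x) B^T
Ty = np.einsum('ijl,l->ij', T, y)   # = A diag(C^T y) B^T

# eigenvectors of Tx Ty^+ with nonzero eigenvalue are the columns of A,
# up to scaling and permutation
M = Tx @ np.linalg.pinv(Ty)
eigvals, eigvecs = np.linalg.eig(M)
top = np.argsort(-np.abs(eigvals))[:r]
A_hat = np.real(eigvecs[:, top])
```

The same trick applied to the transposes recovers B, and C then follows by solving a linear system; genericity of the factors is exactly what makes the eigenvalues distinct.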

Tensors 3. Matricizations and the Alternating Least Squares algorithm
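The mode-n matricization is just an axis permutation followed by a reshape, and it is what lets each ALS update be solved as an ordinary linear least-squares problem. A minimal sketch (the helper name is mine):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization: mode `mode` becomes the rows, all other
    modes are flattened into the columns."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

T = np.arange(24).reshape(2, 3, 4)

# a rank-1 tensor has rank-1 unfoldings in every mode
a, b, c = np.ones(2), np.arange(1.0, 4.0), np.arange(1.0, 5.0)
T1 = np.einsum('i,j,l->ijl', a, b, c)
```

For the 2×3×4 example, the three unfoldings have shapes 2×12, 3×8, and 4×6.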

Tensors 4. Multilinear rank, the Tucker decomposition, and the higher-order singular value decomposition (HOSVD)
Graded homework 10, due 27 May at 23h59.
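The HOSVD itself is short to write down: take the left singular vectors of each unfolding as the factor matrices, then multiply the tensor by their transposes to obtain the core. A sketch with helper names of my own choosing; with full (untruncated) factor matrices the reconstruction is exact:

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_mult(T, M, mode):
    """Mode-n product T x_n M."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def hosvd(T):
    """Factor matrices from the unfoldings' left singular vectors, then
    core G = T x_1 U_1^T x_2 U_2^T x_3 U_3^T."""
    Us = [np.linalg.svd(unfold(T, m), full_matrices=False)[0] for m in range(T.ndim)]
    G = T
    for m, U in enumerate(Us):
        G = mode_mult(G, U.T, m)
    return G, Us

rng = np.random.default_rng(4)
T = rng.standard_normal((3, 4, 5))
G, Us = hosvd(T)

# reconstruct: T = G x_1 U_1 x_2 U_2 x_3 U_3
R = G
for m, U in enumerate(Us):
    R = mode_mult(R, U, m)
```

Truncating each U_n to its leading columns gives the usual low-multilinear-rank Tucker approximation instead of an exact reconstruction.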

Monday 20: holiday.
Tuesday 21: exercise session and Q&A.
Homework 11 below is an extra homework that reviews the tensor whitening procedure.

Tensors 5. Power method and applications: Gaussian mixture models, topic models of documents
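For an orthogonally decomposable symmetric tensor T = Σ_k λ_k v_k⊗v_k⊗v_k, the power iteration u ← T(I, u, u)/‖T(I, u, u)‖ converges from a generic initialization to one of the components v_k; in the applications, the Gaussian-mixture or topic-model moment tensors are first whitened into this orthogonal form. A minimal sketch with synthetic components:

```python
import numpy as np

rng = np.random.default_rng(5)
d = 4

# orthonormal components v_k (columns of V) and positive weights lambda_k
V, _ = np.linalg.qr(rng.standard_normal((d, d)))
lam = np.array([3.0, 2.0, 1.5, 1.0])
T = np.einsum('k,ik,jk,lk->ijl', lam, V, V, V)

# tensor power iteration: u <- T(I, u, u) / ||T(I, u, u)||
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
for _ in range(100):
    u = np.einsum('ijl,j,l->i', T, u, u)
    u /= np.linalg.norm(u)
```

In the basis of the v_k the update squares each coordinate (up to the λ_k weights), so convergence to a single component is doubly exponential; deflating T by the recovered λ_k v_k⊗v_k⊗v_k and repeating extracts the remaining components.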
