Archive 2019
Prerequisites: Optimisation M1, Statistique M1
Assessment: continuous assessment + final exam
Instructor: Stéphane Gaïffas
Weekly hours: 3 h lectures, 2 h tutorials
Years: M2 Data Science (opening 2020)

Syllabus

An overview of supervised and unsupervised learning methods: from generative models to neural networks, by way of tree-based techniques.

Outline

  1. Introduction to supervised learning (3 weeks). Binary classification, standard metrics and recipes (overfitting, cross-validation) and regression; LDA/QDA, logistic regression, generalized linear models, regularization (Ridge, Lasso), support vector machines and the hinge loss, kernel methods; decision trees, CART, boosting.
  2. Optimization for machine learning (2 weeks). Proximal gradient descent, coordinate descent / coordinate gradient descent, quasi-Newton methods, stochastic gradient descent and beyond.
  3. Neural networks (2 weeks). Introduction to neural networks: the perceptron, multilayer neural networks, deep learning. Adaptive-rate stochastic gradient descent, back-propagation. Convolutional neural networks.
  4. Unsupervised learning (2 sessions). Gaussian mixtures and EM, matrix factorization, non-negative matrix factorization, factorization machines, embedding methods.
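As a small taste of how topics 1 and 2 connect in practice, here is a minimal sketch (not course material, all names and data are illustrative) of logistic regression trained with plain stochastic gradient descent on a synthetic dataset, using only NumPy:

```python
import numpy as np

# Synthetic binary classification data drawn from a logistic model.
rng = np.random.default_rng(0)
n, d = 500, 2
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0])
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

# Stochastic gradient descent on the logistic (log-)loss,
# one randomly ordered sample at a time, with a decaying step size.
w = np.zeros(d)
step = 0.5
for epoch in range(20):
    for i in rng.permutation(n):
        p = 1 / (1 + np.exp(-X[i] @ w))          # predicted probability
        w -= step / (epoch + 1) * (p - y[i]) * X[i]  # per-sample gradient step

accuracy = np.mean((X @ w > 0) == (y == 1))
print(accuracy)
```

The per-sample gradient `(p - y[i]) * X[i]` is exactly the gradient of the log-loss for one observation, which is why SGD scales to datasets where full-gradient methods are too costly.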

Bibliography

  • Murphy, K.P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
  • Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2012). Foundations of Machine Learning. MIT Press.
  • Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
  • McKinney, W. (2012). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly.
  • Bühlmann, P., and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer-Verlag.