
https://people.csail.mit.edu/moitra/408c.html

18.408 Theoretical Foundations for Deep Learning

Spring 2021




Deep learning has sparked a revolution across machine learning. It has led to major advances in vision, speech, strategic game playing, and the sciences. And yet it remains largely a mystery: we do not understand why the algorithms we use work so well in practice.

In this class we will explore theoretical foundations for deep learning, emphasizing the following themes: (1) Approximation: What sorts of functions can be represented by deep networks, and does depth provably increase expressive power? (2) Optimization: Essentially all of the optimization problems we want to solve in practice are non-convex. What frameworks can be used to analyze such problems? (3) Beyond-Worst-Case Analysis: Deep networks can memorize worst-case data, so why do they generalize well on real-world data? For this and related questions, our starting point will often be natural generative models for the data. The theory of deep learning is still very much a work in progress. Our goal in this course is merely to explain some of the key questions that drive this area, and to take a critical look at where the existing theory falls short.
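
To make theme (1) concrete, here is a minimal numpy sketch (ours, not part of the course materials) in the spirit of random-feature approximation: a one-hidden-layer ReLU network whose hidden weights are drawn at random, with only the output layer fit by least squares, already approximates a smooth one-dimensional function to small error.

    import numpy as np

    rng = np.random.default_rng(0)

    def target(x):
        # A smooth target function on [0, 1].
        return np.sin(2 * np.pi * x)

    n, width = 200, 500                         # training points, hidden units
    x = rng.uniform(0, 1, size=n)

    # Random hidden layer: h_j(x) = relu(w_j * x + b_j).
    w = rng.normal(0, 10, size=width)
    b = rng.uniform(-10, 10, size=width)
    H = np.maximum(0.0, np.outer(x, w) + b)     # (n, width) feature matrix

    # Fit only the output layer by least squares.
    coef, *_ = np.linalg.lstsq(H, target(x), rcond=None)

    # Held-out error.
    x_test = rng.uniform(0, 1, size=1000)
    H_test = np.maximum(0.0, np.outer(x_test, w) + b)
    print("test MSE:", np.mean((H_test @ coef - target(x_test)) ** 2))

Barron's theorem, the subject of the first lectures, makes this quantitative: roughly, for a target whose Fourier transform has a finite first moment C_f, a network with k hidden units achieves squared L2 error on the order of C_f^2 / k, with no explicit dependence on the input dimension.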

We will cover topics such as: Barron's theorem, depth separations, landscape analysis, implicit regularization, neural tangent kernels, generalization bounds, data poisoning attacks and frameworks for proving lower bounds against deep learning.

Announcement: Homework 2 is now up and is due May 7th. Please submit via Gradescope.

Announcement: The final project description is up here. Please email me if you'd like to chat about potential project topics.

Announcement: You should join the course Slack, where we will post all lectures, notes, homeworks, etc.

Course Information

  • Instructor: Ankur Moitra

  • Harvard Sister Seminar: CS 229br: Advanced Topics in the Theory of Machine Learning, by Boaz Barak

  • Lectures: Wednesday 12-3, on Zoom

  • Prerequisites: A course in algorithms (6.046/18.410 or equivalent) and probability (6.041/18.440 or equivalent). You will need a strong background in algorithms, probability, and linear algebra.

  • Assessment: Students will be expected to solve two problem sets and complete a research-oriented final project. The project can be either a survey or original research; students are encouraged to find connections between the course material and their own research interests.

Course Outline

Here is a tentative outline for the course:

    Approximation Theory
    • Universal Approximation and the Fourier Transform
    • Barron's Theorem
    • Depth Separations and Connections to Circuit Complexity

    Optimization Theory
    • Convex Optimization and Acceleration
    • Escaping Local Minima
    • Implicit Regularization
    • Neural Tangent Kernels

    Statistical Theory
    • Data-independent and Data-dependent Generalization Bounds
    • Kernel Ridge Regression
    • The Interpolation Regime and Double Descent (see the numerical sketch after this outline)

    Representation Learning
    • Dictionary Learning
    • Student-Teacher Networks
    • Lower Bounds from Cryptography

    Advanced Optimization Theory
    • The Mean Field View
    • Landscape Analysis and Matrix Completion
    • Langevin Dynamics and Metastability
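
As a preview of the interpolation regime listed above, the following self-contained numpy experiment (ours, not part of the course materials) exhibits double descent for minimum-norm least squares on random ReLU features: as the number of features p grows past the number of samples n, test error typically spikes near the interpolation threshold p ≈ n and then falls again.

    import numpy as np

    rng = np.random.default_rng(1)

    # Ground truth: y = <x, beta> + noise, with x in R^d.
    d, n_train, n_test, noise = 40, 100, 2000, 0.5
    beta = rng.normal(size=d) / np.sqrt(d)

    def sample(n):
        X = rng.normal(size=(n, d))
        return X, X @ beta + noise * rng.normal(size=n)

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)

    # Random ReLU features of growing width p, fit by minimum-norm least squares.
    for p in [20, 50, 90, 100, 110, 200, 500, 2000]:
        W = rng.normal(size=(d, p)) / np.sqrt(d)
        coef = np.linalg.pinv(np.maximum(0.0, X_tr @ W)) @ y_tr   # min-norm fit
        pred = np.maximum(0.0, X_te @ W) @ coef
        print(f"p = {p:5d}   test MSE = {np.mean((pred - y_te) ** 2):.3f}")

The spike near p ≈ n is a variance blow-up: the feature matrix is nearly singular at the interpolation threshold, so the interpolating solution has large norm; with many more features than samples, the minimum-norm solution can spread weight across directions and its norm, and hence its variance, shrinks again.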

Instructor Notes

  • Lecture 1: Universal Approximation and Barron's Theorem [pdf]
  • Lecture 2: Barron's Theorem (continued) and Depth Separations [pdf]
  • Lecture 3: Neural Tangent Kernels and Least Squares [pdf]
  • Lecture 4: Implicit Regularization and Mirror Descent [pdf]
  • Lecture 5: Generalization Bounds and PAC Learnability [pdf]
  • Lecture 6: Stability, Regularization and Privacy [pdf]
  • Lecture 7: Double Descent and Proper vs. Improper Learning [pdf]
  • Lecture 8: The Low Degree Algorithm, Intersections of Halfspaces and Statistical Queries [pdf]
  • Lecture 9: More Lower Bounds, Landscape Design and Filtered PCA [pdf]
  • Lecture 10: An Interlude [pdf]
  • Lecture 11: Adversarial Examples and Data Poisoning [pdf]
  • Lecture 12: GANs, Equilibria and Compressed Sensing [pdf]
  • Lecture 13: Deep Reinforcement Learning [pdf]

Problem Sets

Materials and Links

Here are links to some other resources you might find helpful. Some of the material will come from relevant lectures in these courses:
