Course Syllabus

Course Description. This advanced graduate course explores in depth several important classes of algorithms in modern machine learning. We will focus on understanding the mathematical properties of these algorithms in order to gain deeper insight into when and why they perform well. We will also study applications of each algorithm in interesting, real-world settings. Topics include: spectral clustering, tensor decomposition, Hamiltonian Monte Carlo, adversarial training, and variational approximation. Lectures are supplemented with paper discussions, and a research project is a significant component of the class.

Prerequisites: Probability (CS 109), linear algebra (Math 113), machine learning (CS 229), and some coding experience.

Instructor: James Zou

TA: Qijia Jiang

Lectures: Mondays 2:30-4:20pm in 50-52H

Paper discussions: Fridays 1:30-2:20pm in McCullough 122 (starting 4/14).

Office hours: James, Mondays 4:30-6pm in Packard 253.

Piazza Page:

Assignments (see guidelines below):

Project: 50%  (team size flexible)

Scribing: 10%  (teams of 3-4)

Paper presentation: 20%  (teams of 3-4)

Participation (esp. in the paper discussions): 10% 

Assignment: 10%


Material & Relevant Reading (subject to change)

Paper Presentation


Week 1 (4/3): Random geometry in high dimensions and applications.

Material. Random geometry in high dimensions. Johnson-Lindenstrauss projections. Locality-sensitive hashing.

Relevant reading.

  1. Chapter 2.
  2. Charikar SimHash paper.

No Paper Discussion session this week.

Sign up for scribing and paper presentation. Link to sign-up sheet:

Week 2 (4/10): Spectral methods 1

Material. SVD as best low rank approximation. Tensor decomposition for mixture of Gaussians. Power method.

Relevant reading.


Paper presentation: Spectral meta-learner.

Week 3 (4/17): Spectral methods 2

Material. Tensor decomposition for LDA and trees. Spectral clustering.  

Relevant reading.

  1. Moitra chapter 3.
  2. Spectral methods for dimensionality reduction.
  3. Tutorial on spectral clustering.
  4. Laplacian Eigenmap.

Paper presentation: Tensor decomposition for signal processing and machine learning.

Form final project groups.

Week 4 (4/24): Sampling. Hamiltonian Monte Carlo.

Material. Review of MCMC. Hamiltonian Monte Carlo.

Relevant reading.

  1. HMC demos.
  2. Conceptual overview of HMC.
  3. Stochastic gradient Langevin dynamics (Welling & Teh ’11).

Paper presentation: Firefly Monte Carlo.

Homework out 4/24.

Week 5 (5/1): Variational inference 1

Material. Basic variational inference and examples. Stochastic variational inference.

Relevant reading.

  1. Stochastic variational inference.
  2. VAE.
  3. Normalizing flows.
  4. Streaming variational Bayes.

Paper presentation: Black box variational inference.

Homework due 5/4 noon.

Week 6 (5/8): Variational inference 2

Material. Stochastic variational inference. Variational auto-encoder.

Paper presentation: Semi-supervised learning with deep generative models.

Week 7 (5/15): Project proposal presentations

Project proposal presentation in class.

Paper presentation: Learning from imbalanced data.

Week 8 (5/22): Robust ML 1: adversarial training

Material. Adversarial attacks.

Relevant reading. 

  1. Explaining and harnessing adversarial examples.

Paper presentation: Domain adversarial training of neural networks.


Week 9: Holiday, no class

Holiday, no class. 

No paper discussion this week. Sign up for individual project meetings with James. 

Project update meetings with James.

Week 10 (6/5): Robust ML 2: robust optimization, covariate shift.

Material. Robust optimization. Covariate shift.

Relevant reading.

  1. Covariate shift adaptation.
  2. Automated ML.

No paper discussion session this week.

Final Week (6/9): Final presentation

Oral presentation in class.

8-page NIPS-style paper due 6/12 at 11am.

Assignment guidelines.

Scribing: Each student should sign up to scribe one lecture. Students scribing the same lecture should work together to produce one document using the LaTeX template provided in Files. The LaTeX files and PDF should be emailed by Thursday noon of the week of the lecture. The document will be evaluated for clarity, comprehensiveness, and accuracy (i.e., no typos).

Paper presentation: Each student should sign up to present one paper. Students presenting the same paper should work together to prepare a 40-minute whiteboard talk on the paper (at most two slides are allowed). The talk should be self-contained: give the background and motivation, the intuition, and the key results from the paper that you find most interesting. You do not need to cover everything; present derivations only if they convey insight. Clarity will be the main evaluation criterion. All students are expected to read the assigned paper and participate in the discussions.

Homework assignment: Each student should submit one PDF solution. It's OK to help each other, but each person should complete their own assignment.

Final project: This is the main component of the course; start early! This should be a research project that is related to the course material. You are free to select your own topic and work in teams, but please check in with the instructor. The project could be empirical (e.g. applying some of the methods we discuss to your data), theoretical (e.g. proving some algorithmic properties), or methodological (e.g. developing new methods). Please use the NIPS template provided for the final write-up.
