Course Syllabus

Course Description. This advanced graduate course explores in depth several important classes of algorithms in modern machine learning. We will focus on understanding the mathematical properties of these algorithms in order to gain deeper insight into when and why they perform well. We will also study applications of each algorithm to interesting, real-world settings. Topics include: spectral clustering, tensor decomposition, Hamiltonian Monte Carlo, adversarial training, and variational approximation. We will supplement the lectures with paper discussions, and there will be a significant research project component to the class.

Prerequisites: Probability (CS 109), linear algebra (Math 113), machine learning (CS 229), and some coding experience.

Instructor: James Zou (jamesz@stanford.edu)

TA: Qijia Jiang (qjiang2@stanford.edu)

Lectures: Mondays 2:30-4:20pm in 50-52H

Paper discussions: Fridays 1:30-2:20pm in McCullough 122 (starting 4/14).

Office hours: James, Mondays 4:30-6pm in Packard 253.

Piazza Page: piazza.com/stanford/spring2017/cs329m/home

Assignments (see guidelines below):

Project: 50%  (team size flexible)

Scribing: 10%  (teams of 3-4)

Paper presentation: 20%  (teams of 3-4)

Participation (esp. in the paper discussions): 10% 

Homework assignment: 10%

Schedule (subject to change). Each week lists: Material & Relevant Reading, Paper Presentation, Assignments.

Week 1 (4/3): Random geometry in high dimensions and applications.

Material. Random geometry in high dimensions. Johnson-Lindenstrauss projections. Locality-sensitive hashing.

Relevant reading.

  1. http://www.cs.cornell.edu/jeh/bookMay2015.pdf Chapter 2.
  2. Charikar SimHash paper. https://www.cs.princeton.edu/courses/archive/spr04/cos598B/bib/CharikarEstim.pdf

No paper discussion session this week.

Sign up for scribing and paper presentation. Link to sign-up sheet: https://docs.google.com/spreadsheets/d/1FTjZs5QSzCV5Fzk3OybpJfBgYXGmJhRyexMD_BgCSbc/edit#gid=0

Week 2 (4/10): Spectral methods 1

Material. SVD as best low rank approximation. Tensor decomposition for mixture of Gaussians. Power method.

Relevant reading.

  1. http://www.offconvex.org/2015/12/17/tensor-decompositions/
  2. https://arxiv.org/pdf/1210.7559.pdf

Paper presentation. Spectral meta-learner. http://www.pnas.org/content/111/4/1253.full.pdf?with-ds=yes

Week 3 (4/17): Spectral methods 2

Material. Tensor decomposition for LDA and trees. Spectral clustering.  

Relevant reading.

  1. Moitra chapter 3. http://people.csail.mit.edu/moitra/docs/bookex.pdf
  2. Spectral methods for dimensionality reduction. https://cseweb.ucsd.edu/~saul/papers/smdr_ssl05.pdf
  3. Tutorial on spectral clustering. http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/luxburg06_TR_v2_4139%5b1%5d.pdf
  4. Laplacian Eigenmap. http://yeolab.weebly.com/uploads/2/5/5/0/25509700/belkin_laplacian_2003.pdf

Paper presentation. Tensor decomposition for signal processing and machine learning. https://arxiv.org/pdf/1607.01668v2.pdf

Form final project groups.

Week 4 (4/24): Sampling. Hamiltonian Monte Carlo.

Material. Review of MCMC. Hamiltonian Monte Carlo.

Relevant reading.

  1. HMC demos. https://arogozhnikov.github.io/2016/12/19/markov_chain_monte_carlo.html
  2. Conceptual overview of HMC. https://arxiv.org/pdf/1701.02434.pdf
  3. Stochastic gradient Langevin dynamics (Welling & Teh, 2011).

Paper presentation. Firefly Monte Carlo. https://arxiv.org/pdf/1403.5693.pdf (code: https://github.com/HIPS/firefly-monte-carlo)

Homework out 4/24.

Week 5 (5/1): Variational inference 1

Material. Basic variational inference and examples. Stochastic variational inference.

Relevant reading.

  1. Stochastic variational inference. http://www.columbia.edu/~jwp2128/Papers/HoffmanBleiWangPaisley2013.pdf
  2. VAE. https://arxiv.org/abs/1312.6114
  3. Normalizing flows. http://jmlr.org/proceedings/papers/v37/rezende15.pdf
  4. Streaming variational Bayes. https://papers.nips.cc/paper/4980-streaming-variational-bayes.pdf

Paper presentation. Black box variational inference. http://www.cs.columbia.edu/~blei/papers/RanganathGerrishBlei2014.pdf

Homework due 5/4 noon.

Week 6 (5/8): Variational inference 2

Material. Stochastic variational inference. Variational auto-encoder.

Paper presentation. Semi-supervised learning with deep generative models. http://papers.nips.cc/paper/5352-semi-supervised-learning-with-deep-generative-models.pdf

Week 7 (5/15): Project proposal presentations

Project proposal presentation in class.

Paper presentation. Learning from imbalanced data. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=5128907

Week 8 (5/22): Robust ML 1: adversarial training

Material. Adversarial attacks.

Relevant reading. 

  1. Explaining and harnessing adversarial examples. https://arxiv.org/pdf/1412.6572.pdf
  2. https://blog.openai.com/adversarial-example-research/

Paper presentation. Domain adversarial training of neural networks. http://jmlr.org/papers/volume17/15-239/15-239.pdf

Week 9 (5/29): Holiday, no class

Holiday, no class. 

No paper discussion this week. Sign up for individual project meetings with James. 

Project update meetings with James.

Week 10 (6/5): Robust ML 2: robust optimization, covariate shift.

Material. Robust optimization. Covariate shift.

Relevant reading.

  1. Covariate shift adaptation. http://www.jmlr.org/papers/volume8/sugiyama07a/sugiyama07a.pdf
  2. Automated ML. https://papers.nips.cc/paper/5872-efficient-and-robust-automated-machine-learning.pdf

No paper discussion session this week.

Final Week (6/9): Final presentation

Oral presentation in class.

8-page NIPS-style paper due 6/12 at 11am.

Assignment guidelines.

Scribing: Each student should sign up to scribe one lecture using the sign-up sheet linked under Week 1. Students scribing the same lecture should work together to produce one document using the LaTeX template provided in Files. The LaTeX files and PDF should be emailed to qjiang2@stanford.edu by Thursday noon of the week of the lecture. The document will be evaluated for clarity, comprehensiveness, and accuracy (i.e. no typos).

Paper presentation: Each student should sign up to present one paper using the sign-up sheet linked under Week 1. Students presenting the same paper should work together to prepare a 40-minute whiteboard talk on the paper (at most two slides are allowed). The talk should be self-contained: give the background and motivation, the intuition, and the key results from the paper that you find most interesting. You do not need to cover everything; present derivations only if they convey insight. Clarity will be the main evaluation criterion. All students are expected to read the assigned paper and participate in the discussions.

Homework assignment: Each student should submit one PDF solution. It's ok to help each other, but each person should complete his/her own assignment. 

Final project: This is the main component of the course; start early! This should be a research project related to the course material. You are free to select your own topic and work in teams, but please check in with the instructor. The project could be empirical (e.g. applying some of the methods we discuss to your data), theoretical (e.g. proving some algorithmic properties), or methodological (developing new methods). Please use the NIPS template provided for the final write-up.
