Course Syllabus

Course overview

Many of the most valuable companies in the world and the most innovative startups have business models based on data and AI, but our understanding of the economic value of data, networks, and algorithmic assets remains at an early stage. For example, what is the value of a new dataset or an improved algorithm? How should practitioners understand and exploit the AI service market to obtain accurate and cheap predictions? How should investors value a data-centric business such as Netflix, Uber, Google, or Facebook? And what business models can best leverage data and algorithmic assets in settings as diverse as e-commerce, manufacturing, biotech, and humanitarian organizations? In this graduate seminar, we will investigate these questions by studying recent research on these topics and by hosting in-depth discussions with experts from industry and academia. Key topics will include the value of data quantity and quality in statistics and AI, AI service marketplaces, business models, social good and justice around data, economic theory, regulation around data, and emerging data protection regulations. Guest speakers from NASDAQ, Facebook, and other organizations will present industry perspectives on data and AI. Students will also conduct a hands-on research project in groups.

Location: Mondays and Wednesdays at 1:30pm-3pm in room 200-203. Please note that the first two weeks will be on Zoom -- join our meeting from the Zoom tab in the Canvas sidebar. 

Announcements and Questions

We're using Ed Discussion for all course communications. Access it from the sidebar in Canvas.

Prerequisites

This course will require sufficient mathematical maturity to follow the technical content; some familiarity with data mining and machine learning and at least an undergraduate course in statistics are recommended. However, the course will be accessible to a wide audience including graduate students in computer science, engineering, economics, law and business.

Instructors

Matei Zaharia (matei@cs.stanford.edu)

James Zou (jamesz@stanford.edu)

Steve Eglash (seglash@stanford.edu)

TA

Alex Ke (alexke@stanford.edu)

Scribe notes should be sent to the TA within one week after each lecture.

Office Hours

The instructors are all available to meet with you.  Please email the instructors if you'd like to meet. 

Assignments

Course project (65%): The main assignment is a quarter-long project, which could range from original research to a case study of a particular company or industry. We will ask students to form small groups and submit a project proposal in the first two weeks of the course, and we will then meet with each group to gauge their progress and provide advice. Each group will do interim and final presentations. Each group will also submit a final report (up to 8 pages).

Jan 21: 2-page proposals due (example 1, example 2)

Feb 2: Interim presentation (~5 mins)

Mar 7, 9: Final presentation (~10 mins)

Mar 14: Final project report

Class participation (20%): every student should read the assigned papers and actively engage in class discussions.

Class scribing (15%): students will be responsible for scribing one class. Good scribing should supplement the class discussion with additional readings. 

Schedule (Tentative)

Date Topic

Readings (required in red)

1/3

Introduction, examples of data and business models

(Lecture slides, Lecture notes)

  1. AI and economics
  2. AWS data exchange
  3. Flatiron cancer data
  4. How to read a paper

1/5

Guest talk on NASDAQ data products: Brad Peterson (NASDAQ CTO/CIO) and Bill Dague (Head of Alternative Data)

(Lecture slides, Lecture notes)

1/10

Business value of data, Netflix case study

(Lecture slides, Lecture notes)

  1. Netflix recommender system
  2. Data inverting

1/12

Case studies of ML applications and ways to evaluate ML impact (Lecture notes)

  1. Bandito
  2. Workforce Applications

1/17

MLK Day (No Lecture)

1/19

Design of data platforms, Databricks case study

(Lecture slides, Lecture notes)

  1. Evolution of decision support systems (Ch. 1 of Building the Data Warehouse, up to page 18)
  2. How to build an analytics team for impact in an organization

1/24

Data valuation (Lecture slides, Lecture notes)

  1. HAI article on data valuation
  2. Data Shapley

1/26

Guest speaker: Joellen Russell (University of Arizona), Data and AI for Addressing Climate Change (Lecture notes)

NY Times article

Bronselaer_etal_2020_NatGeo.pdf 

Eyring Nature Climate Change 2019.pdf

Bronselaer Nature 2018.pdf  

1/31

ML as a service market (Lecture notes)

  1. FrugalML
  2. Competition over data

2/2

Student project interim presentations

 

2/7

Building data pipelines for AI. Guest speaker: Peter Hallinan (AWS)

(Lecture slides, Lecture notes)

  1. Hidden Technical Debt in Machine Learning Systems
  2. Data Validation for Machine Learning

2/9

Data market panel with AWS, Snowflake, GCP, and Ticksmith. Speakers: Prasanna Krishnan, Noah Schwartz, Brian Welcker, Nicolas Doyen

(Lecture notes)

2/14

Guest speaker: Keith Winstein (Stanford) (Lecture notes)

2/16

Business models and scaling effects; data + AI for health (Lecture slides (Steve), Lecture slides (James), Lecture notes)

  1. WeWork: Blitzscaling or Blitzflailing?
  2. Reid Hoffman Shares Lessons
  3. The fundamental problem with Silicon Valley's favorite growth strategy
  4. Response to The fundamental problem...

2/21

President's Day (No Lecture)
 

2/23

Virtual fireside chat: Reid Hoffman, entrepreneur and venture capitalist (founder, LinkedIn; partner, Greylock) (Lecture notes)

2/28

Guest speaker: David Engstrom (Stanford Law School) (Lecture notes)

  1. Government by Algorithm
  2. The new judicial governance

3/2

Privacy, security, and regulations

(Lecture slides, Lecture notes)

  1. GDPR
  2. California Consumer Privacy Act
  3. NY Times cell phone tracking

3/7

Student project final presentations 

 

3/9

Student project final presentations

 
