Value of Data and AI

Course overview

Many of the most valuable companies in the world and the most innovative startups have business models based on data and AI, but our understanding of the economic value of data, networks and algorithmic assets remains at an early stage. For example, what is the value of a new dataset or an improved algorithm? How should investors value a data-centric business such as Netflix, Uber, Google, or Facebook? And what business models can best leverage data and algorithmic assets in settings as diverse as e-commerce, manufacturing, biotech and humanitarian organizations? In this graduate seminar, we will investigate these questions by studying recent research on these topics and by hosting in-depth discussions with experts from industry and academia. Key topics will include the value of data quantity and quality in statistics and AI, business models around data, networks, scaling effects, economic theory around data, and emerging data protection regulations. Students will also conduct a group research project in this field.

Prerequisites

This course will require sufficient mathematical maturity to follow the technical content; some familiarity with data mining and machine learning and at least an undergraduate course in statistics are recommended. However, the course will be accessible to a wide audience including graduate students in computer science, engineering, economics, law and business.

Instructors

Matei Zaharia (matei@cs.stanford.edu)

James Zou (jamesz@stanford.edu)

Steve Eglash (seglash@stanford.edu)

TA

TBA

Office hours

TBA

Assignments

Course project (70%): The main assignment is a quarter-long project, which could range from original research to a case study of a particular company or industry. We will ask students to form small groups and submit a project proposal in the first two weeks of the course, and we will then meet with each group several times to gauge their progress and provide advice. Each group will give an in-class presentation at the end of the course, and possibly a mid-quarter presentation as well. Each group will also submit a 6-page final report.

Class participation (20%): Every student should read the assigned papers and actively engage in class discussions.

Class scribing (10%): Students will be responsible for scribing one class.

Schedule

Date Topic Readings

1/8 Introduction, examples of data and business models

 

1/10 Business value of data

 

1/15 Basics of data platforms, means of collection, Databricks case study

 

1/17 Basics of statistics and ML; Netflix case study

 

1/22 More stats and ML; healthcare/biotech case study

 

1/24 Business models and scaling effects

 

1/29 Guest speaker: Hal Varian (Chief Economist at Google), Google Ad Auction History

 

1/31 Data quality; case study

 

2/5 Guest speaker: Brad Peterson (CIO at NASDAQ), NASDAQ Data Products

 

2/7 Data valuation; case study

 

2/12 Guest speaker: economics of data

 

2/14 Consumer privacy and data security

 

2/19 ML accountability and fairness

 

2/21 Project update presentations

 

2/26 Regulations about data value and data sharing

 

2/28 Guest speaker: data vendor

 

3/4 Data marketplaces

 

3/6 How do VCs value data-driven startups

 

3/11 Guest speaker: data ethics

 

3/13 Wrap up: the future of data and AI

 

 
