CS 534: Machine Learning

Course Overview

Machine learning is an exciting field of computer science that impacts many applications, both consumer-facing (e.g., Microsoft Kinect, iPhone's Siri, Netflix recommendations) and in science and medicine (e.g., predicting genome-protein interactions, detecting tumors, personalized medicine). In this course, students will learn the fundamental theory and algorithms of machine learning and how to apply them to solve problems.

Prerequisites: A course in linear algebra and previous exposure to statistics and probability theory. Homeworks and the project will require programming ability in Python, MATLAB, or R.


Schedule (Tentative)

# Date Topic References Assignments
Introduction
1 8/23 Overview & Course Logistics
  • Homework 0 (due 8/27)
2 8/29 Crash Course in Optimization and Statistics
Supervised Learning I
3 8/31 Linear Regression
4 9/5
5 9/7 Naive Bayes & Linear Classification
  • Chapter 2.1 - 2.4 (Hastie et al.)
  • Chapter 4.1 - 4.4 (Hastie et al.)
  • Chapter 3.2 - 3.5 (Mitchell)
  • Homework 1 (due 9/22)
6 9/12
Model Assessment & Selection
7 9/14 Bias & Variance Tradeoff
  • Chapters 7, 8 (Hastie et al.)
8 9/19 Model Assessment
9 9/21 Bootstrap & Model Selection
  • Homework 2 (due 10/6)
10 9/26
Supervised Learning II
11 9/28 Boosting, Trees & Additive Models
  • Chapters 9, 10 (Hastie et al.)
  • Chapter 3 (Mitchell)
  • Cynthia Rudin's notes on decision trees
12 10/3
13 10/5 Ensembles & Random Forests
  • Chapters 15, 16 (Hastie et al.)
  • Breiman's paper on random forests
  • Homework 3 (due 10/20)
14 10/12 Support Vector Machines
15 10/17
16 10/19 Neural Networks
  • Chapter 11 (Hastie et al.)
  • Chapters 1-3 (Nielsen)
  • Homework 4 (due 11/3)
17 10/24
18 10/26 K Nearest Neighbors
  • Chapter 13 (Hastie et al.)
Unsupervised Learning
19 10/31 Dimensionality Reduction
20 11/2 Clustering & Mixture Models
Midterm
21 11/7 Midterm
22 11/9 Project Madness & TBD
  • Homework 5 (due 12/1)
Other Topics
23 11/14 Hidden Markov Models
24 11/16 Deep Learning
25 11/21
26 11/28 Topic Models
27 11/30 Recommendation Systems
Project Presentations
28 12/5 Presentations
29 12/7

Course Logistics

Piazza

All announcements, assignment clarifications, and slide corrections will be posted on Piazza. Make sure to check the site on a regular basis.

Office Hours

Joyce Ho: M 1:30 PM - 3:30 PM, W 9:30 AM - 12:00 PM @ MSC W414
Note: Office hours may change from time to time, in which case an announcement will be made on Piazza.

Textbook

  • Required: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani & Jerome Friedman
  • Supplemental: Machine Learning: a Probabilistic Perspective, by Kevin Murphy
  • Supplemental: Pattern Recognition and Machine Learning, by Christopher Bishop

Course Grading

Component      Weight
Homeworks      35.0%
Midterm        17.5%
Project        40.0%
Participation   7.5%
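
As an illustration of how these weights combine, the following is a minimal sketch that computes a weighted course score. It assumes each component is scored on a 0-100 scale; the function name and the example scores are hypothetical, and the sketch does not model project novelty bonus points or any mapping to letter grades, neither of which is specified here.

```python
# Weights from the Course Grading table (percent of the final grade).
WEIGHTS = {
    "Homeworks": 35.0,
    "Midterm": 17.5,
    "Project": 40.0,
    "Participation": 7.5,
}


def final_score(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    # Sanity check: the weights in the table sum to 100%.
    assert abs(sum(WEIGHTS.values()) - 100.0) < 1e-9
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


# Hypothetical scores, for illustration only.
example = {"Homeworks": 90, "Midterm": 80, "Project": 85, "Participation": 100}
print(final_score(example))  # 87.0
```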

Project

You are encouraged to work in groups of 2-3 on the term project, which involves analyzing a real-world dataset. The goal is either to develop a novel algorithm (novelty bonus points will be given depending on the level of difficulty) or to try various existing ML algorithms on the dataset. The project is a critical part of the course and a significant factor in determining your grade. Teams are required to hand in a project proposal and a final project report, and to prepare two presentations on their work.

By default, all team members will receive the same score for their project. If a team feels that this is unfair, perhaps due to HIGHLY imbalanced contributions, then every team member needs to provide feedback on the contribution of each of the other team members via email before submission of the final report. After that, I will meet with all the members together to mediate.

More details on projects are posted on Piazza under the projects folder.

Component     Due Date                Weight
Proposal      10/25                   15%
Madness       11/9 (in class)         10%
Presentation  12/5-12/7 (in class)    25%
Report        12/12                   50%

Homework & Exam Policies

Homework:

  • Homework is due electronically on Gradescope/Canvas at 11:59 PM.
  • Each student receives 6 late days that can be used across the 5 homeworks throughout the semester. Each late day extends the deadline by 24 hours.
  • A maximum of 3 late days can be used on a given homework.
  • Late days apply to the entire homework, so handing in one problem late counts as a late day towards the whole homework.
  • No credit will be given if you submit the homework late and have run out of late days.
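
The late-day rules above can be sketched as a small bookkeeping routine. This is only an illustration, not the official grading procedure. It assumes that any fraction of a 24-hour period counts as a full late day, that a homework needing more than 3 late days receives no credit, and that a submission receiving no credit does not consume late days. None of these details is spelled out in the policy, and all function and variable names are hypothetical.

```python
import math

TOTAL_LATE_DAYS = 6    # per student, per semester
MAX_PER_HOMEWORK = 3   # cap on late days for a single homework


def late_days_needed(hours_late):
    """Late days needed for a submission that is hours_late past the deadline.

    Assumption: a partial 24-hour period counts as a full late day.
    """
    return 0 if hours_late <= 0 else math.ceil(hours_late / 24)


def apply_late_policy(hours_late_per_hw):
    """Return (list of credit flags, late days remaining) for a sequence of homeworks."""
    remaining = TOTAL_LATE_DAYS
    credited = []
    for hours in hours_late_per_hw:
        needed = late_days_needed(hours)
        ok = needed <= MAX_PER_HOMEWORK and needed <= remaining
        if ok:
            remaining -= needed  # assumption: only credited submissions use late days
        credited.append(ok)
    return credited, remaining


# Hypothetical semester: on time, 5 h late, 30 h late, 72 h late, 1 h late.
print(apply_late_policy([0, 5, 30, 72, 1]))
# ([True, True, True, True, False], 0)
```

In this hypothetical semester the first four homeworks use all 6 late days (0 + 1 + 2 + 3), so the fifth, although only one hour late, receives no credit.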

Exam:

  • The midterm must be taken at the required time.
  • Requests for rescheduling the midterm exam will only be considered if the request is made at least a week prior to the exam date.

Honor Code

All class work is governed by the College Honor Code and Departmental Policy. Discussing homeworks with other students is acceptable and encouraged. However, any such discussion should be noted on your submitted homework, and all code and writeups must be your own work. Any code or writeup that is found to be similar is grounds for an honor code investigation by the Director of Graduate Studies, Laney Graduate School, and the honor council.