
CSC 311 Spring 2020: Introduction to Machine Learning

Machine learning (ML) is a set of techniques that allow computers to learn from data and experience, rather than requiring humans to specify the desired behaviour manually. ML has become increasingly central both in AI as an academic field, and in industry. This course introduces the main concepts and ideas in machine learning, and provides an overview of many commonly used machine learning algorithms. It also serves as a foundation for more advanced ML courses.

By the end of this course, the students will learn about (roughly categorized)

Students are expected to learn the intuition behind many machine learning algorithms and the mathematics behind them. Through homework assignments, they will also learn how to implement these methods and use them to solve simple machine learning problems. More details can be found in the syllabus and on Piazza.


Announcements:


Instructors:

Prof | Amir-massoud Farahmand | Emad A. M. Andrews
Email | csc311-2020-01@cs.toronto.edu | csc311-2020-01@cs.toronto.edu
Office hours | Thursday, 13-14 at BA2283 | Thursday, 20-22 at BA2283

Teaching Assistants:

Chunhao Chang, Rasa Hosseinzadeh, Julyan Keller-Baruch, Navid Korhani, Shun Liao, Ehsan Mehralian, Alexey Strokach, Jade Yu


Time & Location:

Section | Room | Lecture time | Tutorial time
L0101 | MS 2172 | Tuesday 13-15 | Thursday 14-15
L5101 | BA 1130 | Thursday 18-20 | Thursday 20-21

Suggested Reading

There are no required textbooks. Suggested reading will be posted after each lecture (see Lectures below).


Lectures

This is the tentative schedule, and it may change.

Week (date) | Topics | Lectures | Tutorial materials | Suggested reading
1 (Jan 6) | Introduction to ML & Nearest Neighbours | slides | Probability review slides & preliminaries | ESL 1; ESL 2.(1-3), 2.5
2 (Jan 13) | Decision Trees | slides & Example | Linear Algebra review slides | ESL 9.2
3 (Jan 20) | Linear regression; Optimization | slides | Optimization and CV slides, notebook, worksheet | PRML 3.2; ESL 2.9, 8.7; ESL 3.2
4 (Jan 27) | Bias-Variance Decomposition; Ensembles: Bagging; Linear Methods for Classification | slides | No tutorial | PRML 4.3; ESL 4.(1,2,4)
5 (Feb 3) | (Continued) Bias-Variance Decomposition; Ensembles: Bagging; Linear Methods for Classification | | Gaussian distribution; Bias-variance; Optimization for logistic regression slides [notes] |
6 (Feb 10) | Multiclass Classification; Neural Networks | slides | Midterm review slides, notes |
7 (Feb 24) | Support Vector Machines | slides | Invited Speaker, Terrance D’souza, 8:00pm at BA 1130 | ESL 12.(1-3)
8 (Mar 2) | Ensembles: Boosting | slides | Deep Learning with PyTorch (three modules covered) | ESL 10.(1-5)
9 (Mar 9) | Probabilistic models | slides | More on SVM and Kernel Methods | ESL 2.6.3, 6.6.3, 4.3.0
10 (Mar 16) | (Continued) Probabilistic models | Videos: Farahmand: MLE & Naive Bayes, Bayesian (Part 1), Bayesian (Part 2), GDA; Andrews: Probabilistic Models | | PRML 12.1
11 (Mar 23) | Principal Component Analysis | slides; Videos: Farahmand: PCA (Part 1), PCA (Part 2); Andrews: PCA | Matrix Factorizations & Recommender Systems; More on k-Means & EM Algorithm & EM demo on Kaggle; More on Recommender Systems & Team Dinosaur Planet | ESL 14.5.1
12 (Mar 30) | k-Means & EM Algorithm; Reinforcement Learning | KMeans slides; RL slides; Videos: Farahmand: Clustering (Part 1), Clustering (Part 2), RL (Part 1), RL (Part 2), RL (Part 3); Andrews: Clustering | | PRML 9.1-2; RL 3, 4.1, 4.4, 6.1-6.5

Homework Assignments

This is a tentative schedule of the homework assignments. Most of them will be released on a Thursday evening and will be due 10 days later. Please see the course information handout for detailed policies (marking, lateness, etc.).

Homework # | Out | Due | Materials | TA Office Hours
Homework 1 | Jan 23, 23:59 | Feb 3, 16:59 | Questions, Dataset | Jan 28 (Tuesday), 4-6PM at BA3289; Jan 31 (Friday), 3-5PM at BA3289
Homework 2 | Feb 6, 23:59 | Feb 17, 16:59 | Questions, Dataset | Feb 11 (Tuesday), 4:30-6:30PM at BA3289; Feb 14 (Friday), 3-5PM at BA3289
Homework 3 | Mar 5, 23:59 | Mar 16, 16:59 | Questions, Dataset | Mar 10 (Tuesday), 4-6PM at BA3289; Mar 13 (Friday), 11:30AM-1PM at BA4290
Homework 4 | Mar 23, 23:59 | Apr 3, 16:59 | Questions, Dataset | Mar 27 (Friday), 2-4PM at virtual office; Mar 31 (Tuesday), 4-6PM at virtual office

Reading Assignments

There will be five (5) reading assignments. These are selected from seminal papers in machine learning or closely related fields such as statistics. These papers complement the topics we cover in the course, or show you how a research paper in ML is written. You have to read them and try to understand as much as possible. Some papers are easy and some may be difficult. It is not important that you completely understand a paper, but you should put some effort into it.

We ask you to summarize each paper in a short paragraph (100-200 words) and try to come up with two suggestions on how the method described in the paper can be used or extended. These five assignments contribute 10% to your final mark.

The reading assignments are only lightly evaluated. You should submit your summaries before 5PM on Friday, April 10th. We randomly check some of your summaries to see whether they are in fact relevant to the assigned paper. If they are, you get the points. If they are not, and you submitted a filler, you will get 0% (even if only one of your submissions is a filler).

We will post the papers as the course progresses.


Computing Resources

For the homework assignments, we will use Python and libraries such as NumPy, SciPy, and scikit-learn. You have two options: