IntroML-Fall2021

CSC 2515 Fall 2021: Introduction to Machine Learning

Machine learning (ML) is a set of techniques that allow computers to learn from data and experience, rather than requiring humans to specify the desired behaviour manually. ML has become increasingly central both in AI as an academic field, and in industry. This course introduces the main concepts and ideas in machine learning, and provides an overview of many commonly used machine learning algorithms. It also serves as a foundation for more advanced ML courses.

By the end of this course, the students will learn about (roughly categorized)

Machine Learning Problems: Supervised (regression and classification), Unsupervised (clustering, dimension reduction), Reinforcement Learning
Models: Linear and Nonlinear (Basis Expansion and Neural Networks)
Loss functions: Squared Loss, Cross Entropy, Hinge, Exponential, etc.
Regularizers: l1 and l2
Probabilistic viewpoint: Maximum Likelihood Estimation, Maximum A Posteriori, Bayesian inference
Bias and Variance Tradeoff
Ensemble methods: Bagging and Boosting
Optimization technique in ML: Gradient Descent and Stochastic Gradient Descent

The students are expected to learn the intuition behind many machine learning algorithms and the mathematics behind them. Through homework assignments, they will also learn how to implement these methods and use them to solve simple machine learning problems. More details can be found in syllabus (to be posted) and piazza.

Announcements:

Teaching Staff:

Instructor: Amir-massoud Farahmand
- Email: csc2515-2021-09@cs.toronto.edu
- Office Hour: Thursdays, 1-2PM (on Zoom)
TAs: Andrew Li, Ekansh Sharma, Marta Skreta, Zining Zhu
- Office hours: Check the schedule of homework assignments as they vary. (on Zoom)

Time & Location:

Lectures:
- Location: Room MS 2172 + Virtual (preferred to reduce chance of COVID)
- Time: Tuesdays, 11AM-1PM
Tutorials:
- Location: Virtual
- Time: Thursdays, 2-3PM

Lectures

This is the very tentative schedule, especially the parts related to tutorials, and it will change.

Note on videos: The videos will be publicly available on YouTube. If you don’t feel comfortable being recorded, make sure to turn off your camera when asking questions (though I really prefer to see all your faces when presenting a lecture).

Week (date)	Topics	Lectures	Tutorial materials	Suggested reading
1 (Sept 14)	Introduction to ML & K-Nearest Neighbours	slides video - part 1 video - part 2	Probability review slides & preliminaries Time: Thursday, Sept. 16, 4-5PM	ESL 1. ESL 2.(1-3), 2.5, 13.3
2 (Sept 21)	Decision Trees	slides video - part 1 video - part 2	Linear Algebra review slides	ESL 9.2
3 (Sept 28)	(Continued) Decision Trees Linear Models for Regression and Classification	slides video - part 1 video - part 2 video - part 3 video - part 4	Optimization and CV - notebook, worksheet (zip)	ESL 3.(1,2,4), 4.(1,2,4); PRML 4.3
4 (Oct 5)	(Continued) Linear Models for Regression and Classification		Gaussian distribution slides
5 (Oct 12)	(Continued) Linear Models for Regression and Classification Bias-Variance Decomposition and Ensembles: Bagging	slides video - part 1 video - part 2	Use OH and Tutorial (Oct 14, 1-3PM) to cover Bias-Variance	ESL 7.3, 8.7, 15.1-2
6 (Oct 19)	(Continued) Bias-Variance Decomposition and Ensembles: Bagging
7 (Oct 26)	Support Vector Machines & Kernel Methods & Ensembles: Boosting	slides video - part 1 video - part 2 video - part 3	Kernel methods	ESL 4.5.2, 12.(1-3), 10.(1-5)
8 (Nov 2)	(Continued) Support Vector Machines & Kernel Methods & Ensembles: Boosting Neural Networks	slides video - part 1 video - part 2		notes
9 (Nov 16)	(Continued) Neural Networks Probabilistic Models	slides video - part 1	NN with PyTorch Colab (blank) Colab slides with ConvNet	ESL 2.6.3, 6.6.3, 4.3.0
10 (Nov 23)	(Continued) Probabilistic Models Principal Component Analysis	slides	Use for lecture	ESL 14.5.1
11 (Nov 30)	(Continued) Principal Component Analysis K-Means Clustering	slides	Probabilistic Programming Fairness	PRML 9.1-2
12 (Dec 7)	Reinforcement Learning Statistical Learning Theory (could not fit; maybe post later)	slides	Policy Gradient	RL 3, 4.1, 4.4, 6.1-6.5

Note on slides: Many have contributed to the design of these slides. Credit goes to many members of the ML Group at the U of T, and beyond, including (recent past, as far as I know): Roger Grosse, Murat Erdogdu, Richard Zemel, Juan Felipe Carrasquilla, Emad Andrews, and myself.

Assignments and Courswork

These are the main components of the course. The details are described below. You need to use MarkUs to submit your solutions.

Note: This is tentative

Four homework assignments (40%): 10% each
Take-home Test (10%): Release date: Dec 4, Due date: Dec 6
Project (30%)
Reading Assignments (10%): Due date: Dec 10
Questions & Answers (10%)
Bonus (5%): Finding typos in the slides, active class participation, evaluating the class, etc.

Homework Assignments

This is a tentative schedule of the homework assignments. We plan to release them on Tuesday evenings and they will be due in 10 days (Monday of two weeks after), bit if we haven’t covered the topic of the homework yet, we postpone it accordingly.

Homework #	Out	Due	Materials	TA Office Hours
Homework 1	Sept 28, 23:59	Oct 8, 16:59	Questions Dataset	(1) Friday (Oct 1), 11AM-12PM (2) Wednesday (Oct 6), 1-2PM
Homework 2	Oct 14, 23:59	Oct 25, 16:59	Questions Helper code	(1) Monday (Oct 18), 11AM-12PM (2) Friday (Oct 22), 4-5PM
Homework 3	Nov 16, 23:59	Nov 26, 16:59	Questions Dataset	(1) Monday (Nov 22), 2-3PM (2) Wednesday (Nov 24), 2-3PM
Homework 4	Nov 30, 23:59	Dec 10, 16:59	Questions Dataset	(1) Monday (Dec 6), 11AM-12PM (2) Wednesday (Dec 8), 1-2PM

Research Project

Read the instruction carefully!

Proposal (5%): Nov 15
Project Report (25%): Dec 13

Note: If the number of teams is too many (more than 15-20) or we are late on the main content of the class, we may skip the presentation, and add this 5% to Project Report (making it 25%). Update: Since the number of students are very large, there will not be a presentation.

Reading Assignments

The following papers are a combination of seminal papers in ML, topics that we didn’t cover in lectures, or active research areas. You need to choose five (5) papers out of them, depending on your interest. We will post the papers as the course progresses (so check here often). Please read them and try to understand them as much as possible. It is not important that you completely understand a paper or go into detail of the proofs (if there is any), but you should put some effort into it.

After reading each paper:

You should summarize it in a short paragraph (100-200 words). Highlight the main points of the paper. Ignore the less interesting aspects.
Try to come up with one or two suggestions on how the method/idea described in the paper can be used or extended.

Note: that this is an incomplete and biased list. I have many favourite papers that are not included in this short list.

Ming Yuan and Yi Lin, “Model selection and estimation in regression with grouped variables,” Journal of Royal Statistical Society (B), 2006. PDF
Ali Rahimi and Recht, “Random Features for Large-Scale Kernel Machines,” NIPS, 2007. PDF
Leon Bottou and Olivier Bousquet, “The Tradeoffs of Large Scale Learning,” Advances in Neural Information Processing Systems (NeurIPS), 2007. PDF
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, “Imagenet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems (NeurIPS), 2012. PDF
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Deep Residual Learning for Image Recognition,” CVPR, 2016. PDF
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, et al., “Human-level control through deep reinforcement learning,” Nature, 2015. PDF
Martin Zinkevich, “Online Convex Programming and Generalized Infinitesimal Gradient Ascent,” ICML, 2003. PDF
Ian Goodfellow, Jonathan Shlens, and Christian Szegedy, “Explaining and Harnessing Adversarial Examples,” ICLR, 2015. PDF
Niranjan Srinivas, Andreas Krause, Sham Kakade, Matthias Seeger, “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design,” ICML, 2010. PDF
Moritz Hardt, Eric Price, and Nathan Srebro, “Equality of opportunity in supervised learning,” Advances in Neural Information Processing Systems (NeurIPS), 2016. PDF
Ronan Collobert and Jason Weston, “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning,” International Conference on Machine Learning (ICML), 2008. PDF
Jeffrey Pennington, Richard Socher, and Christopher D. Manning, “GloVe: Global Vectors for Word Representation,” Empirical Methods in Natural Language Processing (EMNLP), 2014. PDF

We will post the papers as the course progresses.

Questions & Answers

The goal is to encourage you to reflect on the content of each lecture. You do this by writing down one or two questions based on the content of that lecture. You also need to write down your thoughts on the answer for those questions. You do not need to answer them successfully or completely. It is enough to show that you have seriously thought about them. You have until 5PM of the Monday after the lecture is finished to submit your Q&A.

Computing Resources

For the homework assignments, we will use Python, and libraries such as NumPy, SciPy, and scikit-learn. You have two options:

The easiest option is probably to install everything yourself on your own machine.
- If you don’t already have python, install it. We recommend using Anaconda. You can also install python directly if you know how.
- Optionally, create a virtual environment for this class and step into it. If you have a conda distribution run the following commands:
```
  conda create --name csc2515
  source activate csc2515
```
- Use pip to install the required packages pip install scipy numpy autograd matplotlib jupyter sklearn
All the required packages are already installed on the Teaching Labs machines.