Machine learning (ML) is a set of techniques that allow computers to learn from data and experience, rather than requiring humans to specify the desired behaviour manually. This course focuses on Neural Network (NN) models and the Deep Learning (DL) approach to designing ML systems. NNs and DL have become increasingly central both in AI as an academic field and in industry. This course gives an overview of both the foundational ideas and the recent advances in neural network algorithms.
By the end of this course, students are expected to understand the intuition behind many NN architectures and DL algorithms. Through homework assignments, they will also learn how to implement these methods and use them to solve machine learning problems.
More details can be found in the syllabus and on Piazza.
Instructors
Lead TA: Claas Voelcker
TAs: TBA
Contact: csc413-2024-01@cs.toronto.edu
(Do not contact the instructors or TAs individually.)
Forum: Piazza
Submissions: MarkUs
Lecture | Time | Location | Instructor |
---|---|---|---|
LEC 0101/2101 | R1-4 | SF | Amir-massoud |
LEC 0201/2001 | T1-4 | MC | Amanjit |
LEC 5101/2501 | T6-9 | BA | Rupert |
Tutorials are in the third hour of each lecture. See ACORN for exact rooms.
This is a tentative schedule, especially the parts related to tutorials and the assignments.
Week | Date | Lecture Topic | Material | Suggested Reading |
---|---|---|---|---|
01 | Jan.08 | Introduction and Review of Linear Models | Slides | |
02 | Jan.15 | Multi-layer Feedforward NN and Backpropagation | Slides | |
03 | Jan.22 | Automatic differentiation, distributed representation, and GloVe | Slides | |
04 | Jan.29 | Convolutional Neural Networks | Slides | |
05 | Feb.05 | Optimization | Slides | Roger Grosse’s notes on Optimization, Christian Perone’s slides on Gradient-based Optimization, More on Momentum-based gradient descent, Distill: Why Momentum Really Works, Sebastian Ruder’s blog-post on gradient descent. |
06 | Feb.12 | More on Optimization & Generalization | Slides | Lilian Weng’s blog-post on the link between overfitting and generalization in deep learning. |
– | Feb.19 | (Reading Week) | ||
07 | Feb.26 | CNN Features Visualization and Adversarial Examples | Slides | Distill: Feature Visualization, Visualizing and Understanding Convolutional Networks, Sanity Checks for Saliency Maps. |
08 | Mar.04 | Recurrent Neural Networks | Slides | Jürgen Schmidhuber’s Tutorial on LSTM |
09 | Mar.11 | Attention Mechanism | Slides | Distill: Attention and Augmented Recurrent Neural Networks, The Annotated Transformer, see also repo: step-by-step implementation of Transformers for seq2seq tasks. |
10 | Mar.18 | Generative Models: (Variational) Autoencoders | Slides | Autoencoder Notebook, Deep Dive: An Introduction to Variational Autoencoders. |
11 | Mar.25 | Generative Adversarial Learning | Slides | GAN Notebook (html), Image-to-Image Translation in PyTorch, Chapter 26 of Probabilistic Machine Learning: Advanced Topics. |
12 | Apr.01 | Guest Speaker (Apr 2, 6PM) Mike Gimelfarb: Offline RL | Slides | |
12 | Apr.01 | |||
13 | Apr.08 | |||
14 | Apr.15 |
Note on slides: Many have contributed to the design of these slides. Credit goes to several members of the U of T community, and beyond, including (in the recent past, as far as we know): Florian Shkurti, Igor Gilitschenski, Lisa Zhang, Jimmy Ba, Bo Wang, Roger Grosse, and us.
These are the main components of the course. The details are described below. You need to use MarkUs to submit your solutions (a UTORid is required).
Item | Weight |
---|---|
Mathematical Homeworks * (2) | 8% (4% each) |
Programming Homeworks * (8) | 32% (4% each) |
Readings (5) | 10% (2% each) |
Project Proposal ** | 10% |
Project Report and Code ** | 20% |
Take-Home Test | 20% |
* Individual or in groups of two (2).
** In groups of three or four (3-4).
This is a tentative schedule of the homework assignments, and the exact dates may change according to the progress of the course. You have at least one week after the release date to submit your solutions.
Tutorial/Assignment | Post Date | Due Date | Handouts |
---|---|---|---|
Lab 1: Linear Models | Jan 16 | Jan 26 | Lab 01 |
Math 1: | Jan 16 | Jan 26 | Math HW 01 |
Lab 2: Multi-Layer Perceptrons with MedMNIST | Jan 23 | Feb 9 | Lab 02 * |
Lab 3: Word Embeddings | Jan 30 | Feb 9 | Lab 03 * |
Project Proposal | Feb 12 | Feb 27 | Project (See Research Project) |
Lab 4: Differential Privacy | Feb 13 | Mar 1 | Lab 04 |
Math 2: | Feb 29 | Mar 11 | Math HW 02 |
Lab 5: Transfer Learning and Descent | Mar 05 | Mar 15 | Lab 05 |
Lab 6: GradCAM and Input Gradients | Mar 12 | Mar 22 | Lab 06 |
Paper Reading Assignments (first two) | Jan 19 | Mar 15 | |
Lab 7: Text Classification using RNNs | Mar 19 | Mar 30 | Lab 07 |
Paper Reading Assignments (last three) | Jan 19 | Apr 2 | |
Take-Home Test | Apr 2 | Apr 4 | PDF Posted on Quercus – LaTeX |
Lab 8: Text Generation with Transformers | Mar 26 | Apr 5 | Lab 08 |
Project Report & Code | Mar 26 | Apr 19 | |
* These two handouts were flipped. Please submit “Word Embeddings” (`lab02_sol.ipynb`) to Lab02 on MarkUs and “MLPs with MedMNIST” (`lab03_sol.ipynb`) to Lab03 on MarkUs.
MarkUs: markus.teach.cs.toronto.edu/2024-01/

There are ten homeworks worth 4% each: two (2) mathematical and eight (8) programming. These are to be completed individually or in groups of two (2) students, subject to the collaboration policy.
The final project component will take the form of a research paper. Students will propose (10%) and conduct an investigation and/or devise a method (20%) that leads to a report and codebase that could reasonably be submitted to an academic journal, conference, or workshop. Details about format and length will be shared at a later date. Students are expected to work in a group of three or four (3-4) subject to the collaboration policy. The project proposal is due on Feb 27, and is primarily a way for us to give you feedback on your project idea. The final report and code are due April 19. More details can be found in the project handout. We have uploaded a sample proposal rubric for your reference to give some insight into how the proposal was evaluated in previous iterations of the course. However, we emphasize that this rubric is not going to be followed exactly for our course, and is only meant to give you some ideas on how to write an effective proposal. Similarly, here is a sample report rubric.
Sign-up is in the Quercus calendar.
Project OH TA | Zoom link |
---|---|
Haonan (Simon) Duan | https://utoronto.zoom.us/j/82822970101 |
Umangi Jain | https://utoronto.zoom.us/j/84120671372 |
Amir Peimani | https://utoronto.zoom.us/j/86771705350 |
Asic Chen | https://utoronto.zoom.us/j/86886085935 |
Michael Cooper | https://utoronto.zoom.us/j/82970683470 |
In order to expose you to (and thus prepare you for) research and applications in deep learning, you are expected to read five (5) papers. Please read them and try to understand them as much as possible. It is not important that you completely understand a paper or go into the details of the proofs (if there are any), but you should put some effort into it.
After reading each paper, write a reflection that includes:
We recommend using the JMLR LaTeX style to write your reflection. These are to be completed individually, and without the assistance of AI tools.
There are two deadlines to submit to MarkUs: Mar 15 for the first two reflections, and Apr 2 for the last three.
The following papers (listed roughly chronologically) are a collection of seminal papers in DL, topics that we didn’t cover in lectures, or active research areas. You need to choose five (5) papers out of them, depending on your interest. We will post the papers as the course progresses (so check here often).
More papers may be added!
Note: This list only covers a tiny fraction of all great papers published in various ML/DL venues. Many of our favourites are not included.
This course does not have a required textbook, but the following resources can be useful:
If you need to brush up on your basic knowledge of ML, you can take a look at one of the previous offerings of an introductory ML course at the U of T. For example, CSC2515 - Fall 2022 (the content is almost the same as CSC311).
Many of the deep learning success stories in recent years rely on advances in modern GPU computing. The programming assignments here are lightweight compared to state-of-the-art deep learning models in terms of their computational requirements, but we highly recommend that you debug your models and complete the experiments on a modern GPU. Here is the list of free resources you have access to:
For the homework assignments, we will use Python, and libraries such as NumPy, SciPy, and scikit-learn. You have two options:
Use `virtualenv` or `conda` to create an environment and install the required packages. For example:

```bash
# if you are using conda
conda create --name csc413
conda activate csc413

# if you are using virtualenv
virtualenv csc413
source csc413/bin/activate

# in both cases, install packages this way
pip install scipy numpy autograd matplotlib jupyter scikit-learn
```
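Once the environment is set up, it can be worth verifying that the packages import correctly and, if you plan to train on one of the GPU resources, that a GPU is actually visible to your framework. The following is a minimal sanity-check sketch; the PyTorch/GPU part is an assumption (PyTorch is not included in the pip command above) and is only illustrative:

```python
# A minimal environment sanity check (sketch only; adjust as needed for your setup).
import matplotlib
import numpy
import scipy
import sklearn

print("numpy:", numpy.__version__)
print("scipy:", scipy.__version__)
print("scikit-learn:", sklearn.__version__)
print("matplotlib:", matplotlib.__version__)

# PyTorch is not part of the pip command above; this check only runs if you
# installed it separately (e.g., for labs you want to run on a GPU).
try:
    import torch
    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed; install it separately if a lab requires it.")
```

If the imports succeed and (where applicable) CUDA is reported as available, your environment should be ready for the homework assignments.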