Machine learning (ML) is a set of techniques that allow computers to learn from data and experience, rather than requiring humans to specify the desired behaviour manually. This course focuses on Neural Network (NN) models and the Deep Learning (DL) approach to designing ML systems. NNs and DL have become increasingly central both in AI as an academic field and in industry. This course gives an overview of both the foundational ideas and the recent advances in neural network algorithms.
By the end of this course, students are expected to understand the intuition behind many NN architectures and DL algorithms. Through homework assignments, they will also learn how to implement these methods and use them to solve machine learning problems.
More details can be found in the syllabus and on Piazza.
Instructors
Lead TA: Claas Voelcker
TAs: TBA
Contact: csc413-2024-01@cs.toronto.edu
(Do not contact the instructors or TAs individually.)
Forum: Piazza
Submissions: MarkUs
Lecture | Time | Location | Instructor |
---|---|---|---|
LEC 0101/2101 | R1-4 | SF | Amir-massoud |
LEC 0201/2001 | T1-4 | MC | Amanjit |
LEC 5101/2501 | T6-9 | BA | Rupert |
Tutorials are in the third hour of each lecture. See ACORN for exact rooms.
This is a tentative schedule, especially the parts related to tutorials and the assignments.
Week | Date | Lecture Topic | Material | Suggested Reading |
---|---|---|---|---|
01 | Jan.08 | Introduction and Review of Linear Models | Slides | |
02 | Jan.15 | Multi-layer Feedforward NN and Backpropagation | Slides | |
03 | Jan.22 | Automatic differentiation, distributed representation, and GloVe | Slides | |
04 | Jan.29 | Convolutional Neural Networks | Slides | |
05 | Feb.05 | Optimization | Slides | Roger Grosse’s notes on Optimization, Christian Perone’s slides on Gradient-based Optimization, More on Momentum-based gradient descent, Distill: Why Momentum Really Works, Sebastian Ruder’s blog-post on gradient descent. |
06 | Feb.12 | More on Optimization & Generalization | Slides | Lilian Weng’s blog-post on the link between overfitting and generalization in deep learning. |
– | Feb.19 | (Reading Week) | ||
07 | Feb.26 | CNN Features Visualization and Adversarial Examples | Slides | Distill: Feature Visualization, Visualizing and Understanding Convolutional Networks, Sanity Checks for Saliency Maps. |
08 | Mar.04 | Recurrent Neural Networks | Slides | Jürgen Schmidhuber’s Tutorial on LSTM |
09 | Mar.11 | Attention Mechanism | Slides | Distill: Attention and Augmented Recurrent Neural Networks, The Annotated Transformer, see also repo: step-by-step implementation of Transformers for seq2seq tasks. |
10 | Mar.18 | Generative Models: (Variational) Autoencoders | Slides | Autoencoder Notebook, Deep Dive: An Introduction to Variational Autoencoders. |
11 | Mar.25 | Generative Adversarial Learning | Slides | GAN Notebook (html), Image-to-Image Translation in PyTorch, Chapter 26 of Probabilistic Machine Learning: Advanced Topics. |
12 | Apr.01 | Guest Speaker (Apr 2, 6PM) Mike Gimelfarb: Offline RL | Slides | |
12 | Apr.01 | |||
13 | Apr.08 | |||
14 | Apr.15 |
Note on slides: Many have contributed to the design of these slides. Credit goes to several members of the U of T community, and beyond, including (in the recent past, as far as we know): Florian Shkurti, Igor Gilitschenski, Lisa Zhang, Jimmy Ba, Bo Wang, Roger Grosse, and us.
These are the main components of the course. The details are described below. You need to use MarkUs to submit your solutions (a UTORid is required).
Item | Weight |
---|---|
Mathematical Homeworks * (2) | 8% (4% each) |
Programming Homeworks * (8) | 32% (4% each) |
Readings (5) | 10% (2% each) |
Project Proposal ** | 10% |
Project Report and Code ** | 20% |
Take-Home Test | 20% |
* Individual or in groups of two (2).
** In groups of three or four (3-4).
This is a tentative schedule of the homework assignments, and the exact dates may change according to the progress of the course. You have at least one week after the release date to submit your solutions.
Tutorial/Assignment | Post Date | Due Date | Handouts |
---|---|---|---|
Lab 1: Linear Models | Jan 16 | Jan 26 | Lab 01 |
Math 1: | Jan 16 | Jan 26 | Math HW 01 |
Lab 2: Multi-Layer Perceptrons with MedMNIST | Jan 23 | Feb 9 | Lab 02 * |
Lab 3: Word Embeddings | Jan 30 | Feb 9 | Lab 03 * |
Project Proposal | Feb 12 | Feb 27 | Project (See Research Project) |
Lab 4: Differential Privacy | Feb 13 | Mar 1 | Lab 04 |
Math 2: | Feb 29 | Mar 11 | Math HW 02 |
Lab 5: Transfer Learning and Descent | Mar 05 | Mar 15 | Lab 05 |
Lab 6: GradCAM and Input Gradients | Mar 12 | Mar 22 | Lab 06 |
Paper Reading Assignments (first two) | Jan 19 | Mar 15 | |
Lab 7: Text Classification using RNNs | Mar 19 | Mar 30 | Lab 07 |
Paper Reading Assignments (last three) | Jan 19 | Apr 2 | |
Take-Home Test | Apr 2 | Apr 4 | PDF Posted on Quercus – LaTeX |
Lab 8: Text Generation with Transformers | Mar 26 | Apr 5 | Lab 08 |
Project Report & Code | Mar 26 | Apr 19 | |
* These two handouts were flipped. Please submit “Word Embeddings” (`lab02_sol.ipynb`) to Lab02 on MarkUs and “MLPs with MedMNIST” (`lab03_sol.ipynb`) to Lab03 on MarkUs.
MarkUs: markus.teach.cs.toronto.edu/2024-01/

There are ten homeworks worth 4% each: two (2) mathematical and eight (8) programming. These are to be completed individually or in groups of two (2) students, subject to the collaboration policy.
The final project component will take the form of a research paper. Students will propose (10%) and conduct an investigation and/or devise a method (20%) that leads to a report and codebase that could reasonably be submitted to an academic journal, conference, or workshop. Details about format and length will be shared at a later date. Students are expected to work in a group of three or four (3-4) subject to the collaboration policy. The project proposal is due on Feb 27, and is primarily a way for us to give you feedback on your project idea. The final report and code are due April 19. More details can be found in the project handout. We have uploaded a sample proposal rubric for your reference to give some insight into how the proposal was evaluated in previous iterations of the course. However, we emphasize that this rubric is not going to be followed exactly for our course, and is only meant to give you some ideas on how to write an effective proposal. Similarly, here is a sample report rubric.
Sign-up is in the Quercus calendar.
Project OH TA | Zoom link |
---|---|
Haonan (Simon) Duan | https://utoronto.zoom.us/j/82822970101 |
Umangi Jain | https://utoronto.zoom.us/j/84120671372 |
Amir Peimani | https://utoronto.zoom.us/j/86771705350 |
Asic Chen | https://utoronto.zoom.us/j/86886085935 |
Michael Cooper | https://utoronto.zoom.us/j/82970683470 |
In order to expose you to (and thus prepare you for) research and applications in deep learning, you are expected to read five (5) papers. Please read them and try to understand them as much as possible. It is not important that you completely understand a paper or go into the details of the proofs (if there are any), but you should put some effort into it.
After reading each paper, write a reflection that includes:
We recommend using the JMLR LaTeX style to write your reflection. These are to be completed individually, and without the assistance of AI tools.
There are two deadlines to submit to MarkUs: Mar 15 for the first two reflections, and Apr 2 for the last three.
The following papers (listed roughly chronologically) are a collection of seminal papers in DL, topics that we didn’t cover in lectures, or active research areas. You need to choose five (5) papers out of them, depending on your interest. We will post the papers as the course progresses (so check here often).
More papers may be added!
Note: This list only covers a tiny fraction of all great papers published in various ML/DL venues. Many of our favourites are not included.
This course does not have a required textbook, but the following resources can be useful:
If you need to brush up on your basic knowledge of ML, you can take a look at one of the previous offerings of an introductory ML course at the U of T. For example, CSC2515 - Fall 2022 (the content is almost the same as CSC311).
Many of the deep learning success stories in recent years rely on advances in modern GPU computing. The programming assignments here are lightweight compared to state-of-the-art deep learning models in terms of their computational requirements, but we highly recommend that you debug your models and complete the experiments on a modern GPU. Here is the list of free resources you have access to:
For the homework assignments, we will use Python, and libraries such as NumPy, SciPy, and scikit-learn. You have two options:
Use `virtualenv` or `conda` to create an environment and install the required packages. For example:

```bash
# if you are using conda
conda create --name csc413
conda activate csc413

# if you are using virtualenv
virtualenv csc413
source csc413/bin/activate

# in both cases, install packages this way
pip install scipy numpy autograd matplotlib jupyter scikit-learn
```
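Once the environment is set up, it can be worth verifying that the packages import correctly and, if you plan to train on one of the GPU resources, that a GPU is actually visible to your framework. The following is a minimal sanity-check sketch; the PyTorch/GPU part is an assumption (PyTorch is not included in the pip command above) and is only illustrative:

```python
# A minimal environment sanity check (sketch only; adjust as needed for your setup).
import matplotlib
import numpy
import scipy
import sklearn

print("numpy:", numpy.__version__)
print("scipy:", scipy.__version__)
print("scikit-learn:", sklearn.__version__)
print("matplotlib:", matplotlib.__version__)

# PyTorch is not part of the pip command above; this check only runs if you
# installed it separately (e.g., for labs you want to run on a GPU).
try:
    import torch
    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed; install it separately if a lab requires it.")
```

If the imports succeed and (where applicable) CUDA is reported as available, your environment should be ready for the homework assignments.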