Introduction to Reinforcement Learning

A course on reinforcement learning.

Introduction to Reinforcement Learning (Fall 2025)

You can find the Spring 2021 version of this course here.

This is an introductory course on reinforcement learning (RL) and sequential decision-making under uncertainty, with an emphasis on understanding the theoretical foundations. We study how dynamic programming methods such as value and policy iteration can be used to solve sequential decision-making problems with known models, and how those approaches can be extended to solve reinforcement learning problems, where the model is unknown. Other topics include, but are not limited to, function approximation in RL, policy gradient methods, model-based RL, and balancing the exploration-exploitation trade-off. The course will be delivered as a mix of lectures and readings of classic and recent papers assigned to students. As the emphasis is on understanding the foundations, you should expect to work through mathematical details and proofs. Required background includes being comfortable with probability theory and statistics, calculus, linear algebra, optimization, and (supervised) machine learning.
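To make the two regimes concrete, below is a minimal, self-contained sketch (Python/NumPy) that solves a made-up three-state MDP twice: once by value iteration with the model (P, R) in hand, and once by tabular Q-learning from sampled transitions only. Everything here, the toy MDP, the hyperparameters, and the variable names, is an illustrative assumption, not course code.

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up 3-state, 2-action MDP (illustrative only, not from the course).
n_states, n_actions, gamma = 3, 2, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # R[s, a]

# Planning with a known model: value iteration.
# Repeatedly apply the Bellman optimality operator
#   (T V)(s) = max_a [ R(s, a) + gamma * sum_{s'} P(s' | s, a) V(s') ].
V = np.zeros(n_states)
for _ in range(1000):
    Q_star = R + gamma * P @ V            # action-values under the current V
    V_new = Q_star.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop when the sup-norm change is tiny
        break
    V = V_new

# Learning with an unknown model: tabular Q-learning.
# The agent only sees sampled transitions (s, a, r, s'), never P or R directly.
Q = np.zeros((n_states, n_actions))
alpha, eps, s = 0.1, 0.1, 0
for _ in range(50_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
    s_next = rng.choice(n_states, p=P[s, a])  # environment draws s' ~ P(. | s, a)
    r = R[s, a]
    # Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("value iteration V*: ", np.round(V_new, 3))
print("Q-learning  max_a Q:", np.round(Q.max(axis=1), 3))
```

The two estimates should roughly agree; the only structural difference is where the expectation over next states comes from: computed exactly from P in the planning loop versus estimated from samples in the learning loop.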


Announcements:


Teaching Staff

Time and Location:

Reading

The course material is based on Foundations of Reinforcement Learning. This is a live document that will change as we progress through the course. If you find a typo or mistake, please let me know. I collect the list of reported ones here.

Some other useful textbooks (incomplete list):

Lectures

This is a tentative schedule and will change adaptively.

Note on videos: The videos will be publicly available on YouTube. If you don’t feel comfortable being recorded, make sure to turn off your camera when asking questions (though I really prefer to see all of your faces when presenting a lecture, so it doesn’t feel like I am talking to a void!).


| Week (date) | Topics | Lectures | Reading |
| --- | --- | --- | --- |
| 1 (Aug 25) | Introduction to Reinforcement Learning (Part I) | slides (Intro), video (Intro), slides (Intro – annotated – Part I) | Chapter 1 of FRL |
| 1’ (Sept 1) | (No lecture) Tutorials: Math Background Review (Probability, Linear Algebra, Optimization) | | |
| 2 (Sept 8) | Introduction to Reinforcement Learning (Part II); Tutorial: Q-Learning | slides (Intro – annotated – Part II), Tutorial: Q-Learning (incomplete), Tutorial: Q-Learning (complete) | |
| 3 (Sept 15) | Structural Properties of Markov Decision Processes (Part I); Tutorial (online): PyTorch | slides (MDP), video (MDP – Part I), video (MDP – Part II), slides (MDP – annotated), Tutorial: PyTorch, Tutorial: TorchRL | Chapter 2 of FRL |
| 4 (Sept 22) | Structural Properties of Markov Decision Processes (Part II) | | |
| 5 (Sept 29) | Planning with a Known Model (Part I); Tutorial (online): RL Environments | slides (Planning), video (Planning), slides (Planning – annotated), RL Environments | Chapter 3 of FRL |
| 6 (Oct 6) | Planning with a Known Model (Part II); Learning from a Stream of Data (Part I) (Friday) | slides (Learning from Stream), video (Learning from Stream – Part I), video (Learning from Stream – Part II), slides (Learning from Stream – annotated) | Chapter 4 of FRL |
| 7 (Oct 20) | Learning from a Stream of Data (Part II); Tutorial: RL in Brain | Tutorial: A Neural Substrate of Prediction and Reward | |
| 8 (Oct 27) | Value Function Approximation (Part I) | slides (VFA), video (VFA – Part I), video (VFA – Part II), video (VFA – Part III) | Chapter 5 of FRL |
| 9 (Nov 3) | Value Function Approximation (Part II) | | |
| 10 (Nov 10) | Value Function Approximation (Part III) | | |
| 11 (Nov 17) | Policy Search Methods (Part I); Policy Search Methods (Part II – Guest Lecture) | slides (PS), video (PS – Part I), video (PS – Part II) | Chapter 6 of FRL |
| 12 (Nov 24) | Model-based RL | slides | Chapter 7 of FRL |
| 13 (Dec 1) | Presentations | | |

Assignments and Coursework

These are the main components of the course. The details are described below. You need to use … to submit your solutions.

Homework Assignments

There will be five homework assignments. Your grade will be the average of your top four. Details will be posted.

This is a tentative schedule of the homework assignments. We release each homework as soon as the corresponding lecture is finished, and it will be due two weeks after that. The deadline is 16:59. The exact dates may change depending on the pace of lectures.

| Homework # | Out | Due | Materials | TA Office Hours |
| --- | --- | --- | --- | --- |
| Homework 1 | Sept 23 | Oct 7 | Questions | Friday (Sept 26, Oct 3) |
| Homework 2 | Oct 21 | Nov 5 | Questions, Code | Friday (Oct 24, Oct 31) |
| Homework 3 | Nov ~10 | Nov ~24 | Questions, Code | |
| Homework 4 | Nov ~21 | Dec ~5 | Questions, Code | |

Research Project

Read the instructions here!

Reading Assignments

The following papers are a combination of classic papers in RL, topics that we did not cover in the lectures, and active research areas. You need to choose five (5) papers from the list, depending on your interests. Please read them and try to understand them as much as possible. It is not essential that you understand a paper completely or go through the details of its proofs (if there are any), but you should put some effort into it.

After reading each paper:

It is OK to discuss a paper with a Language Model (LM) after you have read it yourself, but avoid using an LM to write your summary. The act of writing itself tremendously helps you understand the paper better; if you delegate it to a machine, you lose the opportunity to learn and solidify your understanding. I also suggest that you avoid using an LM to come up with ideas: this is a practice for your creativity, not a machine’s.

These five assignments contribute 10% to your final mark; the reading assignments are only lightly evaluated. You should submit the summaries in two batches. The first batch should include two papers and is due on November 21st. The second batch should include the other three papers and is due on December 12th. In both cases, submit a single PDF file before 5PM.

We will post the papers as the course progresses. Please read and summarize them as we post them, so you won’t have a large workload close to the end of the semester.

Note that this is an incomplete and biased list; I have many favourite papers that are not included in this short list.

Legend: