Schedule
Event | Date | Description | Course Material
Lecture | 08/23/2022, Tuesday | Introduction | Suggested Readings:
-
Lecture | 09/01/2022, Thursday | Basic Concepts in Reinforcement Learning [slides]
Lecture | 09/06/2022, Tuesday | Multi-Armed Bandits [slides]
Assignment | 09/20/2022, Tuesday | MP #1 - Bandit Algorithms released!
Lecture | 09/22/2022, Thursday | Markov Decision Process [slides]
Lecture | 09/22/2022, Thursday | Dynamic Programming [slides]
Due | 09/22/2022 23:59, Thursday | Project Idea Post Due
- Required Content: A sales pitch for your course project idea, especially what excites you about it.
- Purpose: Find teammates who share your passion and have complementary skills.
- Where to Submit: Post on Piazza.
Lecture | 09/27/2022, Tuesday | Monte Carlo Methods [slides]
Quiz | 09/27/2022, Tuesday | Quiz #1 - Bandit & RL Basics [solution] | This quiz covers essential concepts in multi-armed bandits and the basics of reinforcement learning.
Due | 10/04/2022 23:59, Tuesday | Assignment #1 due
Due | 10/07/2022 23:59, Friday | Project Proposal Due
- Required Template: Use the latest ACM LaTeX template for your project proposal. Of the templates ACM provides, use either the two-column “sigconf” version or the single-column “acmlarge” version.
- Maximum Length: 4 pages, excluding references and appendix.
- Where to Submit: A Collab submission page will be created. Each group needs to submit only one proposal to Collab. Name your submission “computingID[+computingID]*-proposal.pdf”, for example, “hw5x-cl5ev-proposal.pdf”.
Lecture | 10/13/2022, Thursday | Temporal-Difference Learning [slides]
Assignment | 10/20/2022, Thursday | MP #2 - Markov Decision Process released!
Assignment | 10/22/2022, Saturday | ICLR 2023 Review Assignment released!
Quiz | 10/27/2022, Thursday | Quiz #2 - DP & MC [solution] | This quiz covers essential concepts in dynamic programming and Monte Carlo methods.
Lecture | 11/01/2022, Tuesday | Policy Gradient Methods [slides]
Due | 11/03/2022 23:59, Thursday | Assignment #2 due
Due | 11/11/2022 23:59, Friday | ICLR 2023 review due
Assignment | 11/17/2022, Thursday | MP #3 - Policy Gradient Methods released!
Lecture | 11/22/2022, Tuesday | Approximation Methods [slides]
Lecture | 11/28/2022, Monday | Deep Reinforcement Learning [slides]
Quiz | 11/29/2022, Tuesday | Quiz #3 - TD & PG [solution] | This quiz covers essential concepts in temporal-difference and policy gradient methods.
Due | 12/01/2022 23:59, Thursday | Assignment #3 due
Due | 12/12/2022 10:30, Monday | Project Presentation
- Presentation Location: Rice 340.
- Presentation Length: At most 15 minutes, including Q&A, given in person.
- Presentation Format: Any format you prefer, such as PowerPoint slides or a live demonstration.
Quiz | 12/12/2022, Monday | Quiz #4 - DRL & Offline RL [solution] | This quiz covers essential concepts in deep reinforcement learning and offline reinforcement learning.
Due | 12/15/2022 22:59, Thursday | Project Report Due
- Required Template: Use the same template that you used for your project proposal.
- Maximum Length: 8 pages, excluding references and appendix.
- Where to Submit: A Collab submission page will be created. Each group needs to submit only one report to Collab. Name your submission “computingID[+computingID]*-report.pdf”, for example, “hw5x-cl5ev-report.pdf”.
Lecture | 12/16/2022, Friday | Offline Reinforcement Learning | Suggested Readings:
- Chapter 11: Off-policy Methods with Approximation
- NeurIPS 2020 Tutorial on Offline RL
- Offline reinforcement learning: Tutorial, review, and perspectives on open problems
- Doubly robust policy evaluation and learning
- Offline Reinforcement Learning as One Big Sequence Modeling Problem
- Is Pessimism Provably Efficient for Offline RL?