NOC: Reinforcement Learning

Lecture 1 - Tutorial 1 - Probability Basics 1

Lecture 2 - Tutorial 1 - Probability Basics 2

Lecture 3 - Tutorial 2 - Linear Algebra 1

Lecture 4 - Tutorial 2 - Linear Algebra 2

Lecture 5 - Introduction to RL

Lecture 6 - RL Framework and applications

Lecture 7 - Introduction to Immediate RL

Lecture 8 - Bandit Optimalities

Lecture 9 - Value function based methods

Lecture 10 - UCB1

Lecture 11 - Concentration Bounds

Lecture 12 - UCB1 Theorem

Lecture 13 - PAC Bounds

Lecture 14 - Median Elimination

Lecture 15 - Thompson Sampling

Lecture 16 - Policy Search

Lecture 17 - REINFORCE

Lecture 18 - Contextual Bandits

Lecture 19 - Full RL Introduction

Lecture 20 - Returns, Value Functions and MDPs

Lecture 21 - MDP Modelling

Lecture 22 - Bellman Equation

Lecture 23 - Bellman Optimality Equation

Lecture 24 - Cauchy Sequence and Green's Equation

Lecture 25 - Banach Fixed Point Theorem

Lecture 26 - Convergence Proof

Lecture 27 - L_π Convergence

Lecture 28 - Value Iteration

Lecture 29 - Policy Iteration

Lecture 30 - Dynamic Programming

Lecture 31 - Monte Carlo

Lecture 32 - Control in Monte Carlo

Lecture 33 - Off Policy MC

Lecture 34 - UCT

Lecture 35 - TD(0)

Lecture 36 - TD(0) Control

Lecture 37 - Q-Learning

Lecture 38 - Afterstate

Lecture 39 - Eligibility Traces

Lecture 40 - Backward View of Eligibility Traces

Lecture 41 - Eligibility Trace Control

Lecture 42 - Thompson Sampling Recap

Lecture 43 - Function Approximation

Lecture 44 - Linear Parameterization

Lecture 45 - State Aggregation Methods

Lecture 46 - Function Approximation and Eligibility Traces

Lecture 47 - LSTD and LSTDQ

Lecture 48 - LSPI and Fitted Q

Lecture 49 - DQN and Fitted Q-Iteration

Lecture 50 - Policy Gradient Approach

Lecture 51 - Actor Critic and REINFORCE

Lecture 52 - REINFORCE (cont'd)

Lecture 53 - Policy Gradient with Function Approximation

Lecture 54 - Hierarchical Reinforcement Learning

Lecture 55 - Types of Optimality

Lecture 56 - Semi Markov Decision Processes

Lecture 57 - Options

Lecture 58 - Learning with Options

Lecture 59 - Hierarchical Abstract Machines

Lecture 60 - MAXQ

Lecture 61 - MAXQ Value Function Decomposition

Lecture 62 - Option Discovery

Lecture 63 - POMDP Introduction

Lecture 64 - Solving POMDP