- MDPs and Dynamic Programming
- Temporal Difference Methods
- Policy Gradient
- Contextual Bandits
- Stochastic Bandits
- Exploration in RL (optimism)
- Concentration Inequalities
- Monte-Carlo Tree Search
- Representation Learning and RL, Hierarchical RL, Neuroscience, and more
More to be announced soon!