AI Seminar: Efficient exploration in sequential decision making problems
Yasin Abbasi-Yadkori, researcher at VinAI.
I will discuss recent results in designing more adaptive bandit algorithms. Our first approach is based on the bootstrap method and leads to a more efficient, data-dependent algorithm for the multi-armed bandit problem. Our second approach is a model-selection method for bandit problems. For example, when the reward function is largely independent of the contexts, the method automatically converges to the simpler and more efficient non-contextual algorithm.
This seminar is part of the AI Seminar Series organised by SCIENCE AI Centre. The series highlights advances and challenges in research within Machine Learning, Data Science, and AI. Like the AI Centre itself, the seminar series has a broad scope, covering new methodological contributions, ground-breaking applications, and impacts on society.