
Two generic principles in modern bandits: the optimistic principle and Thompson sampling


Speaker

Remi Munos

Affiliation

INRIA Lille

Date

Friday, 12 September 2014

Time

13:00-14:00

Location

Roberts G08 (Sir David Davies LT)

Event series

DeepMind/ELLIS CSML Seminar Series

Abstract

I will describe two principles considered in multi-armed bandits, namely the optimistic principle and Thompson sampling, and illustrate how they extend to structured bandit settings, such as linear bandits and bandits on graphs.
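As a minimal illustration of the two principles named in the abstract (not material from the talk itself), the sketch below runs UCB1, a standard instance of the optimistic principle, and Thompson sampling with Beta posteriors on a Bernoulli bandit; the arm means and horizon are arbitrary choices for the example.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Optimistic principle: play the arm with the highest upper confidence bound."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialise: play each arm once
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

def thompson(means, horizon, seed=0):
    """Thompson sampling: sample a mean from each arm's Beta posterior, play the argmax."""
    rng = random.Random(seed)
    k = len(means)
    alpha = [1.0] * k     # Beta(1, 1) uniform priors
    beta = [1.0] * k
    for _ in range(horizon):
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1.0 if rng.random() < means[arm] else 0.0
        alpha[arm] += reward          # posterior update on success
        beta[arm] += 1.0 - reward     # posterior update on failure
    return alpha, beta

# Both strategies concentrate their pulls on the best arm (mean 0.8).
counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
alpha, beta = thompson([0.2, 0.5, 0.8], horizon=2000)
```

Both methods trade off exploration and exploitation without a fixed exploration schedule: UCB1 through a confidence bonus that shrinks as an arm is sampled, Thompson sampling through posterior uncertainty that concentrates as evidence accumulates.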

Bio: Remi Munos received his PhD in Cognitive Science from EHESS, France, in 1997, and was a postdoctoral fellow at CMU from 1998 to 2000. He was then Assistant Professor in the department of Applied Mathematics at Ecole Polytechnique. In 2006 he joined the French public research institute INRIA as a Senior Researcher, where he co-led the project-team SequeL (Sequential Learning), which now gathers approximately 25 people. His research interests cover several fields of Statistical Learning, including Reinforcement Learning, Optimization, and Bandit Theory.

Slides for the talk: PDF
