UCL ELLIS

Contextual Thompson Sampling via Generation of Missing Data


Speaker

Kelly Zhang

Affiliation

Imperial College London

Date

Wednesday, 25 February 2026

Time

12:30-13:30

Location

Ground floor lecture theatre, Sainsbury Wellcome Centre, 25 Howland St, W1T 4JG

Link

https://ucl.zoom.us/j/99748820264

Event series

Jump Trading/ELLIS CSML Seminar Series

Abstract

We introduce a framework for Thompson sampling (TS) contextual bandit algorithms, in which the algorithm's ability to quantify uncertainty and make decisions depends on the quality of a generative model that is learned offline. Instead of viewing uncertainty in the environment as arising from unobservable latent parameters, our algorithm treats uncertainty as stemming from missing but potentially observable outcomes (including both future and counterfactual outcomes). If these outcomes were all observed, one could simply make decisions using an 'oracle' policy fit on the complete dataset. Inspired by this conceptualization, at each decision time, our algorithm uses a generative model to probabilistically impute missing outcomes, fits a policy using the imputed complete dataset, and uses that policy to select the next action. We formally show that this algorithm is a generative formulation of TS and establish a state-of-the-art regret bound. Notably, our regret bound depends on the generative model only through the quality of its offline prediction loss, and applies to any method of fitting the 'oracle' policy.
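The impute, fit, act loop described in the abstract can be illustrated with a small toy sketch. This is not the speaker's implementation. It assumes binary rewards, a small discrete context set, and a Beta-Bernoulli posterior predictive (a Pólya urn) as a stand-in for the generative model that the abstract says is learned offline. The names `impute_missing`, `fit_oracle_policy`, `select_action`, `run`, and the parameter `n_total` are hypothetical and chosen only for illustration.

```python
import random


def impute_missing(observed, n_total, rng, prior=(1.0, 1.0)):
    """Complete one outcome column by sampling the missing binary outcomes.

    ASSUMPTION: the generative model is a Beta-Bernoulli posterior
    predictive (Polya urn), updated after every imputed draw. The talk's
    model is learned offline; this is only a simple stand-in.
    """
    a = prior[0] + sum(observed)
    b = prior[1] + len(observed) - sum(observed)
    completed = list(observed)
    for _ in range(n_total - len(observed)):
        y = 1 if rng.random() < a / (a + b) else 0
        completed.append(y)
        a += y
        b += 1 - y
    return completed


def fit_oracle_policy(completed):
    """Fit a policy on the completed table: best mean outcome per context."""
    return {
        context: max(cols, key=lambda act: sum(cols[act]) / len(cols[act]))
        for context, cols in completed.items()
    }


def select_action(history, context, contexts, actions, n_total, rng):
    """One decision: impute missing outcomes, fit the policy, then act."""
    completed = {
        x: {
            act: impute_missing(history.get((x, act), []), n_total, rng)
            for act in actions
        }
        for x in contexts
    }
    return fit_oracle_policy(completed)[context]


def run(true_means, horizon=300, n_total=100, seed=0):
    """Simulate the loop on a toy Bernoulli contextual bandit."""
    rng = random.Random(seed)
    contexts = sorted({x for x, _ in true_means})
    actions = sorted({act for _, act in true_means})
    history = {}
    pulls = {key: 0 for key in true_means}
    for _ in range(horizon):
        x = rng.choice(contexts)
        act = select_action(history, x, contexts, actions, n_total, rng)
        reward = 1 if rng.random() < true_means[(x, act)] else 0
        history.setdefault((x, act), []).append(reward)
        pulls[(x, act)] += 1
    return pulls


if __name__ == "__main__":
    # Toy environment: the best action differs between the two contexts.
    means = {(0, 0): 0.05, (0, 1): 0.95, (1, 0): 0.95, (1, 1): 0.05}
    print(run(means))
```

In this toy setting the randomness of the imputed outcomes plays the role that a posterior draw plays in standard TS. The mean of a completed column is random while few outcomes are observed, and it concentrates as observed outcomes replace imputed ones. With a finite `n_total` this is only an approximation to sampling from the Beta posterior.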

Biography

I am an Assistant Professor at Imperial College London in the Mathematics Department (statistics section). My research interests lie at the intersection of adaptive experimentation, reinforcement learning, and statistical inference. I was previously a Postdoctoral Fellow at Columbia Business School in the Decision, Risk, and Optimization group, working with Daniel Russo and Hongseok Namkoong. I completed my PhD in computer science at Harvard University in the Statistical Reinforcement Learning Lab, where I was advised by Susan Murphy and Lucas Janson. I was supported by an NSF Graduate Fellowship during my PhD and was selected to be a Siebel Scholar in 2023.