UCL ELLIS

A conversion theorem and minimax optimality for continuum contextual bandits

Speaker	Arya Akhavan
Affiliation	University of Oxford
Date	Friday, 25 April 2025
Time	12:00-13:00
Location	UCL Centre for Artificial Intelligence, 1st Floor, 90 High Holborn, London WC1V 6BH
Link	https://ucl.zoom.us/j/99748820264
Event series	Jump Trading/ELLIS CSML Seminar Series
Abstract	I will talk about the contextual continuum bandits problem, where the learner sequentially receives a side-information vector and must choose an action from a convex set, minimizing a function associated with the context. The goal is to minimize all the underlying functions for the received contexts, leading to the contextual notion of regret, which is stronger than the standard static regret. Assuming that the objective functions are γ-Hölder with respect to the contexts, for some 0<γ≤1, I will describe a reduction that shows how any algorithm achieving sub-linear static regret can be extended to achieve sub-linear contextual regret. I will present a static-to-contextual regret conversion theorem, which provides an upper bound on the contextual regret of the output algorithm as a function of the static regret of the input algorithm. Then, I will showcase the implications of this general result for three fundamental cases of how the objective function depends on the action variable: (a) Lipschitz bandits, (b) convex bandits, (c) strongly convex and smooth bandits. Lastly, I will present a minimax lower bound that implies two key facts. First, obtaining sub-linear contextual regret may be impossible for functions that are not continuous with respect to the context. Second, for convex bandits and strongly convex and smooth bandits, the algorithms we propose achieve — up to a logarithmic factor — the minimax optimal rate of contextual regret as a function of the number of queries.
Biography	I am currently a postdoctoral research associate at the University of Oxford. Previously, I spent a year as a postdoc at CMAP, École Polytechnique, Paris. I am interested in optimization in all aspects and mainly its applications in online learning. I did my PhD at Crest, ENSAE, Institut Polytechnique de Paris, and Istituto Italiano di Tecnologia (IIT), Genova, where I was fortunate to be advised by Alexandre B. Tsybakov and Massimiliano Pontil. I defended my thesis on February 3, 2023.