UCL ELLIS

Kernel-based Reinforcement Learning

Speaker	Sattar Vakili
Affiliation	MediaTek Research
Date	Friday, 23 June 2023
Time	12:00-13:00
Location	Function Space, UCL Centre for Artificial Intelligence, 1st Floor, 90 High Holborn, London WC1V 6BH
Link	https://ucl.zoom.us/j/97245943682
Event series	DeepMind/ELLIS CSML Seminar Series
Abstract	Reinforcement Learning (RL) has shown great empirical success in various settings with complex models and large state-action spaces. However, the existing analytical results typically focus on settings with a small number of state-action pairs or simple models, such as linearly modeled state-action value functions. To derive RL policies that efficiently handle large state-action spaces with more general value functions, some recent works have explored nonlinear function approximation using kernel ridge regression. In this talk, we examine existing results in this RL setting, the analytical tools behind them, their limitations, and some open problems. Moreover, we introduce a kernel-based optimistic least-squares value iteration policy that achieves order-optimal regret bounds for a common class of kernels.
Biography	Sattar Vakili is a senior AI researcher at MediaTek Research. He specializes in problems involving sequential decision-making in uncertain environments, with a focus on optimization, bandit and reinforcement learning, kernel-based modeling, and neural networks. Before joining MediaTek Research, Sattar worked at Secondmind.ai, a research lab in Cambridge, UK, led by Professor Carl Rasmussen, Cambridge University. There, he gained expertise in kernel-based and Gaussian process models. Prior to that, he was a postdoc at Princeton University, and he earned his PhD under the supervision of Professor Qing Zhao at Cornell University in 2017.