Speaker | Sattar Vakili
---|---
Affiliation | MediaTek Research
Date | Friday, 23 June 2023
Time | 12:00-13:00
Location | Function Space, UCL Centre for Artificial Intelligence, 1st Floor, 90 High Holborn, London WC1V 6BH
Link | https://ucl.zoom.us/j/97245943682
Event series | Jump Trading/ELLIS CSML Seminar Series
Abstract

Reinforcement Learning (RL) has shown great empirical success in various settings with complex models and large state-action spaces. However, existing analytical results typically focus on settings with a small number of state-action pairs or on simple models, such as linearly modeled state-action value functions. To derive RL policies that efficiently handle large state-action spaces with more general value functions, some recent works have explored nonlinear function approximation using kernel ridge regression. In this talk, we examine existing results in this RL setting, the analytical tools involved, their limitations, and some open problems. Moreover, we introduce a kernel-based optimistic least-squares value iteration policy that achieves order-optimal regret bounds for a common class of kernels.
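For readers unfamiliar with the main tool mentioned in the abstract, the sketch below illustrates kernel ridge regression applied to a state-action value function, together with one common form of optimism bonus (the kernel posterior standard deviation), echoing the "optimistic" ingredient of the value iteration policy referred to above. It is a minimal, self-contained illustration rather than the policy presented in the talk; the RBF kernel, the lengthscale, the regularizer `lam`, and the toy data are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) kernel between two sets of state-action feature vectors."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def kernel_ridge_fit(Z, y, lam=0.1, lengthscale=1.0):
    """Kernel ridge regression fit: alpha = (K + lam * I)^{-1} y."""
    K = rbf_kernel(Z, Z, lengthscale)
    return np.linalg.solve(K + lam * np.eye(len(Z)), y)

def kernel_ridge_predict(Z_train, alpha, Z_new, lengthscale=1.0):
    """Predicted Q-values at new state-action points: q(z) = k(z, Z_train) @ alpha."""
    return rbf_kernel(Z_new, Z_train, lengthscale) @ alpha

def optimism_bonus(Z_train, Z_new, lam=0.1, lengthscale=1.0):
    """Posterior standard deviation, a common exploration (optimism) bonus:
    sigma(z)^2 = k(z, z) - k(z, Z) (K + lam * I)^{-1} k(Z, z)."""
    K = rbf_kernel(Z_train, Z_train, lengthscale)
    k_new = rbf_kernel(Z_new, Z_train, lengthscale)                    # shape (m, n)
    solved = np.linalg.solve(K + lam * np.eye(len(Z_train)), k_new.T)  # shape (n, m)
    var = 1.0 - np.sum(k_new.T * solved, axis=0)                       # k(z, z) = 1 for RBF
    return np.sqrt(np.maximum(var, 0.0))

# Toy usage: 1-D state and 1-D action, targets from an arbitrary smooth function.
rng = np.random.default_rng(0)
Z = rng.uniform(-1.0, 1.0, size=(50, 2))      # 50 observed (state, action) pairs
y = np.sin(3.0 * Z[:, 0]) + 0.5 * Z[:, 1] + 0.05 * rng.standard_normal(50)
alpha = kernel_ridge_fit(Z, y)
Z_query = rng.uniform(-1.0, 1.0, size=(5, 2))
q_optimistic = kernel_ridge_predict(Z, alpha, Z_query) + optimism_bonus(Z, Z_query)
print(q_optimistic)
```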
Biography

Sattar Vakili is a senior AI researcher at MediaTek Research. He specializes in sequential decision-making under uncertainty, with a focus on optimization, bandit and reinforcement learning, kernel-based modeling, and neural networks. Before joining MediaTek Research, Sattar worked at Secondmind.ai, a research lab in Cambridge, UK, led by Professor Carl Rasmussen of the University of Cambridge, where he gained expertise in kernel-based and Gaussian process models. Prior to that, he was a postdoctoral researcher at Princeton University, and he earned his PhD in 2017 at Cornell University under the supervision of Professor Qing Zhao.