Abstract: Value functions are a core component of reinforcement learning (RL) systems. The main idea is to construct a single function approximator that estimates the long-term reward from any state. We introduce universal value function approximators (UVFAs) that generalise not just over states but also over goals. We develop an efficient technique for supervised learning of UVFAs, by factoring observed values into separate embedding vectors for state and goal, and then learning a mapping from state and goal to these factored embedding vectors. We show how this technique may be incorporated into a reinforcement learning algorithm that updates the UVFA solely from observed rewards. Finally, we demonstrate that a UVFA can successfully generalise to previously unseen goals, and can be scaled to complex RL problems such as learning to play Ms Pac-Man from pixels.

Bio: Tom Schaul is a senior researcher at Google DeepMind in London, interested in robust, general-purpose learning algorithms. He believes that progress is possible on general AI, and that games are the perfect benchmark domain for it. Tom did his PhD with Jürgen Schmidhuber at IDSIA and his postdoc with Yann LeCun at NYU. Since 2008, he has published 40 papers on reinforcement learning, neural networks, artificial curiosity, evolution and other optimization algorithms.
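The abstract describes factoring values into separate embedding vectors for state and goal. Below is a minimal sketch of what such a factored architecture might look like, assuming a PyTorch-style two-stream model where the value estimate is the dot product of the two embeddings; the class name, layer sizes, and dimensions are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the authors' code): a two-stream value function
# V(s, g) ≈ φ(s)·ψ(g), with learned embeddings φ for states and ψ for goals.
# Assumes PyTorch; all names and sizes below are hypothetical.
import torch
import torch.nn as nn

class UVFASketch(nn.Module):
    def __init__(self, state_dim, goal_dim, embed_dim=64):
        super().__init__()
        # φ: embeds the state into a vector
        self.state_embed = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))
        # ψ: embeds the goal into a vector of the same size
        self.goal_embed = nn.Sequential(
            nn.Linear(goal_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))

    def forward(self, state, goal):
        # The value estimate is the dot product of the two embeddings,
        # so the approximator generalises over both states and goals.
        phi = self.state_embed(state)
        psi = self.goal_embed(goal)
        return (phi * psi).sum(dim=-1)

# Usage: batched states and goals give a batch of value estimates.
model = UVFASketch(state_dim=10, goal_dim=4)
values = model(torch.randn(32, 10), torch.randn(32, 4))  # shape (32,)
```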