UCL ELLIS

Speaker	Edward Grefenstette
Affiliation	DeepMind
Date	Friday, 08 June 2018
Time	13:00-14:00
Location	Zoom
Link	Roberts Building G08 Sir David Davies LT
Event series	Jump Trading/ELLIS CSML Seminar Series
Abstract	Reinforcement Learning (RL) generally presupposes the availability of a possibly sparse, but primarily correct, reward signal from the environment, with which to reward an agent for behaving appropriately within the context of a task. Teaching agents to follow instructions using RL is a quintessentially multi-task problem: each instruction in a possibly combinatorially rich language corresponds to a specific task, for which there must be a reward function against which the agent will learn. This has largely limited the RL community, thus far, to forms of instruction languages (e.g. templated instructions) where families of reward functions can be specified, and individual reward functions can be generated. In this talk, I discuss a new method that will allow us to take a step towards RL "in the wild", exploring a richer set of instruction languages, and enabling us to expose agents to a rich variety of tasks without needing to perpetually design reward functions over environment states.
Biography