UCL ELLIS

Optimistic Algorithms for Online Learning in Structured Decision Problems

Speaker	Csaba Szepesvari
Affiliation	University of Alberta, Canada
Date	Friday, 17 April 2015
Time	11:00-12:00
Location	Malet Place Engineering Building 1.03
Event series	Jump Trading/ELLIS CSML Seminar Series
Abstract	I will describe two highly structured online stochastic learning problems. In both, the structure allows us to derive effective optimistic algorithms. In the first setting, the problem is to learn when to stop waiting for the arrival of some recurring event, such as learning the optimal disk spin-down time for mobile computers. This is a partial-information feedback problem with a continuous, unbounded action space and a discontinuous loss function. Yet the loss has other properties that can be used to design effective algorithms. In the second problem, the learning agent must distribute available resources among some jobs to maximize the number of completed jobs. Allocating more resources to a given job increases the probability that the job completes, but with a cut-off. The difficulty of each job is unknown initially. Again, I show that the problem's structure allows for an efficient and effective algorithm, which adapts to the actual difficulty of the problem (the regret ranges from polylogarithmic to polynomial).
Biography	Csaba Szepesvari gained his PhD in 1999 from "Jozsef Attila" University, Szeged, Hungary, and is currently an Associate Professor at the Department of Computing Science of the University of Alberta and a principal investigator of the Alberta Ingenuity Center for Machine Learning. He also has extensive experience in the software industry. He is the coauthor of a book on nonlinear approximate adaptive controllers and the author of a book on reinforcement learning, has published over 80 peer-reviewed journal and conference papers, serves as the Associate Editor of IEEE Transactions on Adaptive Control and AI Communications, and is a member of the program committee at various machine learning and AI conferences. His areas of expertise include statistical machine learning, Markovian decision processes, reinforcement learning and nonlinear control.