UCL ELLIS

Weakly Supervised Learning from Videos


Speaker

Jean-Baptiste Alayrac

Affiliation

DeepMind

Date

Friday, 11 January 2019

Time

13:00-14:00

Location

Roberts G08

Link

Zoom

Event series

DeepMind/ELLIS CSML Seminar Series

Abstract

In this talk, I will introduce and motivate the importance of weak supervision for computer vision, especially in the context of video understanding. I will then illustrate this with two challenging video tasks. The first aims to learn the sequence of actions required to achieve complex human tasks (such as 'changing a car tire') from narrated instructional videos alone [1,2]. The second concerns jointly modeling manipulation actions and their effects on the state of objects (such as 'full/empty cup') [3]. Finally, I will conclude by discussing some open challenges associated with weakly supervised learning, including learning from large-scale datasets [4,5] and how to use weak supervision in the context of deep learning.
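To make the first task concrete: given per-frame affinity scores for an ordered list of steps, one can recover a temporally consistent assignment of steps to frames with a simple dynamic program. This is only a toy sketch of the ordering constraint, not the method of [1,2] (which relies on discriminative clustering over narration and video); the function name and score format are illustrative assumptions.

```python
def align_steps(scores):
    """Toy ordered alignment: scores[t][k] is the affinity between
    frame t and step k. Returns one frame index per step, strictly
    increasing in time, maximizing the total affinity via DP."""
    T, K = len(scores), len(scores[0])
    NEG = float("-inf")
    best = [[NEG] * T for _ in range(K)]   # best[k][t]: best score ending step k at frame t
    back = [[-1] * T for _ in range(K)]    # backpointers for recovery
    for t in range(T):
        best[0][t] = scores[t][0]
    for k in range(1, K):
        run_best, run_arg = NEG, -1        # running max of best[k-1][t'] for t' < t
        for t in range(k, T):
            if best[k - 1][t - 1] > run_best:
                run_best, run_arg = best[k - 1][t - 1], t - 1
            if run_best > NEG:
                best[k][t] = scores[t][k] + run_best
                back[k][t] = run_arg
    # Backtrack from the best final frame of the last step.
    t = max(range(T), key=lambda i: best[K - 1][i])
    path = [t]
    for k in range(K - 1, 0, -1):
        t = back[k][t]
        path.append(t)
    return path[::-1]
```

For example, with two steps over four frames, the DP picks the high-affinity frame for each step while respecting temporal order: `align_steps([[5, 0], [0, 1], [1, 6], [0, 2]])` returns `[0, 2]`.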

References:

[1] Unsupervised Learning from Narrated Instruction Videos, Alayrac et al., CVPR 2016
[2] Learning from Narrated Instruction Videos, Alayrac et al., TPAMI 2017
[3] Joint Discovery of Object States and Manipulation Actions, Alayrac et al., ICCV 2017
[4] Learning from Video and Text via Large-Scale Discriminative Clustering, Miech, Alayrac et al., ICCV 2017
[5] Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs, Alayrac et al., ICML 2016
[6] DIFFRAC: a Discriminative and Flexible Framework for Clustering, Bach and Harchaoui, NIPS 2007

Biography