UCL ELLIS

Weakly Supervised Learning from Videos


Speaker

Jean-Baptiste Alayrac

Affiliation

DeepMind

Date

Friday, 11 January 2019

Time

13:00-14:00

Location

Roberts G08

Link

Zoom

Event series

Jump Trading/ELLIS CSML Seminar Series

Abstract

In this talk, I will introduce and motivate the importance of weak supervision for computer vision, especially in the context of video understanding. I will then illustrate it on two challenging video tasks. The first one aims to learn the sequence of actions required to achieve complex human tasks (such as 'changing a car tire') only from narrated instructional videos [1,2]. The second one concerns jointly modeling manipulation actions with their effects on the state of objects (such as 'full/empty cup') [3]. Finally, I will conclude my talk by discussing some open challenges associated with weakly supervised learning, including learning from large-scale datasets [4,5] and how to use weak supervision in the context of deep learning.

References:

[1] Unsupervised Learning from Narrated Instruction Videos, Alayrac et al., CVPR16
[2] Learning from Narrated Instruction Videos, Alayrac et al., TPAMI17
[3] Joint Discovery of Object States and Manipulation Actions, Alayrac et al., ICCV17
[4] Learning from Video and Text via Large-Scale Discriminative Clustering, Miech, Alayrac et al., ICCV17
[5] Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs, Alayrac et al., ICML16
[6] DIFFRAC: A Discriminative and Flexible Framework for Clustering, Bach and Harchaoui, NIPS07

Biography