UCL ELLIS

Non-convolutional architectures for recognition and generation


Speaker

Alexey Dosovitskiy

Affiliation

Google Brain

Date

Friday, 27 November 2020

Time

14:00-15:00

Location

Zoom

Link

Zoom

Event series

DeepMind/ELLIS CSML Seminar Series

Abstract

Slides:
https://www.dropbox.com/s/0s3sjpefn9e2pl9/Alexey_Dosovitskiy_Non_convolutional_architectures.pdf?dl=0

Convolutional networks are the workhorses of modern computer vision, thanks to their efficiency on hardware accelerators and inductive biases well suited to processing and generating images. However, ConvNets distribute compute uniformly across the input. This makes them convenient to implement and train, but it can be extremely computationally inefficient, especially on high-dimensional inputs such as video or 3D data. Moreover, representations extracted by ConvNets lack interpretability and systematic generalization. In this talk, I will present our recent work towards models that aim to avoid these shortcomings by respecting the sparse structure of the real world. On the image recognition front, we are investigating two directions: 1) architectures for learning object-centric representations either with or without supervision (Slot Attention); 2) large-scale non-convolutional models applied to real-world image recognition tasks (Vision Transformer). For image generation, we scale a recent implicit-3D-based neural rendering approach, Neural Radiance Fields, from controlled small-scale datasets to noisy large-scale real-world data (NeRF in the Wild).

Join Zoom Meeting
https://ucl.zoom.us/j/97094846920?pwd=MlYvNVZTN2llM2dZZVRpRFh5a1JHZz09

Biography