UCL ELLIS
UCL, a global leader in AI and machine learning, has joined the ELLIS network with a new ELLIS Unit. ELLIS is a European AI network of excellence comprising Units at 30 research institutions. It focuses on fundamental science, technical innovation and societal impact. The ELLIS Unit at UCL spans multiple departments (Gatsby Computational Neuroscience Unit, Department of Computer Science, Department of Statistical Science and Department of Electronic and Electrical Engineering).

“Some of the most effective learning algorithms are those that combine perspectives from many different models or parameters. This has always seemed a fitting metaphor for effective research. And now ELLIS will provide a new architecture to keep our real-life committee machine functioning: reinforcing, deepening and enlarging the channels that connect us to colleagues throughout Europe. At UCL we're excited to be a part of this movement to grow together. We look forward to sharing new collaborations, workshops, exchanges, joint studentships and more, and to the insights and breakthroughs that will undoubtedly follow.”

Prof Maneesh Sahani
Director, Gatsby Computational Neuroscience Unit

“Advances in AI that benefit people and planet require global cooperation across disciplines and sectors. The ELLIS network is a vital part of that effort, and UCL is proud to be a contributor.”

Prof Geraint Rees
UCL Pro-Vice-Provost (AI)

News


Events


Feature Learning in Infinite-Width Neural Networks

Speaker: Greg Yang
Event Date: 26 February 2021

Abstract: As its width tends to infinity, a deep neural network's behavior under gradient descent can become simplified and predictable (e.g. given by the Neural Tangent Kernel (NTK)), if it is parametrized appropriately (e.g. the NTK parametrization). However, we show that the standard and NTK parametrizations of a neural network do not admit infinite-width limits that can learn representations (i.e. features), which is crucial for pretraining and transfer learning such as with BERT. We propose simple modifications to the standard parametrization to allow for feature learning in the limit. Using the Tensor Programs technique, we derive explicit formulas for such limits. On Word2Vec and few-shot learning on Omniglot via MAML, two canonical tasks that rely crucially on feature learning, we compute these limits exactly. We find that they outperform both NTK baselines and finite-width networks, with the latter approaching the infinite-width feature learning performance as width increases.

More generally, we classify a natural space of neural network parametrizations that generalizes standard, NTK, and Mean Field parametrizations. We show 1) any parametrization in this space either admits feature learning or has an infinite-width training dynamics given by kernel gradient descent, but not both; 2) any such infinite-width limit can be computed using the Tensor Programs technique.

This work is based on https://arxiv.org/abs/2011.14522.
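For a rough sense of what "parametrization" means in this context, the sketch below (assumed NumPy code, not taken from the talk or paper) contrasts the standard and NTK parametrizations of a single fully connected layer. Both give identically distributed activations at initialisation; they differ in how the width-scaling is carried (in the weight variance versus in an explicit output rescaling), which is what determines the infinite-width training behaviour the abstract refers to.

    # Illustrative sketch only: contrasting the "standard" and NTK
    # parametrizations of one hidden layer as the width n grows.
    # Not code from the talk or the paper.
    import numpy as np

    def hidden_layer(x, n, rng, parametrization="standard"):
        d = x.shape[0]
        if parametrization == "standard":
            # Standard parametrization: weight variance shrinks with fan-in.
            W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n, d))
            return W @ x
        if parametrization == "ntk":
            # NTK parametrization: unit-variance weights, output rescaled
            # by 1/sqrt(fan-in).
            W = rng.normal(0.0, 1.0, size=(n, d))
            return (W @ x) / np.sqrt(d)
        raise ValueError(parametrization)

    rng = np.random.default_rng(0)
    x = rng.normal(size=16)
    for n in (128, 1024, 8192):
        h_std = hidden_layer(x, n, rng, "standard")
        h_ntk = hidden_layer(x, n, rng, "ntk")
        # At initialisation the two activations have the same distribution;
        # the parametrizations differ in how gradient updates scale with
        # width, which governs whether the limit can learn features.
        print(n, h_std.std(), h_ntk.std())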

Principles for Tackling Distribution Shift: Pessimism, Adaptation, and Anticipation

Speaker: Chelsea Finn
Event Date: 19 February 2021

Abstract: While we have seen immense progress in machine learning, a critical shortcoming of current methods lies in handling distribution shift between training and deployment. Distribution shift is pervasive in real-world problems ranging from natural variation in the distribution over locations or domains, to shift in the distribution arising from different decision making policies, to shifts over time as the world changes. In this talk, I’ll discuss three general principles for tackling these forms of distribution shift: pessimism, adaptation, and anticipation. I’ll present the most general form of each principle before providing concrete instantiations of using each in practice. This will include a simple method for substantially improving robustness to spurious correlations, a framework for quickly adapting a model to a new user or domain with only unlabeled data, and an algorithm that enables robots to anticipate and adapt to shifts caused by other agents.
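As a rough illustration of what "adapting a model to a new domain with only unlabeled data" can look like in its simplest form, the sketch below (assumed NumPy code, a generic example rather than the framework described in the talk) re-estimates a fixed classifier's input-normalisation statistics from an unlabelled batch drawn after distribution shift, while keeping the learned weights unchanged.

    # Illustrative sketch only: one very simple instance of the "adaptation"
    # principle. A fixed classifier's normalisation statistics are
    # re-estimated from unlabelled deployment data.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy "pretrained" pipeline: standardise inputs with training
    # statistics, then apply a fixed linear decision rule.
    train_x = rng.normal(loc=0.0, scale=1.0, size=(1000, 8))
    mu_train, sd_train = train_x.mean(axis=0), train_x.std(axis=0)
    w = rng.normal(size=8)  # stands in for learned weights

    def predict(x, mu, sd):
        z = (x - mu) / sd  # normalise with the supplied statistics
        return (z @ w > 0).astype(int)

    # Deployment data arrives from a shifted distribution.
    test_x = rng.normal(loc=2.0, scale=3.0, size=(200, 8))

    # Adaptation: recompute normalisation statistics from the unlabelled
    # test batch, keeping the classifier weights fixed.
    mu_test, sd_test = test_x.mean(axis=0), test_x.std(axis=0)

    before = predict(test_x, mu_train, sd_train)
    after = predict(test_x, mu_test, sd_test)
    print("fraction of predictions changed by adaptation:",
          np.mean(before != after))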

Bio: Chelsea Finn is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Finn's research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has included deep learning algorithms for concurrently learning visual perception and control in robotic manipulation skills, inverse reinforcement methods for scalable acquisition of nonlinear reward functions, and meta-learning algorithms that can enable fast, few-shot adaptation in both visual perception and deep reinforcement learning. Finn received her Bachelor's degree in Electrical Engineering and Computer Science at MIT and her PhD in Computer Science at UC Berkeley. Her research has been recognized through the ACM Doctoral Dissertation Award, the Microsoft Research Faculty Fellowship, the C.V. Ramamoorthy Distinguished Research Award, and the MIT Technology Review 35 under 35 Award, and her work has been covered by various media outlets, including the New York Times, Wired, and Bloomberg. Throughout her career, she has sought to increase the representation of underrepresented minorities within CS and AI by developing an AI outreach camp at Berkeley for underprivileged high school students and a mentoring program for underrepresented undergraduates across four universities, and by leading efforts within the WiML and Berkeley WiCSE communities of women researchers.

People


Computer Science

Gatsby Computational Neuroscience Unit

Department of Statistical Science

Department of Electronic and Electrical Engineering