Why bigger is not always better: on finite and infinite neural networks

Jul 12, 2020



Recent work has shown that the outputs of convolutional neural networks become Gaussian process (GP) distributed when we take the number of channels to infinity. In principle, these infinite networks should perform very well, both because they allow for exact Bayesian inference, and because widening networks is generally thought to improve (or at least not diminish) performance. However, Bayesian infinite networks perform poorly in comparison to finite networks, and our goal here is to explain this discrepancy. We note that the high-level representation induced by an infinite network has very little flexibility; it depends only on network hyperparameters such as depth, and as such cannot learn a good high-level representation of data. In contrast, finite networks correspond to a rich prior over high-level representations, corresponding to a mixture of GPs with different kernel hyperparameters. We analyse this flexibility from the perspective of the prior (looking at the structured prior covariance of the top-level kernel), and from the perspective of the posterior, showing that the representation in a learned, finite deep linear network slowly transitions from the kernel induced by the inputs towards the kernel induced by the outputs, both for gradient descent, and for Langevin sampling. Finally, we explore representation learning in deep, convolutional, nonlinear networks, showing that learned representations differ dramatically from the corresponding infinite network.
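The abstract's central claim rests on the infinite-width kernel being fixed once the hyperparameters are chosen. A minimal sketch of this (not the authors' code; a standard NNGP recursion for a fully connected ReLU network, using the arc-cosine kernel closed form) makes the point concrete: the kernel below is fully determined by depth and the weight/bias variances, with nothing left to learn from data.

```python
import numpy as np

def relu_nngp_kernel(X, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Kernel of an infinitely wide ReLU network (NNGP recursion).

    The result depends only on hyperparameters (depth, weight and
    bias variances) -- there are no learnable parameters, which is
    the inflexibility of infinite networks discussed above.
    """
    d = X.shape[1]
    # Input-layer kernel
    K = sigma_w2 * (X @ X.T) / d + sigma_b2
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        norm = np.outer(diag, diag)
        cos_t = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_t)
        # Closed-form expectation of ReLU(f) ReLU(f') under the GP
        K = sigma_b2 + (sigma_w2 / (2 * np.pi)) * norm * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
    return K

X = np.array([[1.0, 0.0], [0.0, 1.0]])
K = relu_nngp_kernel(X, depth=3)
```

In a finite network, by contrast, the learned weights reshape this kernel during training; in the infinite limit the only way to change the representation is to change `depth`, `sigma_w2`, or `sigma_b2` by hand.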



About ICML 2020

The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.
