May 3, 2021
Recently, \cite{frankle2018lottery} demonstrated that randomly-initialized dense networks contain subnetworks that, once found, can be trained to reach test accuracy comparable to that of the trained dense network. However, finding these high-performing trainable subnetworks is expensive, requiring an iterative process of training and pruning weights. In this paper, we propose (and prove) a stronger \emph{Multi-Prize Lottery Ticket Hypothesis}: \emph{A sufficiently over-parameterized neural network with random weights contains several subnetworks (winning tickets) that (a) have comparable accuracy to a dense target network with learned weights (prize 1), (b) do not require any further training to achieve prize 1 (prize 2), and (c) are robust to extreme forms of quantization (i.e., binary weights and/or activations) (prize 3).} \noindent This provides a new paradigm for learning compact yet highly accurate binary neural networks by pruning and quantizing randomly weighted full-precision neural networks. These multi-prize tickets enjoy a number of desirable properties, including drastically reduced memory size, faster test-time inference, and lower power consumption compared to their dense and full-precision counterparts. Furthermore, we propose an algorithm for finding multi-prize tickets and test it through a series of experiments on the CIFAR-10 and ImageNet datasets. Empirical results indicate that as models grow deeper and wider, untrained multi-prize tickets start to reach similar (and sometimes even higher) test accuracy compared to their significantly larger, weight-trained, full-precision counterparts. With minimal hyperparameter tuning, our binary-weight multi-prize tickets outperform the current state of the art in binary neural networks.
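The abstract describes the underlying paradigm: freeze a randomly initialized network, learn which connections to keep, and binarize what remains, rather than training the weights themselves. As a rough illustration only (the paper's actual algorithm is not reproduced here, and names such as BinarySubnetLinear, TopKMask, and keep_fraction are assumptions for this sketch), the following PyTorch-style code scores each weight of a frozen random layer, keeps the top-scoring fraction, and binarizes the surviving weights to their sign with a single scaling factor; only the scores are trained.

# Minimal sketch of the pruning-and-quantizing paradigm described above.
# Not the authors' algorithm; all identifiers here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMask(torch.autograd.Function):
    """Binary mask over the top-k fraction of scores; straight-through gradient."""

    @staticmethod
    def forward(ctx, scores, keep_fraction):
        k = max(1, int(keep_fraction * scores.numel()))
        # Value of the k-th largest score; everything at or above it is kept.
        threshold = scores.flatten().kthvalue(scores.numel() - k + 1).values
        return (scores >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient to the scores unchanged.
        return grad_output, None


class BinarySubnetLinear(nn.Module):
    """Linear layer with frozen random weights; only the pruning scores are learned."""

    def __init__(self, in_features, out_features, keep_fraction=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)  # random, never trained
        self.scores = nn.Parameter(torch.rand(out_features, in_features))
        self.keep_fraction = keep_fraction

    def forward(self, x):
        mask = TopKMask.apply(self.scores, self.keep_fraction)
        # Binarize the retained weights to their sign, rescaled by the mean
        # magnitude of the kept weights to roughly preserve the layer's scale.
        alpha = self.weight[mask.bool()].abs().mean()
        binary_weight = alpha * torch.sign(self.weight) * mask
        return F.linear(x, binary_weight)


if __name__ == "__main__":
    layer = BinarySubnetLinear(16, 4, keep_fraction=0.5)
    optimizer = torch.optim.SGD([layer.scores], lr=0.1)
    x, target = torch.randn(8, 16), torch.randn(8, 4)
    loss = F.mse_loss(layer(x), target)
    loss.backward()
    optimizer.step()  # updates the scores only; the random weights stay fixed

In this sketch the "ticket" is defined entirely by which scores survive the top-k selection, so the resulting subnetwork is binary-weight and requires no weight training, mirroring prizes 2 and 3 of the hypothesis at a conceptual level.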
The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, but generally referred to as deep learning. ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in the fields of artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.