On the Generalization Benefit of Noise in Stochastic Gradient Descent

Jul 12, 2020

Sprecher:innen

Über

It has long been argued that minibatch stochastic gradient descent can generalize better than large batch gradient descent in deep neural networks. However recent papers have questioned this claim, arguing that this effect is simply a consequence of suboptimal hyperparameter tuning or insufficient compute budgets when the batch size is large. In this paper, we perform carefully designed experiments and rigorous hyperparameter sweeps on a range of popular models, which verify that small or moderately large batch sizes can substantially outperform very large batches on the test set. This occurs even when both models are trained for the same number of iterations and large batches achieve smaller training losses. Our results confirm that the noise in stochastic gradients can enhance generalization. We study how the optimal learning rate schedule changes as the epoch budget grows, and we provide a theoretical account of our observations based on the stochastic differential equation perspective of SGD dynamics.

Organisator

Kategorien

Über ICML 2020

The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.

Präsentation speichern

Soll diese Präsentation für 1000 Jahre gespeichert werden?

Wie speichern wir Präsentationen?

Ewigspeicher-Fortschrittswert: 0 = 0.0%

Freigeben

Empfohlene Videos

Präsentationen, deren Thema, Kategorie oder Sprecher:in ähnlich sind

Interessiert an Vorträgen wie diesem? ICML 2020 folgen