Category · 1.2k presentations
Category · 3.8k presentations
Category · 14.8k presentations
Category · 491 presentations
Category · 1.3k presentations
Category · 529 presentations
Category · 3.3k presentations
Category · 599 presentations
May 3, 2021
Speaker · 0 followers
Speaker · 0 followers
Speaker · 2 followers
Speaker · 2 followers
Speaker · 2 followers
We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability. In this regime, the leading eigenvalue of the training loss Hessian hovers just above the value $2 / \text{(step size)}$, and the training loss behaves non-monotonically over short timescales, yet consistently decreases over long timescales. Since this behavior is inconsistent with several widespread presumptions in the field of optimization, our findings raise questions as to whether these presumptions are relevant to neural network training. We hope that our findings will inspire future efforts aimed at rigorously understanding optimization at the Edge of Stability.We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability. In this regime, the leading eigenvalue of the training loss Hessian hovers just above the value $2 / \text{(step size)}$, and the training loss behaves non-monotonically over short timescales, yet consistently decreases over long timescales. Since this behavior is inconsistent with several widespread presumptions in the field of optimizat…
Category · 10.8k presentations
The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, but generally referred to as deep learning. ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in the fields of artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.
Professional recording and live streaming, delivered globally.
Presentations on similar topic, category or speaker
Pierre Stock, …
Jesse Zhang, …