Jeremy Cohen, Simran Kaur, Yuanzhi Li, Zico Kolter, Ameet Talwalkar · Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability · SlidesLive

May 3, 2021

Speakers

Jeremy Cohen

Simran Kaur

Yuanzhi Li

About

We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability. In this regime, the leading eigenvalue of the training loss Hessian hovers just above the value $2 / \text{(step size)}$, and the training loss behaves non-monotonically over short timescales, yet consistently decreases over long timescales. Since this behavior is inconsistent with several widespread presumptions in the field of optimizat…
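The threshold $2 / \text{(step size)}$ in the abstract matches the classical stability condition for gradient descent on a quadratic. The sketch below is only a one-dimensional illustration of that condition, not the authors' experimental setup: it assumes the toy objective $f(x) = \tfrac{1}{2} a x^2$, whose Hessian is the scalar $a$ (called `sharpness` here), and the helper names `run_gd` and `stability_threshold` are hypothetical. Each update multiplies $x$ by $(1 - \eta a)$, so the iterates shrink when $a < 2/\eta$ and grow when $a > 2/\eta$.

```python
def run_gd(sharpness, step_size, x0=1.0, num_steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2; return all iterates."""
    xs = [x0]
    x = x0
    for _ in range(num_steps):
        grad = sharpness * x        # f'(x) = sharpness * x
        x = x - step_size * grad    # equivalent to x *= (1 - step_size * sharpness)
        xs.append(x)
    return xs


def stability_threshold(step_size):
    """Largest curvature a quadratic can have before gradient descent diverges."""
    return 2.0 / step_size


step_size = 0.1
threshold = stability_threshold(step_size)

# Curvature slightly below the threshold: iterates oscillate in sign but shrink.
below = run_gd(sharpness=0.9 * threshold, step_size=step_size)
# Curvature slightly above the threshold: iterates oscillate in sign and grow.
above = run_gd(sharpness=1.1 * threshold, step_size=step_size)

print(f"threshold 2/step_size = {threshold}")
print(f"|x| after 50 steps, below threshold: {abs(below[-1]):.3e}")
print(f"|x| after 50 steps, above threshold: {abs(above[-1]):.3e}")
```

On this quadratic, exceeding the threshold simply causes divergence. The abstract reports different behavior for neural network objectives: the leading Hessian eigenvalue hovers just above the threshold while the training loss still decreases over long timescales.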

Organizer

ICLR 2021

Categories

AI & Data Science

About ICLR 2021

The International Conference on Learning Representations (ICLR) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, but generally referred to as deep learning. ICLR is globally renowned for presenting and publishing cutting-edge research on all aspects of deep learning used in the fields of artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, text understanding, gaming, and robotics.
