Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion · SGD with large step sizes learns sparse features · SlidesLive

SGD with large step sizes learns sparse features

Jul 24, 2023

Speakers

Maksym Andriushchenko

Aditya Varre

Loucas Pillaud-Vivien

About

We showcase important features of the dynamics of stochastic gradient descent (SGD) in the training of neural networks. We present empirical observations that (i) commonly used large step sizes may lead the iterates to jump from one side of a valley to the other, causing loss stabilization, and (ii) this stabilization induces a hidden stochastic dynamics that implicitly biases SGD toward simple predictors. Furthermore, we show empirically that the longer large step sizes keep SGD high in the l…
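
The "jump from one side of a valley to the other" behavior can be seen in a toy setting. The sketch below is not the authors' experiment: it assumes a one-dimensional quadratic valley f(x) = 0.5 * curvature * x**2 and plain (noiseless) gradient descent, and the function names `gd_on_quadratic` and `count_sign_flips` are hypothetical helpers introduced only for illustration. On this valley each step multiplies the iterate by (1 - step_size * curvature), so a step size above 1 / curvature makes the iterate cross the minimum at every step, and a step size of exactly 2 / curvature keeps the loss constant.

```python
def gd_on_quadratic(curvature, step_size, x0, num_steps):
    """Gradient descent on the toy valley f(x) = 0.5 * curvature * x**2."""
    xs = [x0]
    for _ in range(num_steps):
        grad = curvature * xs[-1]          # f'(x) = curvature * x
        xs.append(xs[-1] - step_size * grad)
    return xs


def count_sign_flips(xs):
    """Count how often consecutive iterates land on opposite sides of the minimum."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)


def loss(x, curvature=1.0):
    return 0.5 * curvature * x ** 2


# Small step: the iterate slides down one side of the valley.
small = gd_on_quadratic(curvature=1.0, step_size=0.5, x0=1.0, num_steps=10)
# Large step: the iterate crosses the valley on every step but still shrinks.
large = gd_on_quadratic(curvature=1.0, step_size=1.5, x0=1.0, num_steps=10)
# Boundary step (2 / curvature): the iterate bounces between +1 and -1,
# so the loss neither decreases nor diverges.
boundary = gd_on_quadratic(curvature=1.0, step_size=2.0, x0=1.0, num_steps=10)

print("sign flips (small, large, boundary):",
      count_sign_flips(small), count_sign_flips(large), count_sign_flips(boundary))
print("boundary losses:", [loss(x) for x in boundary])
```

This deterministic toy only shows the oscillation across the valley and a constant loss at the stability boundary. The loss stabilization and the implicit bias toward simple predictors described in the abstract concern SGD on neural networks, which this sketch does not model.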

Organizer

ICML 2023
