            SGD with large step sizes learns sparse features

            Jul 24, 2023

            Speakers

            Maksym Andriushchenko
            Aditya Varre
            Loucas Pillaud-Vivien

            About

            We showcase important features of the dynamics of stochastic gradient descent (SGD) in the training of neural networks. We present empirical observations that commonly used large step sizes (i) may lead the iterates to jump from one side of a valley to the other, causing loss stabilization, and (ii) that this stabilization induces a hidden stochastic dynamics that implicitly biases the iterates toward simple predictors. Furthermore, we show empirically that the longer large step sizes keep SGD high in the l…
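
            As a rough illustration of point (i) only, the sketch below runs plain gradient descent on a one-dimensional quadratic "valley". This is a toy under stated assumptions, not the setup from the talk: the objective `f(x) = 0.5 * curvature * x**2`, the function names, and the step sizes are all hypothetical choices, there is no stochasticity, and it does not reproduce loss stabilization or the bias toward sparse features. It only shows that once the step size exceeds `1 / curvature`, each update overshoots the minimum and the iterate lands on the opposite side of the valley.

```python
def gd_on_quadratic(step_size, curvature=1.0, x0=1.0, n_steps=6):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    Toy objective chosen for illustration (an assumption, not from the talk).
    Returns the list of iterates, starting with x0.
    """
    xs = [x0]
    for _ in range(n_steps):
        grad = curvature * xs[-1]             # f'(x) = curvature * x
        xs.append(xs[-1] - step_size * grad)  # x <- x - step_size * f'(x)
    return xs


def count_sign_flips(xs):
    """Count how often consecutive iterates sit on opposite sides of the minimum at 0."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)


# Step size below 1 / curvature: iterates approach 0 from one side.
small = gd_on_quadratic(step_size=0.5)
# Step size between 1 / curvature and 2 / curvature: iterates cross the valley on every step.
large = gd_on_quadratic(step_size=1.5)

print("small step:", small, "sign flips:", count_sign_flips(small))
print("large step:", large, "sign flips:", count_sign_flips(large))
```

            In this toy both runs shrink `|x|` by the same factor per step; the only difference is that the large-step run alternates sides. On a pure quadratic the iterates still converge, so the stabilization effect described in the abstract needs a richer model than this one.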

            Organizer

            ICML 2023
