
            A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks

            Nov 28, 2022

            Speakers

            Mingrui Liu
            Zhenxun Zhuang
            Yunwen Lei

            About

            In distributed training of deep neural networks, each machine typically runs Stochastic Gradient Descent (SGD) or one of its variants and communicates with the other machines periodically. However, SGD can converge slowly when training some deep neural networks (e.g., RNN, LSTM) because of the exploding gradient issue. Gradient clipping is usually employed to address this issue in the single-machine setting, but exploring this technique in the distributed setting is still in its infancy: it remains m…
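
            To make the two ingredients named in the abstract concrete, here is a minimal sketch of norm-based gradient clipping combined with local SGD steps and periodic averaging. It is an illustration under stated assumptions, not the algorithm from the paper: the function names (`clip_by_norm`, `local_clipped_sgd`), the per-worker quadratic objectives, and all hyperparameters are hypothetical, and the machines are simulated in a single process.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale grad so its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def local_clipped_sgd(x0, num_workers, grad_fn, lr, max_norm, rounds, local_steps):
    """Simulate workers that take clipped local steps, then average.

    grad_fn(x, worker_id) returns worker_id's gradient at x (hypothetical
    interface; a real setup would use stochastic minibatch gradients).
    """
    workers = [list(x0) for _ in range(num_workers)]
    for _ in range(rounds):
        # Local phase: each worker updates its own copy without communicating.
        for w in range(num_workers):
            x = workers[w]
            for _ in range(local_steps):
                g = clip_by_norm(grad_fn(x, w), max_norm)
                x = [xi - lr * gi for xi, gi in zip(x, g)]
            workers[w] = x
        # Communication phase: average the copies and broadcast the result.
        dim = len(x0)
        avg = [sum(x[i] for x in workers) / num_workers for i in range(dim)]
        workers = [list(avg) for _ in range(num_workers)]
    return workers[0]


# Toy objectives (an assumption for illustration): worker w minimizes
# 0.5 * ||x - centers[w]||^2, so the average objective is minimized at 2.0.
centers = [[1.0], [3.0]]


def quadratic_grad(x, w):
    return [xi - ci for xi, ci in zip(x, centers[w])]


result = local_clipped_sgd(
    x0=[0.0], num_workers=2, grad_fn=quadratic_grad,
    lr=0.1, max_norm=1.0, rounds=200, local_steps=4,
)
```

            In this sketch the workers exchange parameters once every `local_steps` updates rather than after every gradient step, which is the sense in which periodic communication reduces communication cost. Clipping is applied to each local gradient before the update.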

            Organizer

            NeurIPS 2022
