Mingrui Liu, Zhenxun Zhuang, Yunwen Lei, Chunyang Liao · A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks · SlidesLive

A Communication-Efficient Distributed Gradient Clipping Algorithm for Training Deep Neural Networks

Nov 28, 2022

Speakers

Mingrui Liu

Zhenxun Zhuang

Yunwen Lei

About

In distributed training of deep neural networks, people usually run Stochastic Gradient Descent (SGD) or its variants on each machine and communicate with other machines periodically. However, SGD might converge slowly in training some deep neural networks (e.g., RNN, LSTM) because of the exploding gradient issue. Gradient clipping is usually employed to address this issue in the single machine setting, but exploring this technique in the distributed setting is still in its infancy: it remains m…
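For readers unfamiliar with the single-machine technique the abstract refers to, here is a minimal sketch of gradient clipping by norm followed by one SGD step. It is only an illustration of standard clipping, not the distributed algorithm presented in the talk. The function names `clip_by_norm` and `clipped_sgd_step`, and the parameters `lr` and `max_norm`, are hypothetical and chosen for this example.

```python
import math


def clip_by_norm(grad, max_norm):
    """Return a copy of grad rescaled so its Euclidean norm is at most max_norm.

    Hypothetical helper for illustration; grad is a plain list of floats.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Gradients already inside the ball of radius max_norm are left unchanged.
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def clipped_sgd_step(params, grad, lr, max_norm):
    """One SGD step that uses the clipped gradient instead of the raw one."""
    clipped = clip_by_norm(grad, max_norm)
    return [p - lr * g for p, g in zip(params, clipped)]


if __name__ == "__main__":
    # A large gradient (norm 5) is scaled down to norm 1 before the update,
    # which bounds the step size when gradients explode.
    print(clipped_sgd_step([1.0, 1.0], [3.0, 4.0], lr=0.5, max_norm=1.0))
```

Clipping bounds the length of every update by `lr * max_norm`, which is why it is a common remedy for exploding gradients in recurrent models such as RNNs and LSTMs.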

Organizer

NeurIPS 2022
