Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi, Emad Barsoum, Sergii Dymchenko, Jaliya Ekanayake, Vadim Eksarevskiy, Maxim Lukyianov, Tianju Xu · Oral: Scaling Distributed Training with Adaptive Summation · SlidesLive

Apr 4, 2021

Speakers

Saeed Maleki

Madan Musuvathi

Todd Mytkowicz

About

Data parallelism is a common way to parallelize stochastic gradient descent (SGD). However, the loss of convergence at large minibatch sizes limits the scalability of data parallelism. This paper introduces a novel method to combine gradients called Adasum that significantly improves the convergence when using large minibatches. This paper provides the intuition and formal justification of Adasum along with a convergence proof. Additionally, the paper describes an efficient implementation of Ada…
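As a rough illustration of what "combining gradients" can mean here, the sketch below implements the pairwise Adasum rule as it is commonly stated in the published paper, Adasum(g1, g2) = (1 - g1·g2 / (2‖g1‖²)) g1 + (1 - g1·g2 / (2‖g2‖²)) g2. The formula does not appear in the abstract on this page, so treat it as an assumption brought in from the paper; the function names (`adasum_pair`, `adasum`) and the simple pairwise tree reduction are hypothetical choices made for this example, not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adasum_pair(g1: Sequence[float], g2: Sequence[float]) -> Vector:
    """Combine two gradients with the pairwise Adasum rule (assumed from the paper).

    Orthogonal gradients are summed; identical gradients are averaged;
    everything in between is interpolated through the dot product.
    """
    d12 = dot(g1, g2)
    n1 = dot(g1, g1)  # squared norm of g1
    n2 = dot(g2, g2)  # squared norm of g2
    # Guard against zero gradients: fall back to a plain sum coefficient.
    c1 = 1.0 - d12 / (2.0 * n1) if n1 > 0.0 else 1.0
    c2 = 1.0 - d12 / (2.0 * n2) if n2 > 0.0 else 1.0
    return [c1 * x + c2 * y for x, y in zip(g1, g2)]


def adasum(grads: Sequence[Sequence[float]]) -> Vector:
    """Reduce many gradients by applying adasum_pair level by level (illustrative)."""
    if not grads:
        raise ValueError("need at least one gradient")
    level = [list(g) for g in grads]
    while len(level) > 1:
        nxt = [adasum_pair(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # odd one out is carried to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Under this rule, two orthogonal gradients such as `[1.0, 0.0]` and `[0.0, 1.0]` are simply added, while two identical gradients are averaged, which matches the intuition that agreeing workers should not double the step size.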

Organizer

MLSys 2021

About MLSys 2021

The Conference on Machine Learning and Systems targets research at the intersection of machine learning and systems. The conference aims to elicit new connections amongst these fields, including identifying best practices and design principles for learning systems, as well as developing novel learning methods and theory tailored to practical machine learning workflows.
