            Oral: Scaling Distributed Training with Adaptive Summation

            Apr 4, 2021

            Speakers

            Saeed Maleki

            Madan Musuvathi

            Todd Mytkowicz

            About

            Data parallelism is a common way to parallelize stochastic gradient descent (SGD). However, the loss of convergence at large minibatch sizes limits the scalability of data parallelism. This paper introduces a novel method to combine gradients called Adasum that significantly improves the convergence when using large minibatches. This paper provides the intuition and formal justification of Adasum along with a convergence proof. Additionally, the paper describes an efficient implementation of Ada…
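
            The abstract does not spell out the combining rule, but the Adasum operator it refers to is usually described as a pairwise rule that scales each gradient by how much it is already "explained" by the other, so orthogonal gradients are summed while parallel ones are effectively averaged. Below is a minimal NumPy sketch under that assumption; the names adasum_pair and adasum_reduce are illustrative and not taken from the paper's code.

            ```python
            import numpy as np

            def adasum_pair(g1, g2, eps=1e-12):
                """Combine two gradient vectors with the (assumed) Adasum rule.

                Orthogonal gradients come out as their plain sum; identical
                gradients come out as their average; in-between cases
                interpolate smoothly.
                """
                dot = np.dot(g1, g2)
                return (1.0 - dot / (2.0 * np.dot(g1, g1) + eps)) * g1 + \
                       (1.0 - dot / (2.0 * np.dot(g2, g2) + eps)) * g2

            def adasum_reduce(grads):
                """Recursive pairwise (tree-style) reduction of per-worker gradients."""
                if len(grads) == 1:
                    return grads[0]
                mid = len(grads) // 2
                return adasum_pair(adasum_reduce(grads[:mid]),
                                   adasum_reduce(grads[mid:]))

            # Illustration of the two extremes:
            g_a = np.array([1.0, 0.0])
            g_b = np.array([0.0, 1.0])
            print(adasum_pair(g_a, g_b))   # orthogonal: approximately [1. 1.] (plain sum)
            print(adasum_pair(g_a, g_a))   # identical:  approximately [1. 0.] (average)
            ```

            Because the rule is applied recursively over a reduction tree, it can slot into the same communication pattern as an ordinary gradient allreduce, which is what the abstract's "efficient implementation" refers to.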

            Organizer

            MLSys 2021

            Categories

            AI & Data Science

            About MLSys 2021

            The Conference on Machine Learning and Systems targets research at the intersection of machine learning and systems. The conference aims to elicit new connections amongst these fields, including identifying best practices and design principles for learning systems, as well as developing novel learning methods and theory tailored to practical machine learning workflows.


            Recommended Videos

            Presentations with a similar topic, category, or speaker

            Value Learning for Throughput Optimization of Deep Learning Workloads · 05:03 · Benoit Steiner, … · MLSys 2021

            Equality Saturation for Tensor Graph Superoptimization · 05:11 · Yichen Yang, … · MLSys 2021

            Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy · 05:51 · Lucas Liebenwein, … · MLSys 2021

            Pipelined Backpropagation at Scale: Training Large Models without Batches · 04:14 · Atli Kosson, … · MLSys 2021

            Oral: CODE: Compiler-based Neuron-aware Ensemble training · 18:47 · Ettore M. G. Trainiti, … · MLSys 2021

            Don't Forget to Sign the Gradients! · Omid Aramoon, … · MLSys 2021
