            Oral: Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models

            Apr 4, 2021

            Speakers

            Shang Wang

            Peiming Yang

            Yuxuan Zheng

            About

            Driven by the tremendous effort in researching novel deep learning (DL) algorithms, the training cost of developing new models has increased staggeringly in recent years. To reduce this training cost and optimize cluster-wide hardware resource usage, we collect and analyze “real-world” GPU cluster usage statistics. Our study reveals that single-accelerator training jobs can dominate the cluster-wide resource consumption when launched repetitively (e.g., for hyper-parameter tuning) while severely…

            Organizer

            MLSys 2021

            Categories

            AI & Data Science

            About MLSys 2021

            The Conference on Machine Learning and Systems targets research at the intersection of machine learning and systems. The conference aims to elicit new connections amongst these fields, including identifying best practices and design principles for learning systems, as well as developing novel learning methods and theory tailored to practical machine learning workflows.
