LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
            Nov 28, 2022

            Speakers

            Mike Lewis
            Younes Belkada
            Luke Zettlemoyer

            About

            Large language models have been widely adopted but require significant GPU memory for inference and finetuning. We develop methods for Int8 matrix multiplication for transformer multi-layer perceptron (MLP) and attention projection layers, which cut the required memory for inference by half while retaining full precision performance. With our method, a 16/32-bit checkpoint can be loaded, converted to Int8, and used immediately without performance degradation – no post-quantization training is re…
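To make the idea of Int8 matrix multiplication concrete, here is a minimal NumPy sketch of a generic row-wise absmax quantization scheme. It is an illustration under stated assumptions, not necessarily the method presented in the talk: the helper names `quantize_rows_absmax` and `int8_matmul`, the matrix shapes, and the random test data are all hypothetical. It shows operands converted to int8 with one scale per row or column, an integer matmul with int32 accumulation, and rescaling back to floating point.

```python
import numpy as np


def quantize_rows_absmax(x):
    """Quantize each row of x to int8 using that row's own absmax scale.

    Hypothetical helper for illustration only.
    """
    absmax = np.max(np.abs(x), axis=1, keepdims=True)
    # Avoid division by zero for all-zero rows.
    scale = np.where(absmax == 0, 1.0, absmax / 127.0).astype(np.float32)
    q = np.clip(np.rint(x / scale), -127, 127).astype(np.int8)
    return q, scale


def int8_matmul(a, b):
    """Approximate a @ b using int8 operands and int32 accumulation."""
    qa, sa = quantize_rows_absmax(a)       # one scale per row of a, shape (m, 1)
    qb_t, sb = quantize_rows_absmax(b.T)   # one scale per column of b, shape (n, 1)
    acc = qa.astype(np.int32) @ qb_t.T.astype(np.int32)  # integer products, shape (m, n)
    # Undo both scalings: entry (i, j) is multiplied by sa[i] * sb[j].
    return acc.astype(np.float32) * sa * sb.T


# Assumed toy data: a small activation matrix and a 16-bit weight matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 64)).astype(np.float32)
w16 = rng.standard_normal((64, 8)).astype(np.float16)
w = w16.astype(np.float32)

approx = int8_matmul(a, w)
exact = a @ w
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)

# Stored int8 weights take half the bytes of the 16-bit original
# (ignoring the small per-column scale vector).
qw, w_scale = quantize_rows_absmax(w.T)
print(f"relative error: {rel_err:.4f}")
print(f"fp16 bytes: {w16.nbytes}, int8 bytes: {qw.nbytes}")
```

The halving of weight storage in this sketch corresponds to the abstract's claim about moving from a 16-bit checkpoint to Int8. The sketch says nothing about how accuracy is preserved at scale, which is the subject of the talk itself.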

            Organizer

            NeurIPS 2022
