Mike Lewis, Younes Belkada, Luke Zettlemoyer · LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale · SlidesLive

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

Nov 28, 2022

Speakers

Mike Lewis

Younes Belkada

Luke Zettlemoyer

About

Large language models have been widely adopted but require significant GPU memory for inference and finetuning. We develop methods for Int8 matrix multiplication for transformer multi-layer perceptron (MLP) and attention projection layers, which cut the required memory for inference by half while retaining full precision performance. With our method, a 16/32-bit checkpoint can be loaded, converted to Int8, and used immediately without performance degradation – no post-quantization training is re…
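
To make the idea of an Int8 matrix multiplication concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract above is truncated and does not spell out the quantization scheme, so this assumes a generic absmax scheme with one scale per row of the input and one per column of the weight matrix. The names `quantize_rows`, `transpose`, and `int8_matmul` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_rows(m: Matrix) -> Tuple[List[List[int]], List[float]]:
    """Absmax-quantize each row to integers in [-127, 127].

    Returns the integer rows and one float scale per row, so that
    row[i] is approximately q[i] * scales[i].
    """
    q, scales = [], []
    for row in m:
        # Fall back to 1.0 for an all-zero row to avoid dividing by zero.
        absmax = max(abs(v) for v in row) or 1.0
        scale = absmax / 127.0
        q.append([round(v / scale) for v in row])
        scales.append(scale)
    return q, scales


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def int8_matmul(x: Matrix, w: Matrix) -> Matrix:
    """Approximate x @ w with integer products, then rescale to floats."""
    xq, xs = quantize_rows(x)                # one scale per row of x
    wq_t, ws = quantize_rows(transpose(w))   # one scale per column of w
    out = []
    for i, x_row in enumerate(xq):
        out_row = []
        for j, w_col in enumerate(wq_t):
            acc = sum(a * b for a, b in zip(x_row, w_col))  # integer accumulation
            out_row.append(acc * xs[i] * ws[j])             # dequantize
        out.append(out_row)
    return out


# Assumed toy inputs, chosen only to exercise the functions.
x = [[1.0, -2.0, 0.5], [0.25, 4.0, -1.0]]
w = [[0.5, 1.0], [-1.5, 2.0], [3.0, -0.25]]
approx = int8_matmul(x, w)
exact = [[sum(a * b for a, b in zip(row, col)) for col in transpose(w)] for row in x]
```

Comparing `approx` with `exact` shows the rounding error this scheme introduces on a small example. Storing each weight as an 8-bit integer instead of a 16-bit float is where the halving of memory mentioned in the abstract comes from; the per-row and per-column scales add only a small overhead.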

Organizer

NeurIPS 2022
