Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré · FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness · SlidesLive


Nov 28, 2022

Speakers

Tri Dao

Daniel Y. Fu

Stefano Ermon

About

Transformers are slow and memory-hungry on long sequences, since the time and memory complexity of self-attention are quadratic in sequence length. Approximate attention methods have attempted to address this problem by trading off model quality to reduce the compute complexity, but often do not achieve wall-clock speedup. We argue that a missing principle is making attention algorithms IO-aware—accounting for reads and writes between levels of GPU memory. We propose FlashAttention, an IO-aware…
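To make the quadratic cost mentioned above concrete, here is a minimal sketch of standard (non-Flash) softmax attention in plain Python. It is an illustration only, not the authors' implementation and not the FlashAttention algorithm itself; the names `naive_attention` and `score_matrix_entries` are hypothetical helpers introduced for this example. The point is that the intermediate score matrix has one entry per (query, key) pair, so it grows with the square of the sequence length.

```python
import math


def naive_attention(q, k, v):
    """Standard softmax attention on lists of row vectors.

    q, k: N rows of dimension d; v: N rows of any dimension.
    Builds the full N x N score matrix, which is the quadratic intermediate.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    # One score per (query, key) pair: N * N entries in total.
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    out = []
    for row in scores:
        m = max(row)  # subtract the row max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights = [e / z for e in exps]  # softmax over keys
        # Weighted sum of value rows.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def score_matrix_entries(n):
    """Number of entries in the N x N score matrix (hypothetical helper)."""
    return n * n


if __name__ == "__main__":
    for n in (1024, 2048, 4096):
        print(n, score_matrix_entries(n))
```

Doubling the sequence length quadruples the number of score entries. In a standard implementation this intermediate is typically written to and read back from GPU memory, which is the kind of traffic an IO-aware method would need to account for.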

Organizer

NeurIPS 2022
