            Transformers learn to implement preconditioned gradient descent for in-context learning

Dec 10, 2023

Speakers

Kwangjun Ahn

Xiang Cheng

Hadi Daneshmand

About

Motivated by the striking ability of transformers for in-context learning, several works demonstrate that transformers can implement algorithms like gradient descent. By a careful construction of weights, these works show that multiple layers of transformers are expressive enough to simulate gradient descent iterations. Going beyond the question of expressivity, we ask: Can transformers learn to implement such algorithms by training over random problem instances? To our knowledge, we make th…
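To make the expressivity claim concrete, here is a minimal NumPy sketch (my own illustration, not the authors' code) of the known construction: with suitably hand-set weights, a single linear-attention layer reproduces one step of preconditioned gradient descent on an in-context linear-regression instance. The diagonal preconditioner `A`, step size `eta`, and problem sizes are illustrative assumptions.

```python
# Sketch: one linear-attention readout equals one preconditioned GD step
# on f(w) = 0.5 * ||X w - y||^2, starting from w0 = 0.
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20                               # feature dim, number of in-context examples
eta = 0.1                                  # step size (assumption)
A = np.diag(rng.uniform(0.5, 1.5, size=d)) # hypothetical diagonal preconditioner

# Random problem instance: y_i = <w_star, x_i>
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star
x_query = rng.normal(size=d)               # query point to predict on

# Preconditioned GD step: grad f(0) = -X^T y, so w1 = eta * A @ X.T @ y
w1 = eta * A @ X.T @ y
pred_gd = x_query @ w1

# Same prediction via linear attention over tokens (x_i, y_i):
# attention scores <A x_query, x_i>, values y_i, scaled by eta
scores = X @ (A @ x_query)
pred_attn = eta * scores @ y

print(np.allclose(pred_gd, pred_attn))     # True: the two computations coincide
```

The equality holds because the attention readout, eta * sum_i <A x_query, x_i> y_i, is exactly x_query^T (eta A X^T y), i.e. the prediction of the preconditioned gradient step evaluated at the query.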

Organizer


            NeurIPS 2023



Recommended Videos

Presentations with a similar topic, category or speaker

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
04:34 · Can Qin, … · NeurIPS 2023

Breaking the Communication-Privacy-Accuracy Tradeoff with f-Differential Privacy
05:11 · Richeng Jin, … · NeurIPS 2023

Bayesian Metric Learning for Uncertainty Quantification in Image Retrieval
04:41 · Frederik Warburg, … · NeurIPS 2023

Improving Self-supervised Molecular Representation Learning using Persistent Homology
05:02 · Yuankai Luo, … · NeurIPS 2023

Inferring Hybrid Neural Fluid Fields from Videos
04:49 · Hong-Xing Yu, … · NeurIPS 2023

DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
04:58 · Qingkai Fang, … · NeurIPS 2023

Interested in talks like this? Follow NeurIPS 2023