Yaniv Leviathan, Matan Kalman, Yossi Matias · Fast Inference from Transformers via Speculative Decoding · SlidesLive


July 25, 2023

Speakers

Yaniv Leviathan

Matan Kalman

Yossi Matias

About the presentation

Inference from large autoregressive models like Transformers is slow - decoding K tokens takes K serial runs of the model. In this work we introduce speculative decoding - an algorithm to sample from autoregressive models faster without any changes to the outputs, by computing several tokens in parallel. At the heart of our approach lie the observations that (1) hard language-modeling tasks often include easier subtasks that can be approximated well by more efficient models, and (2) using specul…
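The abstract is cut off on this page, so the following is only a minimal sketch of how a draft-then-verify sampling step of this kind can be written. It is not the authors' implementation. It assumes the acceptance rule commonly associated with speculative sampling: keep a drafted token `x` with probability `min(1, p(x) / q(x))`, and otherwise resample from the normalized positive part of `p - q`. The names `speculative_step`, `target`, `draft`, and `gamma` are illustrative, and the two "models" are plain Python callables that return a probability list over a toy vocabulary.

```python
import random


def sample(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def speculative_step(target, draft, prefix, gamma, rng):
    """Return between 1 and gamma + 1 new tokens for the given prefix.

    target(ctx) and draft(ctx) are hypothetical stand-ins for the large and
    the cheap model; each returns a probability list over the vocabulary.
    """
    # 1. Draft gamma tokens serially with the cheap model.
    proposed, q_dists = [], []
    ctx = list(prefix)
    for _ in range(gamma):
        q = draft(ctx)
        x = sample(q, rng)
        proposed.append(x)
        q_dists.append(q)
        ctx.append(x)

    # 2. Score every drafted prefix with the target model. Here this is a
    #    loop; the speed-up relies on doing it as one parallel pass.
    p_dists = [target(list(prefix) + proposed[:i]) for i in range(gamma + 1)]

    # 3. Accept drafted tokens left to right; stop at the first rejection.
    out = []
    for i, x in enumerate(proposed):
        p, q = p_dists[i], q_dists[i]
        if rng.random() < min(1.0, p[x] / q[x]):
            out.append(x)
            continue
        # Rejected: resample from the normalized positive part of p - q.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        total = sum(residual)
        out.append(sample([r / total for r in residual], rng))
        return out

    # Every draft was accepted, so the target pass yields one extra token.
    out.append(sample(p_dists[gamma], rng))
    return out
```

With context-independent toy distributions such as `target = lambda ctx: [0.7, 0.2, 0.1]` and `draft = lambda ctx: [0.4, 0.4, 0.2]`, repeated calls to `speculative_step(target, draft, [], 3, random.Random(0))` should produce first tokens whose empirical frequencies track the target distribution rather than the draft one, which is the "no changes to the outputs" property the abstract refers to.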

Organizer

ICML 2023
