Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, Dj Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih · In-context Reinforcement Learning with Algorithm Distillation · SlidesLive



December 2, 2022

Speakers

Michael Laskin

Luyu Wang

Junhyuk Oh

About the presentation

We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to reinforcement learn as an across-episode sequential prediction problem. A dataset of learning histories is generated by a source RL algorithm, and then a causal transformer is trained by autoregressively predicting actions given their preceding learning histories as…
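The data side of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: the source algorithm here is a simple epsilon-greedy bandit learner chosen for brevity, the task has no observations, and every name (`generate_learning_history`, `make_prediction_examples`, `context_len`) is hypothetical. It shows only how a learning history might be recorded and sliced into (preceding history, next action) pairs that a causal sequence model could be trained on; the model itself is omitted.

```python
import random


def generate_learning_history(num_arms=3, num_steps=50, epsilon=0.1, seed=0):
    """Run an epsilon-greedy bandit learner (a stand-in source RL algorithm)
    and record its full learning history as (action, reward) pairs."""
    rng = random.Random(seed)
    arm_means = [rng.random() for _ in range(num_arms)]  # hypothetical task
    counts = [0] * num_arms
    values = [0.0] * num_arms
    history = []
    for _ in range(num_steps):
        if rng.random() < epsilon:
            action = rng.randrange(num_arms)  # explore
        else:
            action = max(range(num_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_means[action] else 0.0
        counts[action] += 1
        # Incremental mean update of the action-value estimate.
        values[action] += (reward - values[action]) / counts[action]
        history.append((action, reward))
    return history


def make_prediction_examples(history, context_len=8):
    """Slice one learning history into (context, target_action) pairs.

    The context is the preceding stretch of the history, so the target
    reflects how the source learner behaved at that stage of learning."""
    examples = []
    for t in range(1, len(history)):
        context = history[max(0, t - context_len):t]
        target_action = history[t][0]
        examples.append((context, target_action))
    return examples


# One learning history per task (here: per seed), pooled into one dataset.
histories = [generate_learning_history(seed=s) for s in range(4)]
dataset = [ex for h in histories for ex in make_prediction_examples(h)]
```

Under these assumptions, each training example pairs a window of past interactions with the action the source learner took next, so a model fit to this data would be imitating the learner's improvement over time rather than a single fixed policy.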

Organizer

NeurIPS 2022
