Johan Obando Ceron, Marc Bellemare, Pablo Samuel Castro · Variance Double-Down: The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning · SlidesLive

December 2, 2022

Speakers

Johan Obando Ceron

Marc Bellemare

Pablo Samuel Castro

About the presentation

In deep reinforcement learning, multi-step learning is almost unavoidable to achieve state-of-the-art performance. However, the increased variance that multistep learning brings makes it difficult to increase the update horizon beyond relatively small numbers. In this paper, we report the counterintuitive finding that decreasing the batch size parameter improves the performance of many standard deep RL agents that use multi-step learning. It is well-known that gradient variance decreases with in…

Organizer

NeurIPS 2022
