Analyzing the Sensitivity to Policy-Value Decoupling in Deep Reinforcement Learning Generalization

Dec 2, 2022


About

The existence of policy-value representation asymmetry negatively affects the generalization capability of the traditional actor-critic architecture, which uses a shared representation for policy and value. Fully separated policy and value networks avoid overfitting by addressing this representation asymmetry, but maintaining two separate networks introduces high computational overhead. Previous work has also shown that partial separation can achieve the same level of generalization in most tasks while reducing this overhead. Thus, the questions arise: do we really need two fully separate networks? Is there any particular scenario where only full separation works? In this work, we analyze generalization performance as a function of the extent of decoupling. We compare four degrees of subnetwork separation, namely fully shared, early separated, late separated, and fully separated, on the RL generalization benchmark Procgen, a suite of 16 procedurally generated environments. We show that unless the environment has a distinct or explicit source of value estimation, partial separation can readily capture the necessary policy-value representation asymmetry and achieve better generalization performance in unseen scenarios.
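To make the two extremes of the separation spectrum concrete, the sketch below contrasts a fully shared actor-critic (one trunk, two heads) with a fully separated one (independent policy and value networks) in PyTorch. It is a minimal illustration, not the authors' implementation; all module names, layer sizes, and dimensions are assumptions chosen for clarity, and the early/late partial-separation variants would split the trunk at intermediate layers.

```python
# Minimal sketch of the two extremes of policy-value separation.
# Not the authors' code; architecture sizes and names are illustrative only.
import torch
import torch.nn as nn


class SharedActorCritic(nn.Module):
    """Fully shared: policy and value heads read the same representation."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, n_actions)  # actor head
        self.value_head = nn.Linear(hidden, 1)           # critic head

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)                 # shared representation
        return self.policy_head(h), self.value_head(h)


class SeparatedActorCritic(nn.Module):
    """Fully separated: policy and value learn independent representations."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        def mlp(out_dim: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, out_dim),
            )
        self.policy_net = mlp(n_actions)    # actor network
        self.value_net = mlp(1)             # critic network

    def forward(self, obs: torch.Tensor):
        # No weight sharing: each network builds its own representation.
        return self.policy_net(obs), self.value_net(obs)


if __name__ == "__main__":
    obs = torch.randn(8, 64)                      # batch of dummy observations
    logits, value = SharedActorCritic(64, 15)(obs)
    print(logits.shape, value.shape)              # [8, 15] and [8, 1]
```

The trade-off the abstract describes follows directly from this structure: the separated variant roughly doubles the parameters and forward-pass cost, while the shared variant forces the actor and critic to compete for one representation.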

