Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Remi Munos, Bernardo Avila Pires, Michal Valko · DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm for RL · SlidesLive

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm for RL

Jul 24, 2023

Speakers

Yunhao Tang

Tadashi Kozuno

Mark Rowland

About

Multi-step learning applies lookahead over multiple time steps and has proved valuable in policy evaluation settings. However, in the optimal control case, the impact of multi-step learning has been relatively limited despite a number of prior efforts. Fundamentally, this might be because multi-step policy improvements require operations that cannot be approximated by stochastic samples, hence hindering the widespread adoption of such methods in practice. To address such limitations, we introduc…
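As a rough illustration of the "lookahead over multiple time steps" mentioned above, the sketch below computes a standard discounted n-step return for policy evaluation. It is not the DoMo-AC algorithm, which the truncated abstract does not describe; the function name `n_step_return` and its parameters are hypothetical and chosen only for this example.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Discounted n-step return (illustrative, hypothetical helper).

    Computes r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * V(s_n),
    where `bootstrap_value` stands in for the value estimate V(s_n).
    """
    g = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three observed rewards followed by a bootstrapped value estimate.
target = n_step_return([1.0, 0.0, 2.0], bootstrap_value=10.0, gamma=0.5)
print(target)
```

With an empty reward list the function returns the bootstrap value unchanged, and longer reward sequences rely less on the value estimate and more on sampled rewards.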

Organizer

ICML 2023
