T. Spooner, N. Vadori, S. Ganesh · Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs · SlidesLive

Dec 6, 2021

Speakers

T. Spooner

N. Vadori

S. Ganesh

About

Policy gradient methods can solve complex tasks but often fail when the dimensionality of the action space or the multiplicity of objectives grows very large. This occurs, in part, because the variance of score-based gradient estimators scales quadratically. In this paper, we address this problem through a causal baseline that exploits the independence structure encoded in a novel action-target influence network. Causal policy gradients (CPGs), which follow, provide a common framework for analysing key stat…
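
The variance argument in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the unit-variance Gaussian policy, the quadratic per-dimension targets, and the names `gradient_estimates` and `total_variance` are all assumptions made for illustration. It assumes the simplest possible influence structure, in which action dimension i affects only target i, and compares a plain score-function estimator (each score multiplied by the sum of all targets) with a factored one (each score multiplied only by the target it influences).

```python
import numpy as np


def gradient_estimates(n_dims, n_samples=20000, seed=0):
    """Per-sample score-function gradient estimates w.r.t. the policy mean.

    Hypothetical setup: a ~ N(mu, I) with mu = 0, and target i depends
    only on action dimension i through -(a_i - 1)^2.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(n_dims)
    actions = rng.normal(loc=mu, scale=1.0, size=(n_samples, n_dims))

    # Score of a unit-variance Gaussian: grad_mu log N(a; mu, I) = a - mu.
    score = actions - mu

    # One target per action dimension (assumed independence structure).
    targets = -(actions - 1.0) ** 2

    # Plain estimator: every score component is weighted by the total return.
    plain = score * targets.sum(axis=1, keepdims=True)

    # Factored estimator: each score component is weighted only by the
    # target that its action dimension influences.
    factored = score * targets
    return plain, factored


def total_variance(samples):
    """Sum of per-component variances of the per-sample estimates."""
    return samples.var(axis=0).sum()


if __name__ == "__main__":
    for n in (2, 8, 32):
        plain, factored = gradient_estimates(n)
        print(n, total_variance(plain), total_variance(factored))
```

Under these assumptions both estimators target the same gradient, because the score of one action dimension is independent of the targets driven by the others, so the cross terms have zero mean. They contribute only variance, which is why the plain estimator's total variance grows much faster with the number of dimensions than the factored one's.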

Organizer

NeurIPS 2021

About NeurIPS 2021

Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.
