Ryosuke Unno, Yoshimasa Tsuruoka · Memory-Efficient Reinforcement Learning with Priority based on Surprise and On-policyness · SlidesLive


Memory-Efficient Reinforcement Learning with Priority based on Surprise and On-policyness

Dec 2, 2022

Speakers

Ryosuke Unno

Yoshimasa Tsuruoka

About

In off-policy reinforcement learning, an agent collects transition data (a.k.a. experience tuples) from the environment and stores them in a replay buffer for use in subsequent parameter updates. Storing those tuples consumes a large amount of memory when the environment observations are given as images. Such memory consumption is especially problematic when reinforcement learning methods are applied in scenarios where computational resources are limited. In this paper, we introduce a method to…
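To make the memory concern concrete, here is a minimal sketch of a plain replay buffer together with a rough estimate of the bytes spent on image observations. It is not the method from the talk: the `ReplayBuffer` class, the `observation_bytes` helper, the FIFO eviction, the uniform sampling, and the example sizes (one million tuples, four stacked 84x84 one-byte frames) are all assumptions chosen for illustration.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity FIFO buffer of experience tuples."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest tuple once full
        self.storage = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self.storage.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement (assumed, not prioritized)
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def observation_bytes(capacity, obs_shape, bytes_per_value=1, copies_per_tuple=2):
    """Rough bytes used by observations alone.

    Assumes each tuple stores both obs and next_obs (copies_per_tuple=2)
    and ignores actions, rewards, and container overhead.
    """
    per_obs = math.prod(obs_shape) * bytes_per_value
    return capacity * copies_per_tuple * per_obs


if __name__ == "__main__":
    # Assumed setting: 1,000,000 tuples of four stacked 84x84 uint8 frames
    total = observation_bytes(1_000_000, (4, 84, 84))
    print(f"{total / 1e9:.1f} GB for observations")
```

Under these assumed sizes the observations alone take roughly 56 GB, which is why a buffer of image-based tuples can be hard to fit on a resource-limited machine.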

Organizer

NeurIPS 2022
