Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang · Distance-Sensitive Offline Reinforcement Learning · SlidesLive


Distance-Sensitive Offline Reinforcement Learning

Dec 2, 2022

Speakers

Jianxiong Li

Xianyuan Zhan

Haoran Xu

About

In offline reinforcement learning (RL), one issue detrimental to policy learning is the accumulation of error in the deep Q function in out-of-distribution (OOD) areas. Unfortunately, existing offline RL methods are often over-conservative, which inevitably hurts generalization performance outside the data distribution. In our study, one interesting observation is that deep Q functions approximate well inside the convex hull of the training data. Inspired by this, we propose a new method, DOGE (Distance-sensitive…

Organizer

NeurIPS 2022
