Fengdi Che, Xiru Zhu, Doina Precup, David Meger, Gregory Dudek · Bayesian Q-learning With Imperfect Expert Demonstrations · SlidesLive

Bayesian Q-learning With Imperfect Expert Demonstrations

Dec 2, 2022

Speakers

Fengdi Che

Xiru Zhu

Doina Precup

About

Guided exploration with expert demonstrations improves data efficiency for reinforcement learning, but current algorithms often overuse expert information. We propose a novel algorithm to speed up Q-learning with the help of a limited amount of imperfect expert demonstrations. The algorithm is based on a Bayesian framework to model suboptimal expert actions and derives Q-values' update rules by maximizing the posterior probability. It weighs expert information by the uncertainty of learnt Q-valu…

Organizer

NeurIPS 2022
