            Bayesian Q-learning With Imperfect Expert Demonstrations

            Dec 2, 2022

            Speakers

            Fengdi Che

            Xiru Zhu

            Doina Precup

            About

            Guided exploration with expert demonstrations improves data efficiency for reinforcement learning, but current algorithms often overuse expert information. We propose a novel algorithm to speed up Q-learning with the help of a limited amount of imperfect expert demonstrations. The algorithm is based on a Bayesian framework to model suboptimal expert actions and derives Q-values' update rules by maximizing the posterior probability. It weighs expert information by the uncertainty of learnt Q-valu…

            Organizer

            NeurIPS 2022
