            Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance

            Dec 2, 2022

            Speakers

            Yanqiu Wu

            Xinyue Chen

            Che Wang

            About

            Recent advances in model-free deep reinforcement learning (DRL) show that simple model-free methods can be highly effective in challenging high-dimensional continuous control tasks. In particular, Truncated Quantile Critics (TQC) achieves state-of-the-art asymptotic training performance on the MuJoCo benchmark with a distributional representation of critics; and Randomized Ensemble Double Q-Learning (REDQ) achieves high sample efficiency that is competitive with state-of-the-art model-based meth…

            Organizer

            NeurIPS 2022
