            Policy Gradient in Robust MDPs with Global Convergence Guarantee

            Jul 24, 2023

            Speakers

            Qiuhao Wang

            Chin Pang Ho

            Marek Petrik

            About

            Robust Markov decision processes (RMDPs) represent a promising framework for computing reliable policies in the face of model errors. Many successful reinforcement learning algorithms build on variations of policy-gradient methods, but adapting these methods to RMDPs has been challenging. As a result, the applicability of RMDPs to large, practical domains remains limited. This paper proposes a new Double-Loop Robust Policy Gradient (DRPG), the first generic policy gradient method for RMDPs. In c…

            Organizer

            ICML 2023
