            Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes

            Jul 24, 2023

            Speakers

Runlong Zhou

Ruosong Wang

Simon Shaolei Du

            About

We study regret minimization for reinforcement learning (RL) in Latent Markov Decision Processes (LMDPs) with context in hindsight. We design a novel model-based algorithmic framework which can be instantiated with both a model-optimistic and a value-optimistic solver. We prove an Õ(√(Var^⋆ M Γ S A K)) regret bound, where Õ hides logarithmic factors, M is the number of contexts, S is the number of states, A is the number of actions, K is the number of episodes, Γ ≤ S is the maximum transition…
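The phrase "context in hindsight" means that in each episode a latent context m ∈ {1, …, M} is drawn and kept hidden while the learner acts, and is revealed only after the episode ends. The Python sketch below illustrates just this interaction protocol; every class, function, and parameter name in it (LMDP, run_episode, policy, …) is an illustrative assumption, not an identifier from the paper or its code.

```python
import numpy as np

# Minimal sketch of the LMDP "context in hindsight" interaction protocol,
# assuming a tabular LMDP with M latent contexts, S states, A actions, and
# a fixed horizon. All names here are hypothetical.

class LMDP:
    def __init__(self, transitions, rewards, mixing, seed=0):
        # transitions: (M, S, A, S) row-stochastic array
        # rewards:     (M, S, A) array of per-step rewards
        # mixing:      length-M distribution over latent contexts
        self.P = transitions
        self.R = rewards
        self.mixing = mixing
        self.rng = np.random.default_rng(seed)

    def run_episode(self, policy, horizon, start_state=0):
        """Play one episode. The latent context m is drawn at the start,
        hidden while acting, and returned only afterwards (in hindsight)."""
        m = self.rng.choice(len(self.mixing), p=self.mixing)
        s, total_reward, trajectory = start_state, 0.0, []
        for h in range(horizon):
            a = policy(s, h)                       # policy cannot depend on m
            r = self.R[m, s, a]
            s_next = self.rng.choice(self.P.shape[-1], p=self.P[m, s, a])
            trajectory.append((s, a, r))
            total_reward += r
            s = s_next
        return trajectory, total_reward, m         # m is revealed only here
```

Because the context is revealed after each of the K episodes, a model-based learner can maintain per-context empirical estimates of the transitions and rewards and plan optimistically against them, which is the style of framework the abstract describes; the paper's specific solvers and bonus terms are not reproduced here.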

            Organizer


            ICML 2023



            Recommended Videos

            Presentations on similar topic, category or speaker

D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching
05:04 · Xuanzhou Liu, … · ICML 2023

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
05:02 · Yitan Wang, … · ICML 2023

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems
05:18 · Chawin Sitawarin, … · ICML 2023

Uncertain Evidence in Probabilistic Models and Stochastic Simulators
06:05 · Andreas Munk, … · ICML 2023

Structure in Monge Maps by Engineering Costs
47:56 · Marco Cuturi · ICML 2023

The Benefits of Model-Based Generalization in Reinforcement Learning
04:19 · Kenny Young, … · ICML 2023