            Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

            Jul 24, 2023

            Speakers

            Toshinori Kitamura

            Tadashi Kozuno

            Yunhao Tang

            About

            Mirror descent value iteration (MDVI), an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL), has served as the basis for recent high-performing practical RL algorithms. However, despite the use of function approximation in practice, the theoretical understanding of MDVI has been limited to tabular Markov decision processes (MDPs). We study MDVI with linear function approximation through its sample complexity required to identify an ε-optimal policy with pro…

            Organizer

            ICML 2023

