            Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation

            Jul 24, 2023

            Speakers

            Uri Sherman

            Tomer Koren

            Yishay Mansour

            About

            We study reinforcement learning with linear function approximation and adversarially changing cost functions, a setup that has mostly been considered under simplifying assumptions such as full-information feedback or exploratory conditions. We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback, featuring a combination of mirror descent and least-squares policy evaluation in an auxiliary MDP used to compute…
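
As a rough illustration of the mirror-descent ingredient named in the abstract, the sketch below performs one entropic mirror-descent (multiplicative-weights) policy update at a single state, using cost estimates that are linear in action features. This is not the authors' algorithm: the auxiliary MDP, the least-squares evaluation step, and any exploration mechanism are omitted, and the names `linear_q`, `mirror_descent_step`, `features`, `weights`, and `eta` are hypothetical choices made for this example.

```python
import math


def linear_q(features, weights):
    """Estimated cost for each action under a linear model: Q(a) = <phi(a), w>."""
    return [sum(f * w for f, w in zip(phi, weights)) for phi in features]


def mirror_descent_step(policy, q_values, eta):
    """One entropic mirror-descent update of a policy at a single state.

    Costs are minimized, so actions with a higher estimated cost
    lose probability mass: new(a) is proportional to old(a) * exp(-eta * Q(a)).
    """
    # Shifting by the minimum leaves the normalized result unchanged
    # and keeps the exponentials from underflowing for large costs.
    q_min = min(q_values)
    unnormalized = [p * math.exp(-eta * (q - q_min))
                    for p, q in zip(policy, q_values)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical toy instance: two actions with one-hot features.
features = [[1.0, 0.0], [0.0, 1.0]]
weights = [0.0, 1.0]            # assumed output of some evaluation step
policy = [0.5, 0.5]             # start from the uniform policy

q = linear_q(features, weights)
new_policy = mirror_descent_step(policy, q, eta=math.log(2))
print(new_policy)
```

With these toy numbers the second action has the higher estimated cost, so the update shifts probability toward the first action while keeping the result a valid distribution.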

            Organizer

            ICML 2023