            Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

            Jul 24, 2023

            Speakers

            Orin Levy

            Alon Cohen

            Asaf Cassel

            About

            We present the OMG-CMDP! algorithm for regret minimization in adversarial Contextual MDPs. The algorithm operates under the minimal assumptions of a realizable function class and access to online least-squares and log-loss regression oracles. Our algorithm is efficient (assuming efficient online regression oracles), simple, and robust to approximation errors. It enjoys an O(H^2.5 √(T|S||A| (ℛ(𝒪) + H log(δ^-1)))) regret guarantee, with T being the number of episodes, S the state space, A the actio…
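To get a feel for how the stated guarantee scales, the expression inside the O(·) can be evaluated directly. The sketch below is illustrative only and is not taken from the talk: the function name and parameter names are hypothetical, constants hidden by the O(·) are ignored, and the oracle term ℛ(𝒪) is passed in as a plain number because the visible abstract is cut off before defining it. H is assumed here to be the episode horizon, and δ a failure probability in (0, 1).

```python
import math


def regret_bound_scale(H, T, num_states, num_actions, oracle_regret, delta):
    """Evaluate H^2.5 * sqrt(T * |S| * |A| * (R(O) + H * log(1/delta))).

    Hypothetical helper: constants hidden by the O(.) notation are omitted,
    and oracle_regret stands in for the R(O) term as a fixed number.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    inner = T * num_states * num_actions * (oracle_regret + H * math.log(1.0 / delta))
    return H ** 2.5 * math.sqrt(inner)


# With every other quantity held fixed, quadrupling the number of episodes
# doubles the expression, reflecting the sqrt(T) dependence.
small = regret_bound_scale(H=5, T=1_000, num_states=10, num_actions=4,
                           oracle_regret=20.0, delta=0.05)
large = regret_bound_scale(H=5, T=4_000, num_states=10, num_actions=4,
                           oracle_regret=20.0, delta=0.05)
ratio = large / small
```

Holding `oracle_regret` fixed is a simplification made for this sketch; if the oracle term itself grows with T, the overall dependence on T would differ from a pure square root.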

            Organizer

            ICML 2023
