            Boosting Offline Reinforcement Learning with Action Preference Query

            Jul 24, 2023

            Speakers

            Qisen Yang

            Shenzhi Wang

            Matthieu Gaetan Lin

            About

            Training practical agents usually involves offline and online reinforcement learning (RL) to balance the policy's performance and interaction costs. In particular, online fine-tuning has become a commonly used method to correct the erroneous estimates of out-of-distribution data learned in the offline training phase. However, even limited online interactions can be inaccessible or catastrophic in high-stakes scenarios such as healthcare and autonomous driving. In this work, we introduce an interacti…

            Organizer

            ICML 2023
