            Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification

            Nov 28, 2022

            Speakers

            Takumi Tanabe

            Rei Sato

            Kazuto Fukuchi

            About

            In the field of reinforcement learning, owing to the high cost and risk of policy training in the real world, policies trained in a simulation environment are often transferred to the corresponding real-world environment. However, the simulation environment does not perfectly mimic the real-world environment, which leads to model misspecification. Multiple studies report significant deterioration of policy performance in a real-world environment. In this study, we focus on scenarios involving a simula…

            Organizer

            NeurIPS 2022

