Next
Livestream will start soon!
Livestream has already ended.
Presentation has not been recorded yet!
  • title: Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
      0:00 / 0:00
      • Report Issue
      • Settings
      • Playlists
      • Bookmarks
      • Subtitles Off
      • Playback rate
      • Quality
      • Settings
      • Debug information
      • Server sl-yoda-v3-stream-011-alpha.b-cdn.net
      • Subtitles size Medium
      • Bookmarks
      • Server
      • sl-yoda-v3-stream-011-alpha.b-cdn.net
      • sl-yoda-v3-stream-011-beta.b-cdn.net
      • 1150868944.rsc.cdn77.org
      • 1511650057.rsc.cdn77.org
      • Subtitles
      • Off
      • English
      • Playback rate
      • Quality
      • Subtitles size
      • Large
      • Medium
      • Small
      • Mode
      • Video Slideshow
      • Audio Slideshow
      • Slideshow
      • Video
      My playlists
        Bookmarks
          00:00:00
            Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
            • Settings
            • Sync diff
            • Quality
            • Settings
            • Server
            • Quality
            • Server

            Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

            Dec 6, 2021

            Speakers

            MY

            Ming Yin

            Speaker · 0 followers

            YW

            Yu-Xiang Wang

            Speaker · 0 followers

            About

            This work studies the statistical limits of uniform convergence for offline policy evaluation (OPE) problems with model-based methods (for episodic MDP) and provides a unified framework towards optimal learning for several well-motivated offline tasks. Uniform OPE sup_Π|Q^π-Q̂^π|<ϵ is a stronger measure than the point-wise OPE and ensures offline learning when Π contains all policies (the global class). In this paper, we establish an Ω(H^2 S/d_mϵ^2) lower bound (over model-based family) for …

            Organizer

            N2
            N2

            NeurIPS 2021

            Account · 1.9k followers

            About NeurIPS 2021

            Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.

            Like the format? Trust SlidesLive to capture your next event!

            Professional recording and live streaming, delivered globally.

            Sharing

            Recommended Videos

            Presentations on similar topic, category or speaker

            Deep learning techniques for a real-time neutrino classifier
            02:35

            Deep learning techniques for a real-time neutrino classifier

            Astrid Anker

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%

            Ising Model Selection Using l1-Regularized Linear Regression: A Statistical Mechanics Analysis
            09:36

            Ising Model Selection Using l1-Regularized Linear Regression: A Statistical Mechanics Analysis

            Xiangming Meng, …

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%

            Computer-Aided Design as Language
            15:08

            Computer-Aided Design as Language

            Yaroslav Ganin, …

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%

            Automatic Symmetry Discovery with Lie Algebra Convolutional Network
            14:42

            Automatic Symmetry Discovery with Lie Algebra Convolutional Network

            Nima Dehmamy, …

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 1 viewers voted for saving the presentation to eternal vault which is 0.1%

            IRM - when it works and when it doesn't: A test case of natural language inference
            11:17

            IRM - when it works and when it doesn't: A test case of natural language inference

            Yana Dranker, …

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%

            Kernelized Heterogeneous Risk Minimization
            11:11

            Kernelized Heterogeneous Risk Minimization

            Jiashuo Liu, …

            N2
            N2
            NeurIPS 2021 3 years ago

            Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%

            Interested in talks like this? Follow NeurIPS 2021