
            Reward is enough for convex MDPs

            Dec 6, 2021

Speakers

Tom Zahavy

Brendan O'Donoghue

Guillaume Desjardins

            About

Maximising a cumulative reward function that is Markov and stationary, i.e., defined over state-action pairs and independent of time, is sufficient to capture many kinds of goals in the Reinforcement Learning (RL) problem formulation based on a Markov Decision Process (MDP). However, not all goals can be captured in this manner. Specifically, it is easy to see that convex MDPs, in which goals are expressed as convex functions of the stationary distribution, cannot, in general, be formulated in this m…
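As a sketch of the distinction the abstract draws (the notation below is illustrative, not taken from the talk): the standard RL objective is linear in the discounted state-action occupancy measure, whereas a convex MDP replaces that linear objective with a general convex function of the occupancy.

```latex
% Standard RL: the cumulative-reward objective is linear in the
% state-action occupancy d_\pi
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
  \;=\; \max_{\pi} \; \langle r,\, d_{\pi} \rangle,
\qquad
d_{\pi}(s,a) \;=\; (1-\gamma) \sum_{t=0}^{\infty} \gamma^{t}\, \Pr_{\pi}(s_t = s,\, a_t = a).

% Convex MDP: the goal is a convex function f of the occupancy, e.g.
% occupancy-entropy minimisation f(d) = \sum_{s,a} d(s,a) \log d(s,a);
% such objectives cannot in general be written as a fixed linear reward.
\min_{\pi} \; f(d_{\pi}), \qquad f \text{ convex}.
```

The linear case is recovered as the special choice $f(d) = -\langle r, d \rangle$, which is why the cumulative-reward setting is a strict subset of the convex-MDP setting.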

            Organizer

NeurIPS 2021

            About NeurIPS 2021

            Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.

            Recommended Videos

            Presentations on similar topic, category or speaker

Self-Adaptable Point Processes with Nonparametric Time Decays (10:01)
Zhimeng Pan, … · NeurIPS 2021

How Fine-Tuning Allows for Effective Meta-Learning (10:40)
Kurtland Chua, … · NeurIPS 2021

A Universal Law of Robustness via Isoperimetry (18:54)
Mark Sellke, … · NeurIPS 2021

Dense Unsupervised Learning for Video Segmentation (13:34)
Nikita Araslanov, … · NeurIPS 2021

Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs (14:02)
Taebum Kim, … · NeurIPS 2021

SSMF: Shifting Seasonal Matrix Factorization (05:57)
Koki Kawabata, … · NeurIPS 2021
