            Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning

            Nov 28, 2022

            Speakers

            Paul Rolland

            Luca Viano

            Norman Schürhoff

            About

            While Reinforcement Learning (RL) aims to train an agent from a reward function in a given environment, Inverse Reinforcement Learning (IRL) seeks to recover the reward function from observing an expert's behavior. It is well known that, in general, various reward functions can lead to the same optimal policy, and hence, IRL is ill-defined. However, prior work showed that, if we observe two or more experts with different discount factors or acting in different environments, the reward functio…
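
The non-identifiability mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the talk: the three-state deterministic MDP (`NEXT_STATE`, `REWARD`), the discount factor `GAMMA`, the potential `PHI`, and the helper `optimal_policy`. It runs plain value iteration and checks that a shifted, a positively scaled, and a potential-shaped reward all induce the same greedy optimal policy as the original reward.

```python
# Toy deterministic MDP (hypothetical, for illustration only).
GAMMA = 0.9
NEXT_STATE = [[0, 1], [0, 2], [2, 0]]            # NEXT_STATE[s][a] -> next state
REWARD = [[0.0, 1.0], [0.5, 2.0], [1.5, -1.0]]   # REWARD[s][a]
ACTIONS = range(2)


def optimal_policy(reward, gamma=GAMMA, sweeps=500):
    """Return the greedy policy after `sweeps` rounds of value iteration."""
    n_states = len(reward)
    values = [0.0] * n_states

    def q(s, a):
        # One-step lookahead value of taking action a in state s.
        return reward[s][a] + gamma * values[NEXT_STATE[s][a]]

    for _ in range(sweeps):
        values = [max(q(s, a) for a in ACTIONS) for s in range(n_states)]
    return [max(ACTIONS, key=lambda a: q(s, a)) for s in range(n_states)]


# Three transformations that change the reward but not the optimal policy.
shifted = [[r + 5.0 for r in row] for row in REWARD]   # add a constant
scaled = [[2.0 * r for r in row] for row in REWARD]    # positive rescaling
PHI = [3.0, -1.0, 0.5]                                 # arbitrary state potential
shaped = [
    [REWARD[s][a] + GAMMA * PHI[NEXT_STATE[s][a]] - PHI[s] for a in ACTIONS]
    for s in range(len(REWARD))
]

base = optimal_policy(REWARD)
same = all(optimal_policy(r) == base for r in (shifted, scaled, shaped))
print(base, same)
```

An observer who sees only the resulting policy cannot tell these four reward functions apart, which is the sense in which single-expert IRL is ill-defined.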

            Organizer

            NeurIPS 2022
