Paul Rolland, Luca Viano, Norman Schürhoff, Boris Nikolov, Volkan Cevher · Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning · SlidesLive

Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning

Nov 28, 2022

Speakers

Paul Rolland

Luca Viano

Norman Schürhoff

About

While Reinforcement Learning (RL) aims to train an agent from a reward function in a given environment, Inverse Reinforcement Learning (IRL) seeks to recover the reward function from observing an expert's behavior. It is well known that, in general, various reward functions can lead to the same optimal policy, and hence, IRL is ill-defined. However, <cit.> showed that, if we observe two or more experts with different discount factors or acting in different environments, the reward functio…
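
The claim that several reward functions can induce the same optimal policy can be checked numerically. The following is a minimal sketch, not taken from the talk: it uses potential-based reward shaping, a standard textbook construction, on a small random tabular MDP. All names (`value_iteration`, `shape_reward`, `random_mdp`, `phi`) and the MDP sizes are illustrative assumptions.

```python
import random


def value_iteration(P, R, gamma, iters=2000):
    """Return the optimal Q-table of a tabular MDP.

    P[s][a] is a list of next-state probabilities, R[s][a] the expected reward.
    """
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        V = [max(q) for q in Q]  # greedy state values
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(n_actions)] for s in range(n_states)]
    return Q


def greedy_policy(Q):
    """Pick the highest-valued action in every state."""
    return [max(range(len(q)), key=q.__getitem__) for q in Q]


def shape_reward(P, R, gamma, phi):
    """Potential-based shaping: R'(s,a) = R(s,a) + gamma * E[phi(s')] - phi(s)."""
    return [[R[s][a] + gamma * sum(p * f for p, f in zip(P[s][a], phi)) - phi[s]
             for a in range(len(R[s]))] for s in range(len(R))]


def random_mdp(n_states, n_actions, rng):
    """Hypothetical toy MDP with random transitions and rewards in [0, 1)."""
    P, R = [], []
    for _ in range(n_states):
        rows = []
        for _ in range(n_actions):
            w = [rng.random() for _ in range(n_states)]
            total = sum(w)
            rows.append([x / total for x in w])  # normalize to a distribution
        P.append(rows)
        R.append([rng.random() for _ in range(n_actions)])
    return P, R


rng = random.Random(0)
gamma = 0.9
P, R = random_mdp(4, 3, rng)
phi = [rng.uniform(-5, 5) for _ in range(4)]  # arbitrary potential function
R_shaped = shape_reward(P, R, gamma, phi)

policy = greedy_policy(value_iteration(P, R, gamma))
policy_shaped = greedy_policy(value_iteration(P, R_shaped, gamma))
print(policy == policy_shaped)  # the two distinct rewards share an optimal policy
```

Shaping shifts every optimal Q-value in state `s` by `-phi[s]`, so the ranking of actions within a state is unchanged. An observer who sees only the expert's policy in this single environment therefore cannot tell `R` from `R_shaped`, which is the ambiguity the abstract refers to.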

Organizer

NeurIPS 2022
