Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor · Reinforcement Learning in Reward-Mixing MDPs · SlidesLive

Reinforcement Learning in Reward-Mixing MDPs

December 6, 2021

Speakers

Jeongyeol Kwon

Yonathan Efroni

Constantine Caramanis

About the presentation

Learning a near optimal policy in a partially observable system remains an elusive challenge in contemporary reinforcement learning. In this work, we consider episodic reinforcement learning in a reward-mixing Markov decision process (MDP). There, a reward function is drawn from one of M possible reward models at the beginning of every episode, but the identity of the chosen reward model is not revealed to the agent. Hence, the latent state space, for which the dynamics are Markovian, is not giv…
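The setting described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' construction: the class name `RewardMixingMDP`, the helper `run_episode`, the deterministic transition table shared by all reward models, the Bernoulli rewards, and the uniform draw over the M models are all assumptions made for illustration. The only property taken from the abstract is that one reward model is drawn at the start of each episode and its identity is never shown to the agent.

```python
import random


class RewardMixingMDP:
    """Toy episodic environment with a hidden per-episode reward model."""

    def __init__(self, num_states, num_actions, num_models, horizon, seed=0):
        self.rng = random.Random(seed)
        self.num_states = num_states
        self.num_actions = num_actions
        self.num_models = num_models
        self.horizon = horizon
        # Assumption: one deterministic transition table shared by all models.
        self.next_state = [
            [self.rng.randrange(num_states) for _ in range(num_actions)]
            for _ in range(num_states)
        ]
        # Assumption: Bernoulli rewards, reward_prob[m][s][a] = P(reward = 1).
        self.reward_prob = [
            [[self.rng.random() for _ in range(num_actions)]
             for _ in range(num_states)]
            for _ in range(num_models)
        ]
        self._model = None
        self._state = None
        self._t = 0

    def reset(self):
        # Draw the latent reward model once per episode (uniformly, by
        # assumption). It is stored privately and never returned.
        self._model = self.rng.randrange(self.num_models)
        self._state = 0
        self._t = 0
        return self._state

    def step(self, action):
        # The reward depends on the hidden model; the agent sees only
        # the next state, the sampled reward, and the done flag.
        p = self.reward_prob[self._model][self._state][action]
        reward = 1 if self.rng.random() < p else 0
        self._state = self.next_state[self._state][action]
        self._t += 1
        done = self._t >= self.horizon
        return self._state, reward, done


def run_episode(env, policy):
    """Roll out one episode and return the observed (state, action, reward) triples."""
    state = env.reset()
    trajectory = []
    done = False
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory
```

Calling `run_episode(RewardMixingMDP(3, 2, 2, 5), lambda s: 0)` returns five observed triples. Nothing in the returned trajectory names the reward model in force, so the agent can only infer it from the rewards it sees.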

Organizer

NeurIPS 2021

About the organizer (NeurIPS 2021)

Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.

Recommended videos

Presentations on a similar topic, category, or speaker

Open Data Sharing and Indigenous Genomic Data Governance · 32:29 · NeurIPS 2021

Sample Selection for Fair and Robust Training · 13:44 · NeurIPS 2021

Learning to Synthesize Programs as Interpretable and Generalizable Policies · 18:14 · Dweep Trivedi, … · NeurIPS 2021

Predicting Molecular Conformation via Dynamic Graph Score Matching · 08:37 · Shitong Luo, … · NeurIPS 2021

Continual Density Ratio Estimation · 05:47 · NeurIPS 2021

GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction · 04:28 · Longyuan Li, … · NeurIPS 2021