Dec 2, 2022
We study the problem of model-free reinforcement learning, which is often solved following the principle of Generalized Policy Iteration (GPI). While GPI is typically an interplay between policy evaluation and policy improvement, most conventional model-free methods with function approximation treat the two GPI steps as independent, despite the inherent connections between them. In this paper, we present a method that attempts to eliminate the inconsistency between the policy evaluation step and the policy improvement step, leading to a conflict-averse GPI solution with gradient-based function approximation. Our method is central to balancing exploitation and exploration between policy-based and value-based methods, and is applicable to existing policy-based and value-based methods. We conduct extensive experiments to examine the theoretical properties of our method empirically and demonstrate its effectiveness on the Atari 200M benchmark.
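To make the idea of a conflict-averse combination of the two GPI steps concrete, here is a minimal sketch in Python. It assumes a PCGrad-style projection: when the policy-evaluation gradient and the policy-improvement gradient have a negative inner product, the conflicting component is projected out before the two are averaged. The function name `conflict_averse_update` and the toy gradients are hypothetical illustrations, not the paper's actual combination rule.

```python
import numpy as np

def conflict_averse_update(g_eval: np.ndarray, g_improve: np.ndarray) -> np.ndarray:
    """Combine policy-evaluation and policy-improvement gradients.

    If the two gradients conflict (negative inner product), project the
    improvement gradient onto the normal plane of the evaluation gradient
    before averaging, so neither GPI step undoes the other. This is a
    PCGrad-style projection used purely for illustration; the paper's
    actual update may differ.
    """
    dot = g_eval @ g_improve
    if dot < 0:  # gradients point in conflicting directions
        g_improve = g_improve - (dot / (g_eval @ g_eval)) * g_eval
    return 0.5 * (g_eval + g_improve)

# Toy usage with hypothetical gradients over shared parameters.
g_eval = np.array([1.0, 0.0])
g_improve = np.array([-0.5, 1.0])  # conflicts with g_eval along the first axis
print(conflict_averse_update(g_eval, g_improve))  # -> [0.5 0.5]
```

Under this sketch, the combined direction never decreases the objective of either step to first order, which is one way to read "eliminating the inconsistency" between evaluation and improvement.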