Kamyar Ghasemipour, Shixiang Gu, Ofir Nachum · Why So Pessimistic? Estimating uncertainties for offline RL through Ensembles, and why their independence matters · SlidesLive

Nov 28, 2022

Speakers

Kamyar Ghasemipour

Shixiang Gu

Ofir Nachum

About

Motivated by the success of ensembles for uncertainty estimation in supervised learning, we take a renewed look at how ensembles of Q-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL). We begin by identifying a critical flaw in a popular algorithmic choice used by many ensemble-based RL algorithms, namely the use of shared pessimistic target values when computing each ensemble member’s Bellman error. Through theoretical analyses and constructio…
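The distinction the abstract draws between shared pessimistic targets and independent ones can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice of a minimum over ensemble members as the pessimistic value, and the toy numbers are all assumptions made for illustration. It only shows how the Bellman targets differ when every member regresses to one shared value versus when each member bootstraps from its own estimate.

```python
from typing import List

Matrix = List[List[float]]  # shape: (ensemble members, batch)


def shared_pessimistic_targets(rewards: List[float], next_q: Matrix, gamma: float) -> Matrix:
    """Every member regresses to the SAME target, built from a pessimistic
    aggregate (here assumed to be the minimum) over the ensemble."""
    n_members = len(next_q)
    batch = len(rewards)
    # Pessimistic next-state value per transition: min across ensemble members.
    pessimistic_next = [
        min(next_q[i][b] for i in range(n_members)) for b in range(batch)
    ]
    shared = [rewards[b] + gamma * pessimistic_next[b] for b in range(batch)]
    # Each member receives an identical copy of the shared target.
    return [list(shared) for _ in range(n_members)]


def independent_targets(rewards: List[float], next_q: Matrix, gamma: float) -> Matrix:
    """Each member bootstraps only from its OWN next-state estimate."""
    return [
        [r + gamma * q for r, q in zip(rewards, member_q)]
        for member_q in next_q
    ]


# Hypothetical toy data: 3 ensemble members, 2 transitions.
rewards = [1.0, 0.0]
next_q = [[2.0, 4.0], [3.0, 1.0], [5.0, 2.0]]

shared = shared_pessimistic_targets(rewards, next_q, gamma=0.5)
indep = independent_targets(rewards, next_q, gamma=0.5)
```

With shared targets, all rows of `shared` are identical, so the members are pulled toward the same values; with independent targets, the rows of `indep` differ, so disagreement between members is preserved.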

Organizer

NeurIPS 2022
