Qisen Yang, Shenzhi Wang, Matthieu Gaetan Lin, Shiji Song, Gao Huang · Boosting Offline Reinforcement Learning with Action Preference Query · SlidesLive

Boosting Offline Reinforcement Learning with Action Preference Query

Jul 24, 2023

Speakers

Qisen Yang

Shenzhi Wang

Matthieu Gaetan Lin

About

Training practical agents usually involves both offline and online reinforcement learning (RL) to balance the policy's performance against interaction costs. In particular, online fine-tuning has become a common way to correct the erroneous estimates of out-of-distribution data learned in the offline training phase. However, even limited online interaction can be inaccessible or catastrophic in high-stakes scenarios such as healthcare and autonomous driving. In this work, we introduce an interacti…

Organizer

ICML 2023
