Joan Bas Serrano, Sebastian Curi, Andreas Krause, Gergely Neu · Oral: Logistic Q-Learning · SlidesLive

Oral: Logistic Q-Learning

Apr 14, 2021

Speakers

Joan Bas Serrano

Sebastian Curi

Andreas Krause

About

We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs. The method is closely related to the classic Relative Entropy Policy Search (REPS) algorithm of Peters et al. (2010), with the key difference that our method introduces a Q-function that enables efficient exact model-free implementation. The main feature of our algorithm (called Q-REPS) is a convex loss function for policy evaluation that serves as a theoretical…
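
The abstract is cut off before it defines the loss, so the following is only an illustrative sketch of what a convex, log-sum-exp ("logistic") policy-evaluation loss over Bellman errors can look like in a small tabular setting. It is not taken from the talk: the function names `soft_value` and `logistic_bellman_loss`, the parameters `eta` and `alpha`, the uniform action prior, and the exact form of the objective are all assumptions made for illustration.

```python
import numpy as np


def soft_value(q_row, alpha):
    """Soft (log-sum-exp) value of one state under a uniform action prior.

    Assumed form: (1/alpha) * log( mean_a exp(alpha * q(a)) ).
    """
    z = alpha * q_row
    m = z.max()  # subtract the max for numerical stability
    return (m + np.log(np.mean(np.exp(z - m)))) / alpha


def logistic_bellman_loss(q, transitions, initial_states, eta, alpha, gamma):
    """Hypothetical logistic aggregate of sampled Bellman errors.

    q              : array of shape (n_states, n_actions), tabular Q-values
    transitions    : list of (state, action, reward, next_state) samples
    initial_states : list of sampled initial states
    eta, alpha     : positive regularization parameters (assumed names)
    gamma          : discount factor in [0, 1)
    """
    # Bellman error of each sample, using the soft value at the next state.
    errors = np.array([
        r + gamma * soft_value(q[s_next], alpha) - q[s, a]
        for s, a, r, s_next in transitions
    ])
    # (1/eta) * log-mean-exp of the errors, computed stably.
    z = eta * errors
    m = z.max()
    log_mean_exp = (m + np.log(np.mean(np.exp(z - m)))) / eta
    # Add the discounted-weight soft value at the initial states.
    v0 = np.mean([soft_value(q[s], alpha) for s in initial_states])
    return log_mean_exp + (1.0 - gamma) * v0
```

In this sketch each Bellman error is a convex function of the Q-table (a log-sum-exp term minus a linear term), and a log-mean-exp of convex functions is again convex, so the loss can be minimized with standard convex optimization tools. Whether this matches the objective presented in the talk cannot be confirmed from the truncated abstract.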

Organizer

AISTATS 2021

Categories

Mathematics

AI & Data Science

About AISTATS 2021

The 24th International Conference on Artificial Intelligence and Statistics was held virtually from Tuesday, 13 April 2021 to Thursday, 15 April 2021.
