            Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes

            Jul 24, 2023

            Speakers

Runlong Zhou

Ruosong Wang

Simon Shaolei Du

            About

We study regret minimization for reinforcement learning (RL) in Latent Markov Decision Processes (LMDPs) with context in hindsight. We design a novel model-based algorithmic framework which can be instantiated with both a model-optimistic and a value-optimistic solver. We prove an Õ(√(Var^⋆ M Γ S A K)) regret bound, where Õ hides logarithmic factors, M is the number of contexts, S is the number of states, A is the number of actions, K is the number of episodes, Γ ≤ S is the maximum transition…
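The phrase "context in hindsight" means that in each episode a latent context m ∈ {1, …, M} is drawn and kept hidden while the learner acts, and is revealed only after the episode ends. The Python sketch below illustrates just this interaction protocol; every class, function, and parameter name in it (LMDP, run_episode, policy, …) is an illustrative assumption, not an identifier from the paper or its code.

```python
import numpy as np

# Minimal sketch of the LMDP "context in hindsight" interaction protocol,
# assuming a tabular LMDP with M latent contexts, S states, A actions, and
# a fixed horizon. All names here are hypothetical.

class LMDP:
    def __init__(self, transitions, rewards, mixing, seed=0):
        # transitions: (M, S, A, S) row-stochastic array
        # rewards:     (M, S, A) array of per-step rewards
        # mixing:      length-M distribution over latent contexts
        self.P = transitions
        self.R = rewards
        self.mixing = mixing
        self.rng = np.random.default_rng(seed)

    def run_episode(self, policy, horizon, start_state=0):
        """Play one episode. The latent context m is drawn at the start,
        hidden while acting, and returned only afterwards (in hindsight)."""
        m = self.rng.choice(len(self.mixing), p=self.mixing)
        s, total_reward, trajectory = start_state, 0.0, []
        for h in range(horizon):
            a = policy(s, h)                       # policy cannot depend on m
            r = self.R[m, s, a]
            s_next = self.rng.choice(self.P.shape[-1], p=self.P[m, s, a])
            trajectory.append((s, a, r))
            total_reward += r
            s = s_next
        return trajectory, total_reward, m         # m is revealed only here
```

Because the context is revealed after each of the K episodes, a model-based learner can maintain per-context empirical estimates of the transitions and rewards and plan optimistically against them, which is the style of framework the abstract describes; the paper's specific solvers and bonus terms are not reproduced here.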

            Organizer


            ICML 2023



            Recommended Videos

            Presentations on similar topic, category or speaker

D2Match: Leveraging Deep Learning and Degeneracy for Subgraph Matching
05:04 · Xuanzhou Liu, … · ICML 2023

Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
05:02 · Yitan Wang, … · ICML 2023

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems
05:18 · Chawin Sitawarin, … · ICML 2023

Uncertain Evidence in Probabilistic Models and Stochastic Simulators
06:05 · Andreas Munk, … · ICML 2023

Structure in Monge Maps by Engineering Costs
47:56 · Marco Cuturi · ICML 2023

The Benefits of Model-Based Generalization in Reinforcement Learning
04:19 · Kenny Young, … · ICML 2023