Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang · Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function · SlidesLive

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function

Dec 2, 2022

Speakers

Ruijie Zheng

Xiyao Wang

Huazhe Xu

About

Probabilistic dynamics model ensembles are widely used in existing model-based reinforcement learning methods because they outperform a single dynamics model in both asymptotic performance and sample efficiency. In this paper, we provide both practical and theoretical insights into the empirical success of the probabilistic dynamics model ensemble through the lens of Lipschitz continuity. We find that, for a value function, the stronger the Lipschitz condition is, the smaller the gap between the true dyna…
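
For readers unfamiliar with the term: a value function V is Lipschitz continuous with constant K if |V(s) - V(s')| <= K * ||s - s'|| for all states s and s'. The sketch below is not taken from the talk or the paper. It is a minimal illustration, under assumed names (`empirical_lipschitz_constant` and `toy_value` are hypothetical), of how one might estimate a lower bound on K from sampled states by taking the largest observed ratio over all pairs.

```python
import math
import random
from itertools import combinations
from typing import Callable, Sequence

State = Sequence[float]


def euclidean_distance(a: State, b: State) -> float:
    """Euclidean norm of the difference between two states."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def empirical_lipschitz_constant(
    value_fn: Callable[[State], float], states: Sequence[State]
) -> float:
    """Largest observed |V(s) - V(s')| / ||s - s'|| over all state pairs.

    This is only a lower bound on the true Lipschitz constant, because
    it depends on which states happen to be sampled.
    """
    best = 0.0
    for s, s_prime in combinations(states, 2):
        dist = euclidean_distance(s, s_prime)
        if dist == 0.0:
            continue  # identical states carry no information about K
        ratio = abs(value_fn(s) - value_fn(s_prime)) / dist
        best = max(best, ratio)
    return best


def toy_value(state: State) -> float:
    """Hypothetical linear value function V(s) = 3*s0 - 4*s1.

    Its true Lipschitz constant under the Euclidean norm is the
    gradient norm, sqrt(3^2 + 4^2) = 5.
    """
    return 3.0 * state[0] - 4.0 * state[1]


rng = random.Random(0)  # fixed seed so the sample is reproducible
sampled_states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(50)]
estimate = empirical_lipschitz_constant(toy_value, sampled_states)
print(f"estimated Lipschitz constant: {estimate:.3f} (true value is 5)")
```

For the toy function the estimate can never meaningfully exceed 5, and it reaches 5 only when some sampled pair differs along the gradient direction (3, -4). For a learned value network the same pairwise ratio gives only a sample-dependent lower bound.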

Organizer

NeurIPS 2022
