Jan Robine, Marc Höftmann, Tobias Uelwer, Stefan Harmeling · Transformer-based World Models Are Happy With 100k Interactions · SlidesLive

Transformer-based World Models Are Happy With 100k Interactions

Dec 2, 2022

Speakers

Jan Robine

Marc Höftmann

Tobias Uelwer

About

Deep neural networks have been successful in many reinforcement learning settings. However, compared to human learners they are overly data hungry. To build a sample-efficient world model, we apply a transformer to real-world episodes in an autoregressive manner: not only the compact latent states and the taken actions but also the experienced or predicted rewards are fed into the transformer, so that it can attend flexibly to all three modalities at different time steps. The transformer allows…
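
The input layout described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-step token order (state, action, reward), the shared embedding width, the function names, and the projection-free single-head attention are all assumptions made for illustration. It only shows how three modalities can be interleaved into one sequence so that a causally masked transformer can attend to any of them at earlier time steps.

```python
import numpy as np


def interleave_modalities(states, actions, rewards):
    """Interleave per-step embeddings as [z_1, a_1, r_1, z_2, a_2, r_2, ...].

    Each input has shape (T, D); the result has shape (3 * T, D).
    The order within a time step is an assumption of this sketch.
    """
    T, D = states.shape
    assert actions.shape == (T, D) and rewards.shape == (T, D)
    tokens = np.stack([states, actions, rewards], axis=1)  # (T, 3, D)
    return tokens.reshape(3 * T, D)


def causal_mask(num_tokens):
    """Boolean mask: entry (i, j) is True if token i may attend to token j."""
    return np.tril(np.ones((num_tokens, num_tokens), dtype=bool))


def causal_self_attention(tokens):
    """Single-head attention without learned projections (illustration only)."""
    n, d = tokens.shape
    scores = tokens @ tokens.T / np.sqrt(d)
    # Block attention to future tokens so the model stays autoregressive.
    scores = np.where(causal_mask(n), scores, -np.inf)
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D = 4, 8  # hypothetical sequence length and embedding width
    z = rng.normal(size=(T, D))  # stand-in latent state embeddings
    a = rng.normal(size=(T, D))  # stand-in action embeddings
    r = rng.normal(size=(T, D))  # stand-in reward embeddings
    seq = interleave_modalities(z, a, r)
    out, attn = causal_self_attention(seq)
    print(seq.shape, out.shape, attn.shape)
```

In this layout, the token at position 3t is the state at step t, so its attention row covers every state, action, and reward from earlier steps, which is the flexibility the abstract refers to.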

Organizer

NeurIPS 2022
