Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal · Parallel Q-Learning: a Scheme for Time-efficient Reinforcement Learning · SlidesLive


Parallel Q-Learning: a Scheme for Time-efficient Reinforcement Learning

July 24, 2023

Speakers

Zechu Li

Tao Chen

Zhang-Wei Hong

About the presentation

Reinforcement learning algorithms take a long time to learn policies on complex tasks because they need a large amount of training data. With recent advances in GPU-based simulation, such as Isaac Gym, data collection has been sped up thousands of times on a commodity GPU. Most prior works have used on-policy methods such as PPO to train policies because they are simple and easy to scale. Off-policy methods are usually more sample-efficient but more challenging to scale up, r…

Organizer

ICML 2023
