Mudit Gaur, Vaneet Aggarwal, Mridul Aggarwal · On The Global Convergence of Reinforcement Learning Algorithms With Neural Network Parametrization

July 24, 2023

Speakers

Mudit Gaur

Vaneet Aggarwal

Mridul Aggarwal

About the presentation

Deep Q-learning based algorithms have been applied successfully in many decision making problems, while their theoretical foundations are not as well understood. In this paper, we study a Fitted Q-Iteration with two-layer ReLU neural network parameterization, and find the sample complexity guarantees for the algorithm. Our approach estimates the Q-function in each iteration using a convex optimization problem. We show that this approach achieves a sample complexity of 𝒪̃(1/ϵ^2), which is order-…
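
To make the Fitted Q-Iteration loop mentioned in the abstract concrete, here is a minimal sketch in Python with NumPy. It rests on a simplifying assumption that is not taken from the paper: the hidden layer of the two-layer ReLU network is random and frozen, so the regression solved in each iteration reduces to a ridge-regularized least-squares problem over the output weights, which is convex. This is a stand-in for illustration, not the convex formulation the authors analyze. The function name `fitted_q_iteration`, its parameters, and the toy two-state chain are all hypothetical.

```python
import numpy as np


def fitted_q_iteration(S, A, R, S_next, n_actions, gamma=0.9,
                       n_iters=200, width=64, ridge=1e-6, seed=0):
    """Fitted Q-Iteration on a fixed batch of transitions (S, A, R, S_next).

    Assumption (not from the paper): the first layer of the two-layer ReLU
    network is random and frozen, so each iteration is a ridge regression.
    """
    rng = np.random.default_rng(seed)
    in_dim = S.shape[1] + n_actions
    W = rng.normal(size=(in_dim, width))   # frozen hidden weights
    b = rng.normal(size=width)             # frozen hidden biases

    def features(states, actions):
        # Input is the state concatenated with a one-hot action.
        x = np.hstack([states, np.eye(n_actions)[actions]])
        return np.maximum(0.0, x @ W + b)  # ReLU hidden layer

    phi = features(S, A)
    # Features of each next state paired with every action:
    # shape (n_actions, n_samples, width).
    phi_next = np.stack([
        features(S_next, np.full(len(S_next), a)) for a in range(n_actions)
    ])
    gram = phi.T @ phi + ridge * np.eye(width)
    theta = np.zeros(width)                # output-layer weights, Q_0 = 0

    for _ in range(n_iters):
        # Bellman targets: r + gamma * max_a' Q_k(s', a').
        targets = R + gamma * (phi_next @ theta).max(axis=0)
        # Convex step: ridge-regularized least squares for the output layer.
        theta = np.linalg.solve(gram, phi.T @ targets)

    return lambda states, actions: features(states, actions) @ theta


# Toy two-state chain (hypothetical): action a moves to state a, reward = a.
S = np.array([[0.0], [0.0], [1.0], [1.0]])
A = np.array([0, 1, 0, 1])
q = fitted_q_iteration(S, A, R=A.astype(float),
                       S_next=A[:, None].astype(float), n_actions=2)
print(q(S, A))
```

For this toy chain with a discount of 0.9, the exact optimal values are 10 for action 1 and 9 for action 0 in both states, which gives a reference point for the printed estimates. Freezing the hidden layer keeps the per-iteration problem convex at the cost of expressiveness; it says nothing about the sample complexity result stated above.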

Organizer

ICML 2023
