
            Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation

Dec 6, 2021

Speakers

Yue Wang

Speaker · 1 follower

Shaofeng Zou

Speaker · 0 followers

Yi Zhou

Speaker · 0 followers

About

Temporal-difference learning with gradient correction (TDC) is a two time-scale algorithm for policy evaluation in reinforcement learning. The algorithm was originally proposed with linear function approximation and was later extended to general smooth function approximation. The asymptotic convergence for the on-policy setting with general smooth function approximation was established in [Bhatnagar et al., 2009]; however, the non-asymptotic convergence analysis remains unsolved du…
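The abstract only summarizes the algorithm, so as a point of reference, below is a minimal sketch of the classical two time-scale TDC update with linear function approximation (the setting in which TDC was originally proposed), not the general smooth-function variant analyzed in the talk. The environment interface env_step, the feature map phi, and the step-size schedules are assumptions made for illustration.

```python
import numpy as np

def tdc_linear(env_step, phi, s0, num_steps, d, gamma=0.99):
    """Sketch of two time-scale TDC with linear features phi(s) in R^d.

    env_step(s) -> (reward, next_state) is a hypothetical interface that
    samples one on-policy transition from the Markov chain.
    """
    theta = np.zeros(d)  # slow iterate: value-function parameters
    w = np.zeros(d)      # fast iterate: auxiliary gradient-correction weights
    s = s0
    for t in range(num_steps):
        # Two time-scale step sizes (assumed schedules): the fast step size
        # beta_t decays more slowly than the slow step size alpha_t.
        alpha = 1.0 / (t + 1)
        beta = 1.0 / (t + 1) ** (2.0 / 3.0)

        r, s_next = env_step(s)
        f, f_next = phi(s), phi(s_next)
        delta = r + gamma * theta @ f_next - theta @ f  # TD error

        # Slow time-scale: TD update plus the gradient-correction term.
        theta = theta + alpha * (delta * f - gamma * (f @ w) * f_next)
        # Fast time-scale: w tracks a linear regression of the TD error
        # onto the features.
        w = w + beta * (delta - f @ w) * f

        s = s_next
    return theta
```

The non-asymptotic analysis in the talk concerns how quickly iterates of this kind converge when the linear model theta @ phi(s) is replaced by a general smooth approximator of the value function.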

Organizer

            NeurIPS 2021

            Account · 1.9k followers

About NeurIPS 2021

            Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.

Recommended Videos

Presentations with a similar topic, category, or speaker

Entropic estimation of optimal transport maps
16:00
Aram-Alexandre Pooladian, …
NeurIPS 2021 · 3 years ago

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation
11:20
Jungbeom Lee, …
NeurIPS 2021 · 3 years ago

Moser Flow: Divergence-based Generative Modeling on Manifolds
17:51
Noam Rozen, …
NeurIPS 2021 · 3 years ago

PatchGame: Learning to Signal Mid-level Patches in Referential Games
09:28
Kamal Gupta, …
NeurIPS 2021 · 3 years ago

Algorithmic Instabilities of Accelerated Gradient Descent
08:07
Amit Attia, …
NeurIPS 2021 · 3 years ago

Communication-efficient SGD: From Local SGD to One-Shot Averaging
14:57
Artin Spiridonoff, …
NeurIPS 2021 · 3 years ago

Interested in talks like this? Follow NeurIPS 2021