
            Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation

Dec 6, 2021

Speakers

Yue Wang

Speaker · 1 follower

Shaofeng Zou

Speaker · 0 followers

Yi Zhou

Speaker · 0 followers

About

Temporal-difference learning with gradient correction (TDC) is a two time-scale algorithm for policy evaluation in reinforcement learning. The algorithm was originally proposed with linear function approximation and was later extended to general smooth function approximation. The asymptotic convergence for the on-policy setting with general smooth function approximation was established in [Bhatnagar et al., 2009]; however, the non-asymptotic convergence analysis remains unsolved du…
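The abstract only summarizes the algorithm, so as a point of reference, below is a minimal sketch of the classical two time-scale TDC update with linear function approximation (the setting in which TDC was originally proposed), not the general smooth-function variant analyzed in the talk. The environment interface env_step, the feature map phi, and the step-size schedules are assumptions made for illustration.

```python
import numpy as np

def tdc_linear(env_step, phi, s0, num_steps, d, gamma=0.99):
    """Sketch of two time-scale TDC with linear features phi(s) in R^d.

    env_step(s) -> (reward, next_state) is a hypothetical interface that
    samples one on-policy transition from the Markov chain.
    """
    theta = np.zeros(d)  # slow iterate: value-function parameters
    w = np.zeros(d)      # fast iterate: auxiliary gradient-correction weights
    s = s0
    for t in range(num_steps):
        # Two time-scale step sizes (assumed schedules): the fast step size
        # beta_t decays more slowly than the slow step size alpha_t.
        alpha = 1.0 / (t + 1)
        beta = 1.0 / (t + 1) ** (2.0 / 3.0)

        r, s_next = env_step(s)
        f, f_next = phi(s), phi(s_next)
        delta = r + gamma * theta @ f_next - theta @ f  # TD error

        # Slow time-scale: TD update plus the gradient-correction term.
        theta = theta + alpha * (delta * f - gamma * (f @ w) * f_next)
        # Fast time-scale: w tracks a linear regression of the TD error
        # onto the features.
        w = w + beta * (delta - f @ w) * f

        s = s_next
    return theta
```

The non-asymptotic analysis in the talk concerns how quickly iterates of this kind converge when the linear model theta @ phi(s) is replaced by a general smooth approximator of the value function.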

Organizer

            NeurIPS 2021

            Account · 1.9k followers

About NeurIPS 2021

            Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.

Recommended Videos

Presentations with a similar topic, category, or speaker

Entropic estimation of optimal transport maps
16:00
Aram-Alexandre Pooladian, …
NeurIPS 2021 · 3 years ago

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation
11:20
Jungbeom Lee, …
NeurIPS 2021 · 3 years ago

Moser Flow: Divergence-based Generative Modeling on Manifolds
17:51
Noam Rozen, …
NeurIPS 2021 · 3 years ago

PatchGame: Learning to Signal Mid-level Patches in Referential Games
09:28
Kamal Gupta, …
NeurIPS 2021 · 3 years ago

Algorithmic Instabilities of Accelerated Gradient Descent
08:07
Amit Attia, …
NeurIPS 2021 · 3 years ago

Communication-efficient SGD: From Local SGD to One-Shot Averaging
14:57
Artin Spiridonoff, …
NeurIPS 2021 · 3 years ago

Interested in talks like this? Follow NeurIPS 2021