            DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm for RL

            Jul 24, 2023

            Speakers

            Yunhao Tang

            Tadashi Kozuno

            Mark Rowland

            About

            Multi-step learning applies lookahead over multiple time steps and has proved valuable in policy evaluation settings. However, in the optimal control case, the impact of multi-step learning has been relatively limited despite a number of prior efforts. Fundamentally, this might be because multi-step policy improvements require operations that cannot be approximated by stochastic samples, hence hindering the widespread adoption of such methods in practice. To address such limitations, we introduc…
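
To make the "lookahead over multiple time steps" in the policy evaluation setting concrete, here is a minimal sketch of a standard n-step return: the discounted sum of several observed rewards plus a discounted bootstrap estimate of the value at the state reached afterwards. The function name `n_step_return` and its parameters are illustrative assumptions, not taken from the talk, and this shows only the generic multi-step evaluation target, not the DoMo-AC algorithm itself.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Generic n-step return (hypothetical helper, not from the talk).

    Computes r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * V,
    where V is `bootstrap_value`, the value estimate at the state reached
    after the n rewards in `rewards`.
    """
    g = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# One-step lookahead uses a single reward before bootstrapping;
# two-step lookahead uses two observed rewards.
one_step = n_step_return([1.0], bootstrap_value=10.0, gamma=0.5)
two_step = n_step_return([1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

With more rewards in the list, the target relies more on observed rewards and less on the bootstrap estimate, which is the sense in which multi-step targets look further ahead.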

            Organizer

            ICML 2023
