            GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values

            Jul 12, 2020

            Speakers

Shangtong Zhang

Bo Liu

Shimon Whiteson

            About

            We present GradientDICE for estimating the density ratio between the state distribution of the target policy and the sampling distribution in off-policy reinforcement learning. GradientDICE fixes several problems of GenDICE (Zhang et al., 2020), the current state-of-the-art for estimating such density ratios. Namely, the optimization problem in GenDICE is not a convex-concave saddle-point problem once nonlinearity in optimization variable parameterization is introduced to ensure positivity, so p…
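To make the quantity being estimated concrete: once a density ratio τ(s, a) ≈ d_π(s, a) / d_μ(s, a) is available, off-policy quantities under the target policy π can be estimated from behavior-policy samples by reweighting. The sketch below is a toy illustration of that reweighting step only (all names and the synthetic data are illustrative assumptions, not the GradientDICE algorithm itself, which concerns how τ is learned):

```python
import numpy as np

# Toy sketch: density-ratio-weighted off-policy evaluation.
# The data and the ratio estimates here are synthetic placeholders;
# in practice tau would come from an estimator such as GradientDICE.
rng = np.random.default_rng(0)

N = 10_000
rewards = rng.normal(loc=1.0, scale=0.5, size=N)  # r(s, a) for logged transitions
tau = rng.uniform(0.5, 1.5, size=N)               # estimated ratio d_pi / d_mu

# Off-policy estimate of the target policy's average reward:
#   J(pi) = E_{d_mu}[ tau(s, a) * r(s, a) ]
j_pi = np.mean(tau * rewards)

# A common self-normalized variant divides by the mean ratio,
# which reduces variance when tau is noisy:
j_pi_sn = np.sum(tau * rewards) / np.sum(tau)
print(j_pi, j_pi_sn)
```

Because the synthetic ratios average to 1, both estimates land near the behavior-policy mean reward of 1.0; with a real estimator the weights shift mass toward states the target policy actually visits.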

            Organizer

ICML 2020

            Categories

Mathematics

AI & Data Science

            About ICML 2020

            The International Conference on Machine Learning (ICML) is the premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence known as machine learning. ICML is globally renowned for presenting and publishing cutting-edge research on all aspects of machine learning used in closely related areas like artificial intelligence, statistics and data science, as well as important application areas such as machine vision, computational biology, speech recognition, and robotics. ICML is one of the fastest growing artificial intelligence conferences in the world. Participants at ICML span a wide range of backgrounds, from academic and industrial researchers, to entrepreneurs and engineers, to graduate students and postdocs.


            Recommended Videos

            Presentations on similar topic, category or speaker

Semismooth Newton Algorithm for Efficient Projections onto ℓ1,∞-norm Ball
13:54 · Dejun Chu, … · ICML 2020

Towards Causal Reinforcement Learning - Part II
1:54:55 · Elias Bareinboim · ICML 2020

Goodness-of-Fit Tests for Inhomogeneous Random Graphs
12:05 · Soham Dan, … · ICML 2020

Optimal Robust Learning of Discrete Distributions from Batches
16:15 · Ayush Jain, … · ICML 2020

Visualizing Classification Structure in Large-Scale Classifiers
05:12 · Bilal Alsallakh, … · ICML 2020

An Off-policy Policy Gradient Theorem: A Tale About Weightings
08:49 · Martha White · ICML 2020
