            CASA: Bridging the Gap between Policy Improvement and Policy Evaluation with Conflict Averse Policy Iteration

            Dec 2, 2022

            Speakers

            Changnan Xiao
            Haosen Shi
            Jiajun Fan

            About

            We study the problem of model-free reinforcement learning, which is often solved following the principle of Generalized Policy Iteration (GPI). While GPI is typically an interplay between policy evaluation and policy improvement, most conventional model-free methods with function approximation assume that the GPI steps are independent, despite the inherent connections between them. In this paper, we present a method that attempts to eliminate the inconsistency between the policy evaluation step and po…
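
For readers unfamiliar with GPI, the alternation of policy evaluation and policy improvement mentioned above can be sketched on a tiny tabular problem. This is a generic textbook illustration, not the CASA method from the talk: the two-state MDP, the discount factor, and all function names below are hypothetical and chosen only for this example.

```python
GAMMA = 0.9  # assumed discount factor for this toy example

# Hypothetical deterministic MDP: TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (0, 0.0), 1: (1, 2.0)},
}


def q_value(state, action, values):
    """One-step lookahead value of taking `action` in `state`."""
    next_state, reward = TRANSITIONS[state][action]
    return reward + GAMMA * values[next_state]


def evaluate_policy(policy, sweeps=500):
    """Policy evaluation: repeated in-place Bellman backups for a fixed policy."""
    values = {s: 0.0 for s in TRANSITIONS}
    for _ in range(sweeps):
        for s in TRANSITIONS:
            values[s] = q_value(s, policy[s], values)
    return values


def improve_policy(values):
    """Policy improvement: act greedily with respect to the current values."""
    return {
        s: max(TRANSITIONS[s], key=lambda a: q_value(s, a, values))
        for s in TRANSITIONS
    }


def generalized_policy_iteration(max_rounds=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = {s: 0 for s in TRANSITIONS}
    values = evaluate_policy(policy)
    for _ in range(max_rounds):
        new_policy = improve_policy(values)
        if new_policy == policy:
            break
        policy = new_policy
        values = evaluate_policy(policy)
    return policy, values


if __name__ == "__main__":
    policy, values = generalized_policy_iteration()
    print(policy)
```

In this sketch the two steps are run as separate, exact procedures on a table. The abstract's concern is the function-approximation setting, where the two steps are typically treated as independent even though they are connected; the toy example does not model that setting.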

            Organizer

            NeurIPS 2022
