            FlowPG: Action-constrained Policy Gradient with Normalizing Flows

            Dec 10, 2023

            Speakers

            Janaka Chathuranga Brahmanage

            Jiajing Ling

            Akshat Kumar

            About

            Action-constrained reinforcement learning (ACRL) is a popular approach for solving safety-critical and resource-allocation decision-making problems. A major challenge in ACRL, however, is finding valid actions that satisfy the constraints at each RL step. Adding a projection layer on top of the original policy network is a commonly used approach, but it involves solving a mathematical program during training, during action execution, or both, which can result in…
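
To make the projection idea in the abstract concrete, here is a minimal sketch. It assumes the feasible action set is a simple L2-norm ball, which is chosen only because its projection has a closed form. The function name `project_onto_l2_ball`, the radius, and the sample action are illustrative assumptions and do not come from the talk. For general constraint sets the projection is a mathematical program that needs a solver, which is the cost the abstract refers to.

```python
import math


def project_onto_l2_ball(action, radius):
    """Return the point of {a : ||a||_2 <= radius} closest to `action`.

    Illustrative stand-in for a projection layer; the constraint set
    (an L2 ball) is an assumption made for this sketch.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    norm = math.sqrt(sum(x * x for x in action))
    if norm <= radius:
        # The action already satisfies the constraint; leave it unchanged.
        return list(action)
    # Otherwise rescale the action onto the boundary of the ball.
    scale = radius / norm
    return [x * scale for x in action]


# Hypothetical raw policy output that violates the constraint (norm 5).
raw_action = [3.0, 4.0]
safe_action = project_onto_l2_ball(raw_action, radius=1.0)
print(safe_action)  # direction preserved, norm reduced to the radius
```

Here the projection is a single rescaling. With linear or other general constraints, the same step would typically become a quadratic program solved at every call.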

            Organizer

            NeurIPS 2023
