Next
Livestream will start soon!
Livestream has already ended.
Presentation has not been recorded yet!
  • title: Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
      0:00 / 0:00
      • Report Issue
      • Settings
      • Playlists
      • Bookmarks
      • Subtitles Off
      • Playback rate
      • Quality
      • Settings
      • Debug information
      • Server sl-yoda-v2-stream-008-alpha.b-cdn.net
      • Subtitles size Medium
      • Bookmarks
      • Server
      • sl-yoda-v2-stream-008-alpha.b-cdn.net
      • sl-yoda-v2-stream-008-beta.b-cdn.net
      • 1159783934.rsc.cdn77.org
      • 1511376917.rsc.cdn77.org
      • Subtitles
      • Off
      • English
      • Playback rate
      • Quality
      • Subtitles size
      • Large
      • Medium
      • Small
      • Mode
      • Video Slideshow
      • Audio Slideshow
      • Slideshow
      • Video
      My playlists
        Bookmarks
          00:00:00
            Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
            • Settings
            • Sync diff
            • Quality
            • Settings
            • Server
            • Quality
            • Server

            Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

            Jul 24, 2023

            Sprecher:innen

            YW

            Yixuan Wang

            Sprecher:in · 0 Follower:innen

            SSZ

            Simon Sinong Zhan

            Sprecher:in · 0 Follower:innen

            RJ

            Ruochen Jiao

            Sprecher:in · 0 Follower:innen

            Über

            It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an unknown and stochastic environment under hard constraints that require the system state not to reach certain specified unsafe regions. Many popular safe RL methods such as those based on the Constrained Markov Decision Process (CMDP) paradigm formulate safety violations in a cost function and try to constrain the expectation of cumulative cost under a threshold. However, it is often difficult to effectively…

            Organisator

            I2
            I2

            ICML 2023

            Konto · 657 Follower:innen

            Gefällt euch das Format? Vertraut auf SlidesLive, um euer nächstes Event festzuhalten!

            Professionelle Aufzeichnung und Livestreaming – weltweit.

            Freigeben

            Empfohlene Videos

            Präsentationen, deren Thema, Kategorie oder Sprecher:in ähnlich sind

            Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration
            04:46

            Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration

            Blaise Delattre, …

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Invited Talk: Miles Brundage
            18:59

            Invited Talk: Miles Brundage

            Miles Brundage

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Vertical Federated Graph Neural Network for Recommender System
            04:59

            Vertical Federated Graph Neural Network for Recommender System

            Peihua Mai, …

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality
            04:08

            Weighted Tallying Bandits: Overcoming Intractability via Repeated Exposure Optimality

            Dhruv Malik, …

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Forget Unlearning: Towards True Data-Deletion in Machine Learning
            05:04

            Forget Unlearning: Towards True Data-Deletion in Machine Learning

            Rishav Chourasia, …

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions
            05:19

            Improved Analysis of Score-based Generative Modeling: User-Friendly Bounds under Minimal Smoothness Assumptions

            Hongrui Chen, …

            I2
            I2
            ICML 2023 2 years ago

            Ewigspeicher-Fortschrittswert: 0 = 0.0%

            Interessiert an Vorträgen wie diesem? ICML 2023 folgen