Nov 28, 2022
Speaker · 0 followers
Speaker · 0 followers
Speaker · 0 followers
Speaker · 1 follower
Speaker · 0 followers
Speaker · 0 followers
Speaker · 0 followers
We present the problem of reinforcement learning with exogenous termination. We define the Termination Markov Decision Process (TerMDP), an extension of the MDP framework, in which episodes may be interrupted by an external non-Markovian observer. This formulation accounts for numerous real-world situations, such as a human interrupting an autonomous driving agent for reasons of discomfort. We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds. We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret. Motivated by our theoretical analysis, we design and implement a scalable approach, which combines optimism (w.r.t. termination) and a dynamic discount factor, incorporating the termination probability. We deploy our method on high-dimensional driving and MinAtar benchmarks. Additionally, we test our approach on human data in a driving setting. Our results demonstrate fast convergence and significant improvement over various baseline approaches.We present the problem of reinforcement learning with exogenous termination. We define the Termination Markov Decision Process (TerMDP), an extension of the MDP framework, in which episodes may be interrupted by an external non-Markovian observer. This formulation accounts for numerous real-world situations, such as a human interrupting an autonomous driving agent for reasons of discomfort. We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide stat…
Account · 952 followers
Professional recording and live streaming, delivered globally.
Presentations on similar topic, category or speaker
Andrew Szot, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Bing Su, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Yujia Xie, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Yuefan Wu, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%