Jul 24, 2023
Despite being a fundamental building block for reinforcement learning, Markov decision processes (MDPs) often suffer from ambiguity in model parameters. Robust MDPs address this challenge by optimizing worst-case performance under ambiguity. While robust models can provide reliable policies with limited data, their optimal worst-case performances are often overly conservative and thus offer little practical insight into the actual performance of these reliable policies. This paper proposes robust satisficing MDPs (RSMDPs), in which the expected returns of feasible policies are soft-constrained to achieve a user-specified target under ambiguity. We derive a tractable reformulation for RSMDPs and develop a first-order method for solving large problems. Experimental results demonstrate that RSMDPs provide policies that achieve their targets, which are much higher than the optimal worst-case returns computed by robust MDPs. Moreover, the average and percentile performances of the proposed model are competitive with those of other models. We also demonstrate the scalability of the proposed algorithm compared with a state-of-the-art commercial solver.
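For intuition, here is a minimal sketch contrasting the classical robust MDP with the robust satisficing formulation described in the abstract. The notation is assumed for illustration and may differ from the paper's exact model: $\rho_P(\pi)$ denotes the expected return of policy $\pi$ under transition kernel $P$, $\hat{P}$ is the nominal (estimated) kernel, $\mathcal{U}$ an ambiguity set, $\tau$ the user-specified target, and $\Delta$ a discrepancy measure between kernels.

```latex
% Hedged sketch, not the paper's exact formulation.
% \pi: policy;  \rho_P(\pi): expected return under kernel P;
% \hat{P}: nominal kernel;  \mathcal{U}: ambiguity set;
% \tau: user-specified target;  \Delta: discrepancy between kernels.

% Robust MDP: maximize the worst-case return over the ambiguity set.
\max_{\pi} \; \min_{P \in \mathcal{U}} \; \rho_P(\pi)

% Robust satisficing MDP: minimize the fragility level k, softly
% constraining the return to miss the target \tau by no more than
% k times the deviation of the true kernel P from the nominal \hat{P}.
\min_{\pi,\, k \ge 0} \; k
\quad \text{s.t.} \quad
\rho_P(\pi) \;\ge\; \tau - k\,\Delta(P, \hat{P}) \qquad \forall P
```

In this reading, the robust model fixes the ambiguity set and maximizes the guaranteed floor, whereas the satisficing model fixes a target and seeks the policy whose shortfall grows most slowly as the true kernel deviates from the nominal one.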
Presentations on a similar topic, category, or speaker
Rahul Ramesh, …
Runzhe Wu, …
Will Dorrell, …