Dec 6, 2021
This work studies the statistical limits of uniform convergence for offline policy evaluation (OPE) problems with model-based methods (for episodic MDPs) and provides a unified framework towards optimal learning for several well-motivated offline tasks. Uniform OPE, sup_Π |Q^π - Q̂^π| < ϵ, is a stronger measure than point-wise OPE and ensures offline learning when Π contains all policies (the global class). In this paper, we establish an Ω(H^2 S/(d_m ϵ^2)) lower bound (over the model-based family) for global uniform OPE. Our main result establishes an upper bound of Õ(H^2/(d_m ϵ^2)) for local uniform convergence, which applies to all near-empirically optimal policies for MDPs with stationary transitions. Here d_m is the minimal marginal state-action probability. The key to achieving the optimal rate Õ(H^2/(d_m ϵ^2)) is our design of the singleton absorbing MDP, a new sharp analysis tool that works with the model-based approach. We generalize this model-based framework to two new settings, offline task-agnostic learning and offline reward-free learning, with optimal complexity Õ(H^2 log(K)/(d_m ϵ^2)) (K is the number of tasks) and Õ(H^2 S/(d_m ϵ^2)), respectively. These results provide a unified solution for simultaneously solving different offline RL problems.
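To make the quantities in the abstract concrete, the following is a minimal sketch of model-based OPE in a tabular episodic MDP with stationary transitions: estimate an empirical model from offline data, compute Q̂^π by backward induction, and take the supremum of |Q^π - Q̂^π| over a policy class. It is an illustration only, not the paper's algorithm or analysis. The function names (`estimate_model`, `evaluate_policy`, `uniform_error`), the data layout, the restriction to deterministic policies, and the choice to give unvisited state-action pairs zero reward and zero transition mass are all assumptions made here for the example.

```python
from itertools import product

def estimate_model(episodes, S, A):
    """Empirical stationary model from offline data (hypothetical layout:
    each episode is a list of (s, a, r, s_next) tuples)."""
    counts = [[[0] * S for _ in range(A)] for _ in range(S)]
    rew_sum = [[0.0] * A for _ in range(S)]
    n = [[0] * A for _ in range(S)]
    for traj in episodes:
        for s, a, r, s_next in traj:
            counts[s][a][s_next] += 1
            rew_sum[s][a] += r
            n[s][a] += 1
    # Assumption: unvisited (s, a) pairs get zero reward and zero transition mass.
    P_hat = [[[counts[s][a][t] / n[s][a] if n[s][a] else 0.0 for t in range(S)]
              for a in range(A)] for s in range(S)]
    r_hat = [[rew_sum[s][a] / n[s][a] if n[s][a] else 0.0 for a in range(A)]
             for s in range(S)]
    return P_hat, r_hat

def evaluate_policy(P, r, policy, H):
    """Backward induction for a deterministic policy, policy[h][s] -> action.
    Returns Q with Q[h][s][a] for h = 0, ..., H-1."""
    S, A = len(P), len(P[0])
    V_next = [0.0] * S  # value after the final step is zero
    Q = [None] * H
    for h in reversed(range(H)):
        Q[h] = [[r[s][a] + sum(P[s][a][t] * V_next[t] for t in range(S))
                 for a in range(A)] for s in range(S)]
        V_next = [Q[h][s][policy[h][s]] for s in range(S)]
    return Q

def uniform_error(P, r, P_hat, r_hat, H):
    """sup over all deterministic non-stationary policies of max |Q^pi - Qhat^pi|.
    Brute-force enumeration, so only feasible for tiny S, A, H."""
    S, A = len(P), len(P[0])
    worst = 0.0
    for flat in product(range(A), repeat=S * H):
        policy = [list(flat[h * S:(h + 1) * S]) for h in range(H)]
        Q = evaluate_policy(P, r, policy, H)
        Q_hat = evaluate_policy(P_hat, r_hat, policy, H)
        err = max(abs(Q[h][s][a] - Q_hat[h][s][a])
                  for h in range(H) for s in range(S) for a in range(A))
        worst = max(worst, err)
    return worst

# Toy offline dataset: 2 states, 2 actions, horizon 2 (made-up numbers).
episodes = [
    [(0, 0, 1.0, 1), (1, 0, 0.0, 1)],
    [(0, 0, 1.0, 0), (0, 1, 0.5, 1)],
]
P_hat, r_hat = estimate_model(episodes, S=2, A=2)
always_zero = [[0, 0], [0, 0]]  # policy[h][s] = 0 for every step and state
Q_hat = evaluate_policy(P_hat, r_hat, always_zero, H=2)
```

In this sketch the uniform error can only be computed when a reference model is available, for example in a simulation study; the abstract's bounds describe how many offline episodes are needed for that error to fall below ϵ.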
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.