Chenyang Wu, Tianci Li, Zongzhang Zhang, Yang Yu · Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning · SlidesLive

Dec 6, 2022

About

Reinforcement learning (RL) is a general framework for modeling sequential decision-making problems, at the core of which lies the exploration-exploitation dilemma. An agent that fails to explore systematically will inevitably fail to learn efficiently. Optimism in the face of uncertainty (OFU) is a well-established strategy for efficient exploration: an agent following the OFU principle explores actively and efficiently. However, when applied to model-based RL, OFU involves specifying a confidence set for the underlying model and solving a series of nonlinear constrained optimization problems, which can be computationally intractable. This paper proposes an algorithm, Bayesian optimistic optimization (BOO), which adopts a dynamic weighting technique to enforce the constraint rather than explicitly solving a constrained optimization problem. BOO is a general algorithm shown to be sample-efficient for finite-spectrum reproducing kernel Hilbert spaces (RKHS). We also develop effective optimization techniques based on natural gradients and entropy regularization.
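
To make the OFU principle concrete, here is a minimal sketch in the simplest possible setting: a Bernoulli multi-armed bandit solved with the standard UCB1 rule. This is not the BOO algorithm and does not involve a learned model or a confidence set over models. It only illustrates the idea of acting greedily with respect to an optimistic estimate. The function name `ucb1`, the arm means, and the horizon are all illustrative assumptions, not taken from the paper.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit and return the pull count of each arm.

    arm_means: true success probabilities (hypothetical, for simulation only).
    horizon:   total number of pulls.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Optimistic index: empirical mean plus a confidence width
            # that shrinks as the arm is pulled more often.
            def index(a):
                mean = sums[a] / counts[a]
                width = math.sqrt(2.0 * math.log(t) / counts[a])
                return mean + width

            arm = max(range(n_arms), key=index)

        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


if __name__ == "__main__":
    print(ucb1([0.2, 0.8], horizon=2000))
```

The confidence width plays the role that the confidence set plays in model-based OFU: rarely tried arms look better than their empirical mean suggests, so the agent keeps testing them until the data rule them out. In model-based RL the analogous step is an optimization over models and policies, which is the part the abstract describes as potentially intractable.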
