Distribution Matching and Mutual Information in Reinforcement Learning

by · Jun 14, 2019 · 222 views ·

ICML 2019

Conventionally, reinforcement learning is considered to be a framework for optimization: the aim for standard reinforcement learning algorithms is to recover an optimal or near-optimal policy that maximizes the reward over time. However, when considering more advanced reinforcement learning problems, from inverse reinforcement learning to unsupervised and hierarchical reinforcement learning, we often encounter settings where it is desirable to learn policies that match target distributions over trajectories or states, covering all modes, or else to simply learn collections of behaviors that are as broad and varied as possible. Information theory and probabilistic inference offer is a powerful set of tools for developing algorithms for these kinds of distribution matching problems. In this talk, I will outline methods that combine reinforcement learning, inference, and information theory to learn policies that match target distributions and acquire diverse behaviors, and discuss the applications of such methods for a variety of problems in artificial intelligence and robotics.