Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

December 10, 2023

About the presentation

Many problems in Reinforcement Learning (RL) have an optimal policy that is stochastic; these include problems in randomized allocation of resources, such as the placement of security resources, emergency response units, etc. A challenge in this setting is that the underlying action space is categorical (discrete and unordered) and large. Existing RL methods do not perform well in such large categorical action spaces. Moreover, these problems require validity of the realized action (allocation), and this validity constraint is often difficult to express compactly in closed mathematical form. In this work, we address these issues by (1) using a (state-)conditional normalizing flow to compactly represent the stochastic policy; the compactness arises because the network produces only one sampled action and the log probability of that action, which is then used by an actor-critic method; and (2) using an invalid-action rejection method (via a valid-action oracle) to modify the base policy. The action rejection is enabled by a modified policy gradient that we derive. Our experiments show the scalability of our approach compared to prior methods, and its ability to enforce arbitrary state-conditional constraints on the support of the action distribution in any state.
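The invalid-action rejection step can be sketched as follows. This is a minimal illustration only: it assumes a plain categorical base policy given as a probability vector (the talk's base policy is a conditional normalizing flow), and the `is_valid` oracle and all names here are hypothetical. The key point is that rejection sampling induces a modified policy equal to the base policy restricted to the valid set and renormalized, which is the log probability the actor-critic update needs.

```python
import numpy as np

def rejection_sample(base_probs, is_valid, rng):
    """Sample an action from a base policy, rejecting invalid actions
    via a validity oracle. The effective (modified) policy is the base
    policy restricted to the valid-action set and renormalized.

    base_probs: 1-D array of base-policy probabilities over actions.
    is_valid:   oracle mapping an action index to True/False.
    Returns (action, log-probability under the modified policy).
    """
    while True:
        a = rng.choice(len(base_probs), p=base_probs)
        if is_valid(a):
            break
    # Probability mass of the valid set, used to renormalize.
    valid_mass = sum(p for i, p in enumerate(base_probs) if is_valid(i))
    return a, np.log(base_probs[a] / valid_mass)

# Hypothetical example: 4 actions, of which only {1, 3} are valid.
rng = np.random.default_rng(0)
probs = np.array([0.1, 0.4, 0.2, 0.3])
a, lp = rejection_sample(probs, lambda i: i in {1, 3}, rng)
```

In this sketch the returned action is always valid, and `exp(lp)` equals `probs[a]` divided by the total valid mass (0.7 here), matching the renormalized distribution that the derived modified policy gradient is taken with respect to.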


NeurIPS 2023