Dec 2, 2022
Multi-objective reinforcement learning (MORL) approaches have emerged to tackle many real-world problems with multiple conflicting objectives by maximizing a joint objective function weighted by a preference vector. These approaches find fixed customized policies corresponding to preference vectors specified during training. However, the design constraints and objectives typically change dynamically in real-life scenarios. Furthermore, storing a policy for each potential preference is not scalable. Hence, obtaining a set of Pareto front solutions for the entire preference space in a given domain with a single training is critical. To this end, we propose a novel MORL algorithm that trains a single universal network to cover the entire preference space, scalable to continuous robotic tasks. The proposed approach, Preference-Driven MORL (PD-MORL), utilizes the preferences as guidance to update the network parameters. It also employs a novel parallelization approach to increase sample efficiency. We show that PD-MORL achieves up to 25% larger hypervolume for challenging continuous control tasks compared to prior approaches, using an order of magnitude fewer trainable parameters while achieving broad and dense Pareto front solutions.
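The joint objective mentioned above is typically a linear scalarization: each objective's return is weighted by the corresponding entry of a preference vector on the probability simplex, so that different preferences select different points on the Pareto front. The sketch below illustrates this idea only; the function and variable names (`scalarize`, the two example return vectors) are hypothetical and not taken from the PD-MORL paper.

```python
import numpy as np

def scalarize(vector_return, preference):
    # Linear scalarization: weight each objective's return by the
    # matching entry of the preference vector. Preferences are assumed
    # to lie on the simplex (non-negative entries summing to 1).
    preference = np.asarray(preference, dtype=float)
    assert np.all(preference >= 0) and np.isclose(preference.sum(), 1.0)
    return float(np.dot(vector_return, preference))

# Hypothetical speed-vs-energy trade-off: two Pareto-optimal policies
# evaluated under two different preferences.
returns_fast    = np.array([10.0, 2.0])  # [speed, energy efficiency]
returns_thrifty = np.array([4.0, 9.0])

speed_pref  = [0.8, 0.2]  # user cares mostly about speed
energy_pref = [0.2, 0.8]  # user cares mostly about energy

# Under the speed-heavy preference the fast policy scores higher;
# under the energy-heavy preference the thrifty policy does.
```

A preference-conditioned ("universal") network takes such a preference vector as an additional input, so one set of weights can serve every point of the trade-off instead of one stored policy per preference.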