2. prosince 2022
Sprecher:in · 0 Follower:innen
Sprecher:in · 0 Follower:innen
Sprecher:in · 0 Follower:innen
Generalization in reinforcement learning (RL) is of importance for real deployment of RL algorithms. Various schemes are proposed to address the generalization issues, including transfer learning, multi-task learning, meta learning, as well as robust and adversarial reinforcement learning. However, there is not a unified formulation of various schemes and comprehensive comparisons of methods across different schemes. In this work, we propound GiRL, a game-theoretic framework for generalization in reinforcement learning, where a RL agent is trained against an adversary over a set of tasks, over which the adversary can manipulate the distributions within a given threshold. With different configurations, GiRL is capable of reducing the various schemes mentioned above. To solve GiRL, we adapt the widely-used method in game theory, policy space response oracle (PSRO) framework with three significant modifications as follows: i) we adopt model-agnostic meta learning (MAML) as the best-response oracle, ii) we propose a modified projected replicated dynamics, i.e., R-PRD, which ensures the computed meta-strategy for the adversary falls in the threshold, and iii) we also propose a protocol of few-shot learning for multiple strategies during testing. Extensive experiments on MuJoCo environments demonstrate that our proposed method outperforms state-of-the-art baselines, e.g., MAML.Generalization in reinforcement learning (RL) is of importance for real deployment of RL algorithms. Various schemes are proposed to address the generalization issues, including transfer learning, multi-task learning, meta learning, as well as robust and adversarial reinforcement learning. However, there is not a unified formulation of various schemes and comprehensive comparisons of methods across different schemes. In this work, we propound GiRL, a game-theoretic framework for generalization i…
Konto · 960 Follower:innen
Profesionální natáčení a streamování po celém světě.
Prezentace na podobné téma, kategorii nebo přednášejícího
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Guiliang Liu, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Qiao Xiao, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%
Arya Akhavan, …
Ewigspeicher-Fortschrittswert: 0 = 0.0%