Dec 6, 2021
Speaker · 0 followers
Speaker · 0 followers
Speaker · 0 followers
We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting. The key idea lies in the coordinated adaptation of step size during the policy update process among agents. We prove the monotonicity of policy improvement when optimizing a theoretically-grounded joint objective, and derive a simplified optimization objective based on a set of approximations. We then interpret that such an objective in CoPPO can achieve dynamic credit assignment among agents, thereby alleviating the high variance issue during the concurrent update of agent policies. Finally, we demonstrate that CoPPO achieves state-of-the-art performance against a number of strong baselines under typical multi-agent settings, including a cooperative matrix game and the StarCraft II micromanagement task.We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting. The key idea lies in the coordinated adaptation of step size during the policy update process among agents. We prove the monotonicity of policy improvement when optimizing a theoretically-grounded joint objective, and derive a simplified optimization objective based on a set of approximations. We then interpret that such an objective i…
Account · 1.9k followers
Neural Information Processing Systems (NeurIPS) is a multi-track machine learning and computational neuroscience conference that includes invited talks, demonstrations, symposia and oral and poster presentations of refereed papers. Following the conference, there are workshops which provide a less formal setting.
Professional recording and live streaming, delivered globally.
Presentations on similar topic, category or speaker
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Jyoti Aneja, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Andersen Ang, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%
Tao Wang, …
Total of 0 viewers voted for saving the presentation to eternal vault which is 0.0%