Jul 24, 2023
Deep Reinforcement Learning (RL) has been adopting better scientific practices in order to improve reproducibility, such as standardized evaluation metrics and reporting. However, the process of hyperparameter optimization still varies widely across papers, which makes it challenging to compare RL algorithms fairly. In this paper, we show that hyperparameter choices in RL can significantly affect the agent's final performance and sample efficiency, and that the hyperparameter landscape can strongly depend on the tuning seed, which may lead to overfitting to single seeds. We therefore propose adopting established best practices from AutoML, such as the separation of tuning and testing seeds, as well as principled hyperparameter optimization (HPO) across a broad search space. We support this by comparing multiple state-of-the-art HPO tools on a range of RL algorithms and environments to their hand-tuned counterparts, demonstrating that HPO approaches often have higher performance and lower compute overhead. As a result of our findings, we recommend a set of best practices for the RL community going forward, which should result in stronger empirical results at lower computational cost, better reproducibility, and thus faster progress in RL. To encourage the adoption of these practices, we release plug-and-play implementations of the tuning algorithms used in this paper.
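The tuning/testing seed separation recommended above can be sketched as follows. This is an illustrative sketch, not the paper's released code: `train_and_evaluate` is a hypothetical stand-in for a full RL training run, and random search stands in for the HPO tools the paper compares. The key idea is that the search only ever sees the tuning seeds, and the final score is reported on disjoint test seeds.

```python
# Sketch of seed-separated hyperparameter search (illustrative, stdlib only).
import random
import statistics

TUNING_SEEDS = [0, 1, 2]    # used only during hyperparameter search
TEST_SEEDS = [10, 11, 12]   # held out for the final report

def train_and_evaluate(lr, gamma, seed):
    """Hypothetical stand-in for an RL training run; returns a mock return.

    Seed-dependent noise mimics how the hyperparameter landscape can vary
    across seeds, which is what makes single-seed tuning risky.
    """
    rng = random.Random(seed)
    return -(lr - 3e-4) ** 2 * 1e6 - (gamma - 0.99) ** 2 * 100 + rng.gauss(0, 0.5)

def random_search(n_trials=20, rng_seed=0):
    """Random search over a broad space, scoring each config across all tuning seeds."""
    rng = random.Random(rng_seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {"lr": 10 ** rng.uniform(-5, -2), "gamma": rng.uniform(0.9, 0.999)}
        # Averaging over several tuning seeds avoids overfitting to one seed.
        score = statistics.mean(
            train_and_evaluate(cfg["lr"], cfg["gamma"], s) for s in TUNING_SEEDS
        )
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg

best = random_search()
# Report performance only on seeds never seen during tuning.
test_score = statistics.mean(
    train_and_evaluate(best["lr"], best["gamma"], s) for s in TEST_SEEDS
)
print(best, test_score)
```

In practice the random search would be replaced by one of the HPO tools the paper evaluates, but the seed protocol stays the same: tune on one seed set, report on another.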