Jul 24, 2023
Proposed by Hinton et al. in 2012, dropout has stood the test of time as a regularizer for alleviating neural network overfitting. In this work, we show that dropout can also reduce underfitting when used at the start of training. During this phase, dropout reduces the gradient variance across mini-batches and helps align the mini-batch gradients with the underlying whole-dataset gradient. Intuitively, dropout counteracts the data stochasticity of SGD and limits the influence of individual batches on the model. This insight leads us to a simple solution for improving performance in underfitting models - early dropout: dropout is used only during the initial phase of training and switched off afterwards. Compared with their no-dropout or standard-dropout counterparts, models equipped with early dropout achieve lower final training loss. Building on this idea, we also consider a symmetric technique for regularizing overfitting models - late dropout, where dropout is not used in the early iterations and is activated only for the remaining ones. Experiments on ImageNet and downstream vision tasks show that our methods consistently improve generalization accuracy. Our findings encourage more research on understanding regularization in deep learning, and we hope our methods will be helpful tools for future neural network training.
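A minimal sketch of how such an early-dropout schedule might be implemented in PyTorch. The helper and training-loop names, the epoch threshold, and the drop rate below are illustrative assumptions, not details from the paper; late dropout would simply invert the scheduling condition.

```python
import torch.nn as nn


def set_dropout(model: nn.Module, p: float) -> None:
    """Set the drop probability of every nn.Dropout module in the model."""
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.p = p


def train(model, loader, optimizer, loss_fn,
          num_epochs=100, early_dropout_epochs=50, drop_rate=0.1):
    for epoch in range(num_epochs):
        # Early dropout: keep dropout active for the initial phase of training,
        # then switch it off for the rest (late dropout would flip this check).
        set_dropout(model, drop_rate if epoch < early_dropout_epochs else 0.0)
        model.train()
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
```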