Nov 28, 2022
The dynamics by which neural networks learn and forget examples throughout training have emerged as an object of interest along several threads of research. In particular, researchers have proposed metrics of example hardness based on these dynamics, including (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out. However, an example might be considered hard for several distinct reasons, such as being a member of a rare subpopulation, being mislabeled, or being fundamentally ambiguous in its class. In this paper, we focus on the second-split forgetting time (SSFT): the epoch (if any) after which an original training example is forgotten as the network is fine-tuned on a randomly held-out partition of the data. Across multiple benchmark datasets and modalities, we demonstrate that mislabeled examples are forgotten quickly, and seemingly rare examples are forgotten comparatively slowly. By contrast, metrics considering only the first-split learning dynamics struggle to differentiate the two. Additionally, the SSFT tends to be robust to the choice of architecture, optimizer, and random seed. From a practical standpoint, the SSFT (i) can help to identify mislabeled samples, the removal of which improves generalization; and (ii) can provide insights about failure modes. Through theoretical analysis addressing overparameterized linear models, we provide insights into how the observed phenomena may arise.
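The SSFT described in the abstract can be illustrated with a minimal sketch. Assuming predictions on the first split are recorded at every epoch of second-split fine-tuning, an example's SSFT is the first epoch after which its prediction never again matches its label. The function name, the `(epochs, examples)` layout of `pred_history`, and the convention of returning `None` for never-forgotten examples are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def second_split_forgetting_time(pred_history, labels):
    """Sketch of SSFT: for each first-split example, return the first
    fine-tuning epoch after which its prediction never again matches its
    label, or None if it is never forgotten.

    pred_history: (num_epochs, num_examples) predicted labels recorded on
    the first split while the network is fine-tuned on the second split.
    labels: (num_examples,) ground-truth labels of the first split.
    """
    pred_history = np.asarray(pred_history)
    labels = np.asarray(labels)
    num_epochs, _ = pred_history.shape
    correct = pred_history == labels          # broadcast to (epochs, examples)
    ssft = []
    for col in correct.T:                     # one column per example
        hits = np.nonzero(col)[0]             # epochs where prediction is correct
        if hits.size and hits[-1] == num_epochs - 1:
            ssft.append(None)                 # still correct at the end: never forgotten
        elif hits.size == 0:
            ssft.append(0)                    # misclassified from the first epoch on
        else:
            ssft.append(int(hits[-1]) + 1)    # first epoch after the last correct one
    return ssft

# Toy history over 3 epochs for 3 examples: the second example is
# forgotten at epoch 2; the other two end correctly classified.
print(second_split_forgetting_time([[1, 0, 1], [1, 1, 0], [1, 0, 0]], [1, 1, 0]))
```

Under the paper's findings, mislabeled examples would tend to receive small SSFT values (forgotten quickly during second-split fine-tuning), while rare but correctly labeled examples would be forgotten later or not at all.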