Semi-Supervised Learning with Meta-Gradient

In this work, we propose a simple yet effective meta-learning algorithm for semi-supervised learning. We observe that most existing consistency-based approaches suffer from overfitting and limited generalization ability, especially when trained with only a small amount of labeled data. To alleviate this issue, we propose a learn-to-generalize regularization term that utilizes the label information, and we optimize the problem in a meta-learning fashion. Specifically, we seek pseudo labels for the unlabeled data such that the model generalizes well on the labeled data, which we formulate as a nested optimization problem. We address this problem using the meta-gradient, which bridges the pseudo labels and the regularization term. In addition, we introduce a simple first-order approximation to avoid computing higher-order derivatives and provide a theoretical convergence analysis. Extensive evaluations on the SVHN, CIFAR, and ImageNet datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
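
To make the nested formulation concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a logistic-regression model written in pure Python; the data, the step sizes `alpha` and `beta`, and all function names are hypothetical choices made for illustration. The inner problem takes one gradient step on the unlabeled loss using the current pseudo labels, and the outer problem adjusts the pseudo labels to lower the labeled loss of the updated model. In this toy setting the updated weights are affine in the pseudo labels, so the meta-gradient has a closed form that needs only first-order gradients. This is a property of the toy model and should not be read as the first-order approximation introduced in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    """Probability of class 1 under a linear logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def bce_loss(w, xs, ys):
    """Mean binary cross-entropy; ys may be soft targets in [0, 1]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(w, x)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(xs)


def bce_grad(w, xs, ys):
    """Gradient of bce_loss with respect to w: mean of (p - y) * x."""
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        r = predict(w, x) - y
        for k, xk in enumerate(x):
            g[k] += r * xk / len(xs)
    return g


def inner_step(w, xs_u, pseudo, alpha):
    """Inner problem: one gradient step on the unlabeled loss."""
    g = bce_grad(w, xs_u, pseudo)
    return [wi - alpha * gi for wi, gi in zip(w, g)]


def meta_gradient(w, xs_u, pseudo, xs_l, ys_l, alpha):
    """Gradient of the labeled loss at w' with respect to each pseudo label.

    Here w' = inner_step(w, ...), and d w' / d pseudo_i = (alpha / n) * x_i,
    so the chain rule gives (alpha / n) * <grad L_labeled(w'), x_i>.
    """
    w_new = inner_step(w, xs_u, pseudo, alpha)
    g_l = bce_grad(w_new, xs_l, ys_l)
    n = len(xs_u)
    return [alpha / n * sum(gk * xk for gk, xk in zip(g_l, x)) for x in xs_u]


def update_pseudo_labels(pseudo, meta_grad, beta):
    """Outer problem: gradient step on the pseudo labels, clipped to [0, 1]."""
    return [min(1.0, max(0.0, q - beta * m)) for q, m in zip(pseudo, meta_grad)]


# Hypothetical toy data: the first feature is a constant bias term.
xs_l = [[1.0, 2.0], [1.0, -1.5]]               # labeled inputs
ys_l = [1.0, 0.0]                              # ground-truth labels
xs_u = [[1.0, 1.0], [1.0, -2.0], [1.0, 0.5]]   # unlabeled inputs
w = [0.0, 0.1]                                 # current model weights
alpha, beta = 0.5, 1.0                         # inner and outer step sizes

# Initialize the pseudo labels from the model's own predictions.
pseudo = [predict(w, x) for x in xs_u]

meta_grad = meta_gradient(w, xs_u, pseudo, xs_l, ys_l, alpha)
new_pseudo = update_pseudo_labels(pseudo, meta_grad, beta)

# Labeled loss of the one-step-updated model, before and after the update.
loss_before = bce_loss(inner_step(w, xs_u, pseudo, alpha), xs_l, ys_l)
loss_after = bce_loss(inner_step(w, xs_u, new_pseudo, alpha), xs_l, ys_l)
print("labeled loss before:", loss_before)
print("labeled loss after: ", loss_after)
```

With these toy values, the pseudo labels of unlabeled points that resemble the positive labeled example move toward 1, and the others move toward 0. The model obtained from the inner step then fits the labeled data better than it did with the initial pseudo labels. For a deep network the dependence of the updated weights on the pseudo labels is no longer this simple, which is where automatic differentiation or an approximation becomes necessary.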