Self-Adaptive Training: Bridging the Supervised and Self-Supervised
Learning
- URL: http://arxiv.org/abs/2101.08732v1
- Date: Thu, 21 Jan 2021 17:17:30 GMT
- Title: Self-Adaptive Training: Bridging the Supervised and Self-Supervised
Learning
- Authors: Lang Huang, Chao Zhang and Hongyang Zhang
- Abstract summary: Self-adaptive training is a unified training algorithm that dynamically calibrates and enhances the training process using model predictions, without incurring extra computational cost.
We analyze the training dynamics of deep networks on training data corrupted by, e.g., random noise and adversarial examples.
Our analysis shows that model predictions are able to magnify useful underlying information in the data, and this phenomenon occurs broadly even in the absence of any label information.
- Score: 16.765461276790944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose self-adaptive training -- a unified training algorithm that
dynamically calibrates and enhances the training process using model predictions
without incurring extra computational cost -- to advance both supervised and
self-supervised learning of deep neural networks. We analyze the training
dynamics of deep networks on training data that are corrupted by, e.g., random
noise and adversarial examples. Our analysis shows that model predictions are
able to magnify useful underlying information in data and this phenomenon
occurs broadly even in the absence of any label information,
highlighting that model predictions could substantially benefit the training
process: self-adaptive training improves the generalization of deep networks
under noise and enhances the self-supervised representation learning. The
analysis also sheds light on understanding deep learning, e.g., a potential
explanation of the recently-discovered double-descent phenomenon in empirical
risk minimization and the collapsing issue of the state-of-the-art
self-supervised learning algorithms. Experiments on the CIFAR, STL and ImageNet
datasets verify the effectiveness of our approach in three applications:
classification with label noise, selective classification and linear
evaluation. To facilitate future research, the code has been made publicly
available at https://github.com/LayneH/self-adaptive-training.
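For concreteness, here is a minimal sketch of the supervised variant of this idea: each example keeps a soft training target that starts at its (possibly noisy) label and is then pulled toward the model's own predictions by an exponential moving average. The momentum value, warm-up length, and buffer handling below are illustrative assumptions, not the authors' exact settings.

```python
import torch
import torch.nn.functional as F

def self_adaptive_step(model, optimizer, x, soft_targets, idx,
                       alpha=0.9, epoch=0, warmup_epochs=60):
    """One training step of a sketch of self-adaptive training.

    soft_targets: a buffer of per-example targets, initialized to the
    (possibly noisy) one-hot labels and updated by an exponential
    moving average of the model's predictions after a warm-up period.
    idx: indices of the current mini-batch into that buffer.
    """
    logits = model(x)
    probs = F.softmax(logits.detach(), dim=1)
    if epoch >= warmup_epochs:
        # EMA update: the targets gradually absorb the model's predictions,
        # which the paper argues magnifies useful signal in noisy data.
        soft_targets[idx] = alpha * soft_targets[idx] + (1 - alpha) * probs
    # Cross-entropy against the calibrated soft targets, not the raw labels.
    loss = -(soft_targets[idx] * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design point is that, once the warm-up ends, the target buffer rather than the raw labels drives the loss, so confidently mislabeled examples are gradually corrected at essentially zero extra cost.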
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
- Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron [3.069335774032178]
We use a dataset-process approach to derive flow equations describing learning.
We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and input-data distribution on the perceptron's learning curve.
This approach points a way toward analyzing learning dynamics for more-complex circuit architectures.
arXiv Detail & Related papers (2024-09-05T17:58:28Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
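The inverse dynamics part of such an objective is easy to state in code. The sketch below is a generic version of that component only (the encoder interface, discrete action space, and cross-entropy loss are assumptions for illustration, not ALP's exact design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InverseDynamicsHead(nn.Module):
    """Predict the action taken between two consecutive observations.

    Training this head pushes the shared encoder to retain
    action-relevant information in its representation.
    """
    def __init__(self, encoder: nn.Module, feat_dim: int, n_actions: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(2 * feat_dim, n_actions)

    def loss(self, obs_t, obs_t1, action_t):
        # Encode both observations and classify the intervening action.
        z_t, z_t1 = self.encoder(obs_t), self.encoder(obs_t1)
        logits = self.head(torch.cat([z_t, z_t1], dim=1))
        return F.cross_entropy(logits, action_t)
```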
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning [5.2319020651074215]
We propose a Curriculum-guided Contrastive Learning framework for neural Predictors (DCLP).
Our method simplifies the contrastive task by designing a novel curriculum to enhance the stability of unlabeled training data distribution.
We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
arXiv Detail & Related papers (2023-02-25T08:16:21Z)
- Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy.
arXiv Detail & Related papers (2022-12-07T15:32:22Z)
- Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning.
We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z)
- Harnessing the Power of Explanations for Incremental Training: A LIME-Based Approach [6.244905619201076]
In this work, model explanations are fed back to the feed-forward training to help the model generalize better.
The framework incorporates the custom weighted loss with Elastic Weight Consolidation (EWC) to maintain performance in sequential testing sets.
The proposed custom training procedure results in a consistent enhancement of accuracy ranging from 0.5% to 1.5% throughout all phases of the incremental learning setup.
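For reference, the EWC regularizer this entry combines with its custom loss has a standard form; the sketch below is that generic penalty, not the paper's full method (the fisher and old_params dictionaries are assumed to have been computed on the previous task):

```python
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module, old_params: dict, fisher: dict, lam: float = 1.0):
    """Standard Elastic Weight Consolidation penalty:
    (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    Parameters with high Fisher information on earlier tasks are
    penalized more strongly for drifting, which preserves performance
    on sequential testing sets.
    """
    loss = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss
```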
arXiv Detail & Related papers (2022-11-02T18:16:17Z)
- How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis [93.37576644429578]
This work establishes the first theoretical analysis for the known iterative self-training paradigm.
We prove the benefits of unlabeled data in both training convergence and generalization ability.
Experiments from shallow neural networks to deep neural networks are also provided to justify the correctness of our established theoretical insights on self-training.
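The iterative self-training paradigm this analysis targets fits in a few lines; the sketch below is the generic loop (the fit routine, confidence threshold, and number of rounds are illustrative assumptions):

```python
import torch

def self_training(model, fit, labeled, unlabeled_x, rounds=5, tau=0.9):
    """Generic iterative self-training: fit on labeled data, pseudo-label
    confident unlabeled points, add them to the training set, repeat."""
    xs, ys = labeled
    for _ in range(rounds):
        fit(model, xs, ys)  # one supervised training pass
        with torch.no_grad():
            probs = torch.softmax(model(unlabeled_x), dim=1)
            conf, pseudo = probs.max(dim=1)
        keep = conf >= tau  # keep only confident pseudo-labels
        xs = torch.cat([xs, unlabeled_x[keep]])
        ys = torch.cat([ys, pseudo[keep]])
        unlabeled_x = unlabeled_x[~keep]  # do not pseudo-label twice
    return model
```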
arXiv Detail & Related papers (2022-01-21T02:16:52Z)
- Improving Deep Learning Interpretability by Saliency Guided Training [36.782919916001624]
Saliency methods have been widely used to highlight important input features in model predictions.
Most existing methods use backpropagation on a modified gradient function to generate saliency maps.
We introduce a saliency guided training procedure for neural networks to reduce noisy gradients used in predictions.
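A rough sketch of one training step in this spirit: mask the least-salient input features and penalize divergence between the model's outputs on the original and masked inputs. Zero-masking and the exact loss weighting below are simplifying assumptions, not the paper's precise recipe.

```python
import torch
import torch.nn.functional as F

def saliency_guided_step(model, x, y, k, optimizer):
    """Mask the k least-salient input features (lowest |d loss / d x|)
    and train the model to behave similarly on the masked input,
    suppressing reliance on noisy gradients."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()        # populates x.grad
    saliency = x.grad.abs().flatten(1)             # per-feature saliency
    low = saliency.topk(k, dim=1, largest=False).indices
    x_masked = x.detach().clone().flatten(1)
    x_masked.scatter_(1, low, 0.0)                 # zero out low-saliency features
    x_masked = x_masked.view_as(x)

    optimizer.zero_grad()                          # discard saliency-pass grads
    logits = model(x.detach())
    logits_masked = model(x_masked)
    # Task loss plus a KL term keeping masked and original outputs close.
    loss = F.cross_entropy(logits, y) + F.kl_div(
        F.log_softmax(logits_masked, dim=1),
        F.softmax(logits, dim=1).detach(),
        reduction="batchmean",
    )
    loss.backward()
    optimizer.step()
    return loss.item()
```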
arXiv Detail & Related papers (2021-11-29T06:05:23Z)
- Efficient Estimation of Influence of a Training Instance [56.29080605123304]
We propose an efficient method for estimating the influence of a training instance on a neural network model.
Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance.
We demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
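A loose sketch of the dropout-mask idea described above: if each training instance was only ever seen by the sub-network selected by its own fixed mask, the loss gap between that sub-network and its complement on a test example approximates the instance's influence, with no retraining. The apply_mask helper and exact sign convention here are hypothetical, for illustration only.

```python
import torch

def influence(model, mask_z, x_test, y_test, apply_mask, loss_fn):
    """Estimate the influence of training instance z on a test example.

    mask_z: the fixed binary dropout mask assigned to instance z during
    training; (1 - mask_z) selects the sub-network that never saw z.
    apply_mask: hypothetical helper returning the masked forward pass.
    """
    with torch.no_grad():
        loss_with = loss_fn(apply_mask(model, mask_z)(x_test), y_test)
        loss_without = loss_fn(apply_mask(model, 1 - mask_z)(x_test), y_test)
    # Positive values: having learned z reduced the test loss.
    return (loss_without - loss_with).item()
```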
arXiv Detail & Related papers (2020-12-08T04:31:38Z)
- Self-Adaptive Training: beyond Empirical Risk Minimization [15.59721834388181]
We propose a new training algorithm that dynamically corrects problematic labels by model predictions without incurring extra computational cost.
Self-adaptive training significantly improves generalization over various levels of noises, and mitigates the overfitting issue in both natural and adversarial training.
Experiments on CIFAR and ImageNet datasets verify the effectiveness of our approach in two applications.
arXiv Detail & Related papers (2020-02-24T15:47:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.