Learning to Predict Gradients for Semi-Supervised Continual Learning
- URL: http://arxiv.org/abs/2201.09196v2
- Date: Wed, 31 Jan 2024 05:30:08 GMT
- Title: Learning to Predict Gradients for Semi-Supervised Continual Learning
- Authors: Yan Luo, Yongkang Wong, Mohan Kankanhalli, Qi Zhao
- Abstract summary: Key challenge for machine intelligence is to learn new visual concepts without forgetting the previously acquired knowledge.
There is a gap between existing supervised continual learning and human-like intelligence, where human is able to learn from both labeled and unlabeled data.
We formulate a new semi-supervised continual learning method, which can be generically applied to existing continual learning models.
- Score: 36.715712711431856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key challenge for machine intelligence is to learn new visual concepts
without forgetting the previously acquired knowledge. Continual learning is
aimed towards addressing this challenge. However, there is a gap between
existing supervised continual learning and human-like intelligence, where humans
are able to learn from both labeled and unlabeled data. How unlabeled data
affects learning and catastrophic forgetting in the continual learning process
remains unknown. To explore these issues, we formulate a new semi-supervised
continual learning method, which can be generically applied to existing
continual learning models. Specifically, a novel gradient learner learns from
labeled data to predict gradients on unlabeled data. Hence, the unlabeled data
can fit into the supervised continual learning method. Unlike conventional
semi-supervised settings, we do not assume that the underlying classes
associated with the unlabeled data are known to the learning process. In other
words, the unlabeled data may be very distinct
from the labeled data. We evaluate the proposed method on mainstream continual
learning, adversarial continual learning, and semi-supervised learning tasks.
The proposed method achieves state-of-the-art performance on classification
accuracy and backward transfer in the continual learning setting while
achieving desired performance on classification accuracy in the semi-supervised
learning setting. This implies that the unlabeled images can enhance the
generalizability of continual learning models in their predictive ability on
unseen data and significantly alleviate catastrophic forgetting. The code is
available at \url{https://github.com/luoyan407/grad_prediction.git}.
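To make the mechanism concrete, below is a minimal sketch of the gradient-prediction idea, assuming a PyTorch-style setup. The names (GradientLearner, train_step) and the choice to predict gradients at the logits are illustrative assumptions, not the authors' released implementation; see the repository linked above for the actual code.

```python
# Hypothetical sketch: a gradient learner fits the true loss gradient on labeled
# data, then supplies predicted gradients for unlabeled data. Names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientLearner(nn.Module):
    """Maps classifier logits to a predicted gradient of the loss w.r.t. the logits."""
    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, logits):
        return self.net(logits)

def train_step(model, grad_learner, x_lab, y_lab, x_unlab, opt_model, opt_gl):
    # 1) Labeled batch: compute the true gradient at the logits and fit the
    #    gradient learner to reproduce it.
    logits = model(x_lab)
    loss = F.cross_entropy(logits, y_lab)
    true_grad = torch.autograd.grad(loss, logits, retain_graph=True)[0]
    pred_grad = grad_learner(logits.detach())
    gl_loss = F.mse_loss(pred_grad, true_grad.detach())
    opt_gl.zero_grad(); gl_loss.backward(); opt_gl.step()

    # Update the model on the labeled batch as usual.
    opt_model.zero_grad(); loss.backward(); opt_model.step()

    # 2) Unlabeled batch: no labels, so inject the *predicted* gradient at the
    #    logits and backpropagate it through the model via the chain rule.
    logits_u = model(x_unlab)
    with torch.no_grad():
        g_hat = grad_learner(logits_u)
    opt_model.zero_grad()
    logits_u.backward(gradient=g_hat)
    opt_model.step()
```

Because the predicted gradients are injected at the logits, a wrapper like this can sit on top of an existing supervised continual learner without changing its architecture, which mirrors the claim that the method applies generically to existing continual learning models.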
Related papers
- Premonition: Using Generative Models to Preempt Future Data Changes in
Continual Learning [63.850451635362425]
Continual learning requires a model to adapt to ongoing changes in the data distribution.
We show that the combination of a large language model and an image generation model can similarly provide useful premonitions.
We find that the backbone of our pre-trained networks can learn representations useful for the downstream continual learning problem.
arXiv Detail & Related papers (2024-03-12T06:29:54Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence demonstrating that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Towards Label-Efficient Incremental Learning: A Survey [42.603603392991715]
We study incremental learning, where a learner is required to adapt to an incoming stream of data with a varying distribution.
We identify three subdivisions that reduce labeling effort: semi-supervised, few-shot, and self-supervised learning.
arXiv Detail & Related papers (2023-02-01T10:24:55Z) - From Weakly Supervised Learning to Active Learning [1.52292571922932]
This thesis is motivated by the question: can we derive a more generic framework than that of supervised learning?
We model weak supervision as providing a set of target candidates rather than a unique target.
We argue that one should look for an "optimistic" function that matches most of the observations. This allows us to derive a principle to disambiguate partial labels.
arXiv Detail & Related papers (2022-09-23T14:55:43Z) - Continual Learning with Bayesian Model based on a Fixed Pre-trained
Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z) - A Meta-Learned Neuron model for Continual Learning [0.0]
Continual learning is the ability to acquire new knowledge without forgetting the previously learned one.
In this work, we replace the standard neuron with a meta-learned neuron model.
Our approach can memorize dataset-length sequences of training samples, and its learning capabilities generalize to any domain.
arXiv Detail & Related papers (2021-11-03T23:39:14Z) - Online Continual Learning with Natural Distribution Shifts: An Empirical
Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online (see the sketch after this list).
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z) - Deep Bayesian Unsupervised Lifelong Learning [3.4827140757744908]
We focus on resolving challenges in Unsupervised Lifelong Learning (ULL) with streaming unlabelled data.
We develop a fully Bayesian inference framework for ULL with a novel end-to-end Deep Bayesian Unsupervised Lifelong Learning (DBULL) algorithm.
To efficiently maintain past knowledge, we develop a novel knowledge preservation mechanism via sufficient statistics of the latent representation for raw data.
arXiv Detail & Related papers (2021-06-13T16:24:44Z) - Self-Supervised Learning Aided Class-Incremental Lifelong Learning [17.151579393716958]
We study the issue of catastrophic forgetting in class-incremental learning (Class-IL).
In the training procedure of Class-IL, since the model has no knowledge of subsequent tasks, it only extracts features necessary for the tasks learned so far, which provide insufficient information for joint classification.
We propose to combine self-supervised learning, which can provide effective representations without requiring labels, with Class-IL to partly get around this problem.
arXiv Detail & Related papers (2020-06-10T15:15:27Z) - Exploratory Machine Learning with Unknown Unknowns [60.78953456742171]
We study a new problem setting in which unknown classes in the training data are misperceived as other labels.
We propose exploratory machine learning, which examines and investigates training data by actively augmenting the feature space to discover potentially hidden classes.
arXiv Detail & Related papers (2020-02-05T02:06:56Z)
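The online continual learning entry above describes an evaluate-then-train protocol: each incoming batch is scored before it is used for training. Below is a minimal sketch of that loop, assuming a PyTorch-style model; the helper names (online_continual_loop, stream) are assumptions for illustration, not the benchmark's actual code.

```python
# Hypothetical sketch of the evaluate-then-train online continual learning protocol.
import torch
import torch.nn.functional as F

def online_continual_loop(model, stream, optimizer):
    """`stream` yields small (x, y) batches in temporal order."""
    correct, seen = 0, 0
    for x, y in stream:
        # 1) Test first: predict on the new batch before the model has seen it.
        model.eval()
        with torch.no_grad():
            preds = model(x).argmax(dim=1)
        correct += (preds == y).sum().item()
        seen += y.numel()

        # 2) Then train: only afterwards is the batch added to the model's experience.
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    return correct / max(seen, 1)  # online accuracy over the stream
```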
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.