Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos
- URL: http://arxiv.org/abs/2310.16115v1
- Date: Tue, 24 Oct 2023 18:32:46 GMT
- Title: Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos
- Authors: Yaoyao Liu, Yingying Li, Bernt Schiele, Qianru Sun
- Abstract summary: We find that "using new class data for KD" not only hinders the model's adaptation (for learning new classes) but also results in low efficiency for preserving old class knowledge.
We address this by "using the placebos of old classes for KD", where the placebos are chosen from a free image stream, such as Google Images, in an automatic and economical fashion.
- Score: 85.37515663416691
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Not forgetting old class knowledge is a key challenge for class-incremental
learning (CIL) when the model continuously adapts to new classes. A common
technique to address this is knowledge distillation (KD), which penalizes
prediction inconsistencies between old and new models. Such predictions are made
almost entirely with new class data, as old class data is extremely scarce due to the
strict memory limitation in CIL. In this paper, we take a deep dive into KD
losses and find that "using new class data for KD" not only hinders the model's
adaptation (for learning new classes) but also results in low efficiency for
preserving old class knowledge. We address this by "using the placebos of old
classes for KD", where the placebos are chosen from a free image stream, such
as Google Images, in an automatic and economical fashion. To this end, we
train an online placebo selection policy to quickly evaluate the quality of
streaming images (good or bad placebos) and use only good ones for one-time
feed-forward computation of KD. We formulate the policy training process as an
online Markov Decision Process (MDP), and introduce an online learning
algorithm to solve this MDP problem without incurring much computational cost. In
experiments, we show that our method 1) is surprisingly effective even when
there is no class overlap between placebos and original old class data, 2) does
not require any additional supervision or memory budget, and 3) significantly
outperforms a number of top-performing CIL methods, in particular when using
lower memory budgets for old class exemplars, e.g., five exemplars per class.
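To make the core idea concrete, here is a minimal sketch (not the authors' released code) of a KD loss computed on placebo images drawn from a free image stream rather than on new class data. The model interfaces, the `keep_ratio` parameter, and the confidence-based `score_placebos` filter are illustrative assumptions; the paper instead trains an online placebo-selection policy by solving an online MDP, for which this simple heuristic is only a loose stand-in.

```python
# Sketch, assuming old_model/new_model are nn.Modules that map image batches to logits.
import torch
import torch.nn.functional as F


def kd_loss(old_logits: torch.Tensor, new_logits: torch.Tensor, T: float = 2.0) -> torch.Tensor:
    """Standard distillation loss: penalize prediction inconsistencies
    between the frozen old model and the current (new) model."""
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)


@torch.no_grad()
def score_placebos(old_model: torch.nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for the learned selection policy: rate each streaming image
    by how confidently the old model responds to it (higher = better placebo)."""
    probs = F.softmax(old_model(images), dim=1)
    return probs.max(dim=1).values


def placebo_kd_step(old_model, new_model, stream_batch: torch.Tensor, keep_ratio: float = 0.5):
    """One-time feed-forward KD on the 'good' placebos of a streaming batch."""
    scores = score_placebos(old_model, stream_batch)
    k = max(1, int(keep_ratio * stream_batch.size(0)))
    good = stream_batch[scores.topk(k).indices]   # keep only highly scored placebos
    with torch.no_grad():
        old_logits = old_model(good)              # frozen old-class teacher
    new_logits = new_model(good)
    return kd_loss(old_logits, new_logits)
```

In an actual CIL phase, this term would be added to the cross-entropy loss on new class data, with the old model kept frozen throughout the phase.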
Related papers
- Towards Non-Exemplar Semi-Supervised Class-Incremental Learning [33.560003528712414]
Class-incremental learning aims to gradually recognize new classes while maintaining the discriminability of old ones.
We propose a non-exemplar semi-supervised CIL framework with contrastive learning and a semi-supervised incremental prototype classifier (Semi-IPC).
Semi-IPC learns a prototype for each class with unsupervised regularization, enabling the model to incrementally learn from partially labeled new data.
arXiv Detail & Related papers (2024-03-27T06:28:19Z)
- Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning [14.379472108242235]
We investigate exemplar-free class incremental learning (CIL) with knowledge distillation (KD) as a regularization strategy.
KD-based methods are successfully used in CIL, but they often struggle to regularize the model without access to exemplars of the training data from previous tasks.
Inspired by recent test-time adaptation methods, we introduce Teacher Adaptation (TA), a method that concurrently updates the teacher and the main models during incremental training.
arXiv Detail & Related papers (2023-08-18T13:22:59Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Class-Incremental Learning: A Survey [84.30083092434938]
Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally.
However, CIL models tend to catastrophically forget the characteristics of former classes, and their performance drastically degrades.
We provide a rigorous and unified evaluation of 17 methods in benchmark image classification tasks to find out the characteristics of different algorithms.
arXiv Detail & Related papers (2023-02-07T17:59:05Z)
- Online Hyperparameter Optimization for Class-Incremental Learning [99.70569355681174]
Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase.
An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should keep stable to retain old knowledge and keep plastic to absorb new knowledge.
We propose an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori.
arXiv Detail & Related papers (2023-01-11T17:58:51Z)
- Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation [34.460973847554364]
Class-incremental semantic segmentation (CISS) labels each pixel of an image with a corresponding object/stuff class continually.
It is crucial to learn novel classes incrementally without forgetting previously learned knowledge.
We introduce a CISS framework that alleviates the forgetting problem and facilitates learning novel classes effectively.
arXiv Detail & Related papers (2022-10-12T06:15:51Z)
- Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning [73.24988226158497]
We consider the high-impact problem of Data-Free Class-Incremental Learning (DFCIL).
We propose a novel incremental distillation strategy for DFCIL, contributing a modified cross-entropy training and importance-weighted feature distillation.
Our method results in up to a 25.1% increase in final task accuracy (absolute difference) compared to SOTA DFCIL methods for common class-incremental benchmarks.
arXiv Detail & Related papers (2021-06-17T17:56:08Z)
- Undistillable: Making A Nasty Teacher That CANNOT teach students [84.6111281091602]
This paper introduces and investigates a concept called Nasty Teacher: a specially trained teacher network that yields nearly the same performance as a normal one, but significantly degrades the performance of any student model that tries to imitate it.
We propose a simple yet effective algorithm to build the nasty teacher, called self-undermining knowledge distillation.
arXiv Detail & Related papers (2021-05-16T08:41:30Z)
- ClaRe: Practical Class Incremental Learning By Remembering Previous Class Representations [9.530976792843495]
Class Incremental Learning (CIL) aims to learn new concepts well, but not at the expense of performance and accuracy on old data.
ClaRe is an efficient solution for CIL by remembering the representations of learned classes in each increment.
ClaRe has a better generalization than prior methods thanks to producing diverse instances from the distribution of previously learned classes.
arXiv Detail & Related papers (2021-03-29T10:39:42Z)