SS-IL: Separated Softmax for Incremental Learning
- URL: http://arxiv.org/abs/2003.13947v3
- Date: Tue, 21 Jun 2022 06:19:45 GMT
- Title: SS-IL: Separated Softmax for Incremental Learning
- Authors: Hongjoon Ahn, Jihwan Kwak, Subin Lim, Hyeonsu Bang, Hyojun Kim and
Taesup Moon
- Abstract summary: We consider the class incremental learning (CIL) problem, in which a learning agent continuously learns new classes from incrementally arriving training data batches.
The main challenge of the problem is catastrophic forgetting.
We propose a new method, dubbed Separated Softmax for Incremental Learning (SS-IL), which consists of a separated softmax (SS) output layer combined with task-wise knowledge distillation (TKD) to resolve the score bias behind this forgetting.
- Score: 15.161214516549844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the class incremental learning (CIL) problem, in which a learning
agent continuously learns new classes from incrementally arriving training data
batches and aims to predict well on all the classes learned so far. The main
challenge of the problem is catastrophic forgetting, and for exemplar-memory based
CIL methods, the forgetting is generally known to be caused by the classification
score bias injected by the data imbalance between the new classes and the old
classes (in the exemplar memory). While several methods have been proposed to
correct such score bias with additional post-processing, e.g., score re-scaling or
balanced fine-tuning, no systematic analysis of the root cause of the bias has been
done. To that end, we analyze that computing the softmax probabilities by combining
the output scores for all old and new classes could be the main cause of the bias.
We then propose a new method, dubbed Separated Softmax for Incremental Learning
(SS-IL), which consists of a separated softmax (SS) output layer combined with
task-wise knowledge distillation (TKD) to resolve such bias. Through extensive
experiments on several large-scale CIL benchmark datasets, we show that SS-IL
achieves strong state-of-the-art accuracy by attaining much more balanced
prediction scores across old and new classes, without any additional
post-processing.
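To make the separated-softmax idea concrete, below is a minimal PyTorch sketch of how the SS cross-entropy and a task-wise distillation term could be computed. The function names, the per-task class splits, and the temperature value are illustrative assumptions based only on the abstract, not the authors' reference implementation.

```python
# Minimal sketch of a separated-softmax (SS) loss and task-wise knowledge
# distillation (TKD), assuming a single logit vector over all classes seen so far.
import torch
import torch.nn.functional as F

def ss_loss(logits, labels, old_class_count):
    """Cross-entropy computed separately over old-class and new-class logit blocks.

    logits:  (batch, total_classes) scores from the current model
    labels:  (batch,) integer class labels
    old_class_count: number of classes seen in previous tasks
    """
    old_logits = logits[:, :old_class_count]
    new_logits = logits[:, old_class_count:]
    is_new = labels >= old_class_count

    loss = logits.new_zeros(())
    if is_new.any():
        # New-class samples: softmax is normalized only over the new-class block,
        # so old-class scores do not compete in the normalization.
        loss = loss + F.cross_entropy(new_logits[is_new], labels[is_new] - old_class_count)
    if (~is_new).any():
        # Old-class exemplars: softmax is normalized only over the old-class block.
        loss = loss + F.cross_entropy(old_logits[~is_new], labels[~is_new])
    return loss

def tkd_loss(logits, prev_logits, task_sizes, temperature=2.0):
    """Task-wise knowledge distillation: KL divergence between per-task softmax
    blocks of the current and previous models (a common KD formulation, assumed here)."""
    loss, start = logits.new_zeros(()), 0
    for size in task_sizes:  # task_sizes lists the number of classes per old task
        cur = F.log_softmax(logits[:, start:start + size] / temperature, dim=1)
        prev = F.softmax(prev_logits[:, start:start + size] / temperature, dim=1)
        loss = loss + F.kl_div(cur, prev, reduction="batchmean")
        start += size
    return loss
```

At inference, prediction can simply take the argmax over all logits; per the abstract, the balanced scores obtained this way are intended to remove the need for additional post-processing such as score re-scaling.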
Related papers
- Gradient Reweighting: Towards Imbalanced Class-Incremental Learning [8.438092346233054]
Class-Incremental Learning (CIL) trains a model to continually recognize new classes from non-stationary data.
A major challenge of CIL arises when it is applied to real-world data with non-uniform distributions.
We show that this dual imbalance issue causes skewed gradient updates with biased weights in FC layers, thus inducing over/under-fitting and catastrophic forgetting in CIL.
arXiv Detail & Related papers (2024-02-28T18:08:03Z) - Few-Shot Class-Incremental Learning with Prior Knowledge [94.95569068211195]
We propose Learning with Prior Knowledge (LwPK) to enhance the generalization ability of the pre-trained model.
Experimental results indicate that LwPK effectively enhances the model resilience against catastrophic forgetting.
arXiv Detail & Related papers (2024-02-02T08:05:35Z) - Bias Mitigating Few-Shot Class-Incremental Learning [17.185744533050116]
Few-shot class-incremental learning aims at recognizing novel classes continually with limited novel class samples.
Recent methods somewhat alleviate the accuracy imbalance between base and incremental classes by fine-tuning the feature extractor in the incremental sessions.
We propose a novel method to mitigate model bias of the FSCIL problem during training and inference processes.
arXiv Detail & Related papers (2024-02-01T10:37:41Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay the data of experienced tasks when learning new tasks.
However, replaying real data is often impractical due to memory constraints or data privacy issues.
As a replacement, data-free replay methods have been proposed, which invert samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z) - A Comparative Study of Calibration Methods for Imbalanced Class
Incremental Learning [10.680349952226935]
We study the problem of learning incrementally from imbalanced datasets.
We use a bounded memory to store exemplars of old classes across incremental states.
We show that simpler vanilla fine-tuning is a stronger backbone for imbalanced incremental learning algorithms.
arXiv Detail & Related papers (2022-02-01T12:56:17Z) - Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes severe biases of the head classes (with majority samples) against the tailed ones.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z) - You Only Need End-to-End Training for Long-Tailed Recognition [8.789819609485225]
Cross-entropy loss tends to produce highly correlated features on imbalanced data.
We propose two novel modules, the Block-based Relatively Balanced Batch Sampler (B3RS) and Batch Embedded Training (BET).
Experimental results on the long-tailed classification benchmarks, CIFAR-LT and ImageNet-LT, demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-12-11T11:44:09Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)