InfoCL: Alleviating Catastrophic Forgetting in Continual Text
Classification from An Information Theoretic Perspective
- URL: http://arxiv.org/abs/2310.06362v1
- Date: Tue, 10 Oct 2023 07:00:13 GMT
- Title: InfoCL: Alleviating Catastrophic Forgetting in Continual Text
Classification from An Information Theoretic Perspective
- Authors: Yifan Song, Peiyi Wang, Weimin Xiong, Dawei Zhu, Tianyu Liu, Zhifang
Sui, Sujian Li
- Abstract summary: We focus on continual text classification under the class-incremental setting.
Recent studies have identified the severe performance decrease on analogous classes as a key factor for forgetting.
We propose a novel replay-based continual text classification method, InfoCL.
- Score: 44.961805748830066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning (CL) aims to constantly learn new knowledge over time
while avoiding catastrophic forgetting on old tasks. We focus on continual text
classification under the class-incremental setting. Recent CL studies have
identified the severe performance decrease on analogous classes as a key factor
for catastrophic forgetting. In this paper, through an in-depth exploration of
the representation learning process in CL, we discover that the compression
effect of the information bottleneck leads to confusion on analogous classes.
To enable the model to learn more sufficient representations, we propose a novel
replay-based continual text classification method, InfoCL. Our approach
utilizes fast-slow and current-past contrastive learning to perform mutual
information maximization and better recover the previously learned
representations. In addition, InfoCL incorporates an adversarial memory
augmentation strategy to alleviate the overfitting problem of replay.
Experimental results demonstrate that InfoCL effectively mitigates forgetting
and achieves state-of-the-art performance on three text classification tasks.
The code is publicly available at https://github.com/Yifan-Song793/InfoCL.
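As a rough, unofficial illustration of the fast-slow mutual information maximization described in the abstract, the sketch below pairs a gradient-updated "fast" encoder with a momentum-updated "slow" copy and trains them with an InfoNCE-style contrastive loss; minimizing this loss maximizes a lower bound on the mutual information between the two views. The class name FastSlowContrast, the momentum and temperature values, and the toy linear encoder are illustrative assumptions, not the authors' implementation; InfoCL's actual fast-slow and current-past objectives and the adversarial memory augmentation are in the linked repository.

# Minimal sketch (not the official InfoCL code): InfoNCE-style contrastive
# loss between a "fast" encoder and a momentum-updated "slow" encoder.
# Class name, momentum, and temperature are illustrative assumptions.
import copy
import torch
import torch.nn.functional as F

class FastSlowContrast(torch.nn.Module):
    def __init__(self, encoder, momentum=0.99, temperature=0.1):
        super().__init__()
        self.fast = encoder                 # updated by gradients
        self.slow = copy.deepcopy(encoder)  # updated only by moving average
        for p in self.slow.parameters():
            p.requires_grad = False
        self.m = momentum
        self.t = temperature

    @torch.no_grad()
    def _update_slow(self):
        # Exponential moving average: slow <- m * slow + (1 - m) * fast
        for ps, pf in zip(self.slow.parameters(), self.fast.parameters()):
            ps.data.mul_(self.m).add_(pf.data, alpha=1 - self.m)

    def forward(self, x):
        self._update_slow()
        q = F.normalize(self.fast(x), dim=-1)      # queries from fast encoder
        with torch.no_grad():
            k = F.normalize(self.slow(x), dim=-1)  # keys from slow encoder
        # InfoNCE: each fast view should match its own slow view among the batch;
        # minimizing this cross-entropy maximizes a lower bound on mutual information.
        logits = q @ k.t() / self.t
        targets = torch.arange(q.size(0), device=q.device)
        return F.cross_entropy(logits, targets)

# Toy usage: a linear "encoder" over 768-d features, batch of 8 samples
encoder = torch.nn.Linear(768, 128)
loss_fn = FastSlowContrast(encoder)
loss = loss_fn(torch.randn(8, 768))
loss.backward()

In a replay-based setting, the same loss would be applied to both new-task batches and replayed memory samples, so that the representations of previously seen (analogous) classes are repeatedly re-aligned rather than compressed away.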
Related papers
- TS-ACL: A Time Series Analytic Continual Learning Framework for Privacy-Preserving and Class-Incremental Pattern Recognition [14.108911377558242]
TS-ACL is a novel framework for privacy-preserving and class-incremental pattern recognition.
It transforms each update of the model into a gradient-free analytical learning process with a closed-form solution.
It simultaneously achieves non-forgetting, privacy preservation, and lightweight consumption, making it widely suitable for various applications.
arXiv Detail & Related papers (2024-10-21T12:34:02Z) - ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning [54.68180752416519]
Panoptic segmentation is a cutting-edge computer vision task.
We introduce a novel and efficient method for continual panoptic segmentation based on Visual Prompt Tuning, dubbed ECLIPSE.
Our approach involves freezing the base model parameters and fine-tuning only a small set of prompt embeddings, addressing both catastrophic forgetting and plasticity.
arXiv Detail & Related papers (2024-03-29T11:31:12Z) - AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning [53.32576252950481]
Continual learning aims to enable a model to incrementally learn knowledge from sequentially arrived data.
In this paper, we propose a non-incremental learner, named AttriCLIP, to incrementally extract knowledge of new classes or tasks.
arXiv Detail & Related papers (2023-05-19T07:39:17Z) - RepCL: Exploring Effective Representation for Continual Text
Classification [34.33543812253366]
We focus on continual text classification under the class-incremental setting.
Recent CL studies find that the representations learned in one task may not be effective for other tasks.
We propose a novel replay-based continual text classification method, RepCL.
arXiv Detail & Related papers (2023-05-12T07:32:00Z) - Adversarial Training with Complementary Labels: On the Benefit of
Gradually Informative Attacks [119.38992029332883]
Adversarial training with imperfect supervision is significant but receives limited attention.
We propose a new learning strategy using gradually informative attacks.
Experiments are conducted to demonstrate the effectiveness of our method on a range of benchmarked datasets.
arXiv Detail & Related papers (2022-11-01T04:26:45Z) - Beyond Supervised Continual Learning: a Review [69.9674326582747]
Continual Learning (CL) is a flavor of machine learning where the usual assumption of stationary data distribution is relaxed or omitted.
Changes in the data distribution can cause the so-called catastrophic forgetting (CF) effect: an abrupt loss of previous knowledge.
This article reviews literature that studies CL in other settings, such as learning with reduced supervision, fully unsupervised learning, and reinforcement learning.
arXiv Detail & Related papers (2022-08-30T14:44:41Z) - Don't Stop Learning: Towards Continual Learning for the CLIP Model [21.212839450030838]
The Contrastive Language-Image Pre-training (CLIP) Model is a recently proposed large-scale pre-train model.
This work conducts a systematic study on the continual learning issue of the CLIP model.
We propose a new algorithm, dubbed Learning without Forgetting via Replayed Vocabulary (VR-LwF), which proves effective at alleviating the forgetting issue of the CLIP model.
arXiv Detail & Related papers (2022-07-19T13:03:14Z) - Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot clearly discern old class data from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z) - Prototypes-Guided Memory Replay for Continual Learning [13.459792148030717]
Continual learning (CL) refers to a machine learning paradigm that uses only a small amount of training samples and previously learned knowledge to enhance learning performance.
The major difficulty in CL is catastrophic forgetting of previously learned tasks, caused by shifts in data distributions.
We propose a memory-efficient CL method, incorporating it into an online meta-learning model.
arXiv Detail & Related papers (2021-08-28T13:00:57Z)