Information-Theoretic Complementary Prompts for Improved Continual Text Classification
- URL: http://arxiv.org/abs/2505.20933v1
- Date: Tue, 27 May 2025 09:22:14 GMT
- Title: Information-Theoretic Complementary Prompts for Improved Continual Text Classification
- Authors: Duzhen Zhang, Yong Ren, Chenxing Li, Dong Yu, Tielin Zhang
- Abstract summary: We introduce Information-Theoretic Complementary Prompts (InfoComp) for Continual Text Classification.
InfoComp explicitly learns two distinct prompt spaces: P(rivate)-Prompt and S(hared)-Prompt.
Within this framework, we design two novel loss functions: (1) to strengthen the accumulation of task-specific knowledge in P-Prompt, effectively mitigating catastrophic forgetting, and (2) to enhance the retention of task-invariant knowledge in S-Prompt, improving forward knowledge transfer.
- Score: 34.30189210224955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual Text Classification (CTC) aims to continuously classify new text data over time while minimizing catastrophic forgetting of previously acquired knowledge. However, existing methods often focus on task-specific knowledge, overlooking the importance of shared, task-agnostic knowledge. Inspired by the complementary learning systems theory, which posits that humans learn continually through the interaction of two systems -- the hippocampus, responsible for forming distinct representations of specific experiences, and the neocortex, which extracts more general and transferable representations from past experiences -- we introduce Information-Theoretic Complementary Prompts (InfoComp), a novel approach for CTC. InfoComp explicitly learns two distinct prompt spaces: P(rivate)-Prompt and S(hared)-Prompt. These respectively encode task-specific and task-invariant knowledge, enabling models to sequentially learn classification tasks without relying on data replay. To promote more informative prompt learning, InfoComp uses an information-theoretic framework that maximizes mutual information between different parameters (or encoded representations). Within this framework, we design two novel loss functions: (1) to strengthen the accumulation of task-specific knowledge in P-Prompt, effectively mitigating catastrophic forgetting, and (2) to enhance the retention of task-invariant knowledge in S-Prompt, improving forward knowledge transfer. Extensive experiments on diverse CTC benchmarks show that our approach outperforms previous state-of-the-art methods.
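The abstract names the mechanism (two prompt spaces, mutual-information maximization, two losses) but gives no formulas, so the following is only a minimal sketch of the idea, assuming an InfoNCE-style lower bound as the mutual-information estimator. Every name here (ComplementaryPrompts, info_nce, the shapes and hyperparameters) is illustrative, not taken from the paper.

```python
# Hedged sketch: P-Prompt (one per task, task-specific) and S-Prompt (shared,
# task-invariant), with an InfoNCE-style bound standing in for the paper's
# mutual-information objectives. All names and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE lower bound on I(a; b): row i of `a` is the positive for row i of `b`."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                    # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)  # diagonal pairs are positives
    return F.cross_entropy(logits, targets)             # minimizing this maximizes the MI bound


class ComplementaryPrompts(nn.Module):
    """One private prompt per task plus a single shared prompt, prepended to the input."""

    def __init__(self, num_tasks: int, prompt_len: int, d_model: int):
        super().__init__()
        self.p_prompts = nn.ParameterList(              # P-Prompt: task-specific
            [nn.Parameter(torch.randn(prompt_len, d_model) * 0.02) for _ in range(num_tasks)]
        )
        self.s_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)  # S-Prompt: shared

    def forward(self, token_embeddings: torch.Tensor, task_id: int) -> torch.Tensor:
        batch = token_embeddings.size(0)
        p = self.p_prompts[task_id].unsqueeze(0).expand(batch, -1, -1)
        s = self.s_prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([s, p, token_embeddings], dim=1)  # a frozen encoder consumes this


# Toy usage: 4 sentences of 16 tokens, hidden size 32, current task id 0.
prompts = ComplementaryPrompts(num_tasks=5, prompt_len=8, d_model=32)
x = torch.randn(4, 16, 32)
print(prompts(x, task_id=0).shape)                      # torch.Size([4, 32, 32])
```

In a full training loop one would presumably add info_nce terms between, say, the current P-Prompt representations and stored past-task representations (against forgetting) and between S-Prompt representations across tasks (for transfer), weighted against the classification loss; those pairings are guesses from the abstract, not the paper's definitions.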
Related papers
- DUKAE: DUal-level Knowledge Accumulation and Ensemble for Pre-Trained Model-Based Continual Learning [19.684132921720945]
Pre-trained model-based continual learning (PTMCL) has garnered growing attention, as it enables more rapid acquisition of new knowledge.
We propose a method named DUal-level Knowledge Accumulation and Ensemble (DUKAE) that leverages both feature-level and decision-level knowledge accumulation.
Experiments on CIFAR-100, ImageNet-R, CUB-200, and Cars-196 datasets demonstrate the superior performance of our approach.
arXiv Detail & Related papers (2025-04-09T01:40:38Z)
- CSTA: Spatial-Temporal Causal Adaptive Learning for Exemplar-Free Video Class-Incremental Learning [62.69917996026769]
A class-incremental learning task requires learning and preserving both spatial appearance and temporal action involvement.
We propose a framework that equips separate adapters to learn new class patterns, accommodating the incremental information requirements unique to each class.
A causal compensation mechanism is proposed to reduce conflicts between different types of information during incremental learning and memorization.
arXiv Detail & Related papers (2025-01-13T11:34:55Z)
- Hierarchical Prompts for Rehearsal-free Continual Learning [67.37739666753008]
Continual learning endeavors to equip the model with the capability to integrate current task knowledge while mitigating the forgetting of past task knowledge.
Inspired by prompt tuning, prompt-based methods maintain a frozen backbone and train only lightweight learnable prompts.
This paper introduces a novel rehearsal-free paradigm for continual learning termed Hierarchical Prompts (H-Prompts).
arXiv Detail & Related papers (2024-01-21T16:59:44Z)
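The prompt-based entries in this list (H-Prompts above, Continual Prompt Tuning further down) rest on the same basic mechanism: freeze the pre-trained backbone and train only a small prompt plus a light head. A minimal, self-contained sketch of that pattern, using a stand-in nn.TransformerEncoder as the backbone; all names and sizes are illustrative, not from either paper.

```python
# Hedged sketch of prompt tuning with a frozen backbone: only the prompt
# and the classification head receive gradients.
import torch
import torch.nn as nn


class PromptTunedClassifier(nn.Module):
    def __init__(self, d_model: int = 64, prompt_len: int = 8, num_classes: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():            # freeze the backbone
            p.requires_grad_(False)
        self.prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)  # trainable prompt
        self.head = nn.Linear(d_model, num_classes)     # light trainable head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        prompt = self.prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        h = self.backbone(torch.cat([prompt, x], dim=1))
        return self.head(h[:, 0])                       # read the class from the first prompt slot


model = PromptTunedClassifier()
logits = model(torch.randn(2, 12, 64))                  # (batch, seq, d_model) token embeddings
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(logits.shape, trainable)                          # only prompt + head parameters train
```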
- Prompt Learning With Knowledge Memorizing Prototypes For Generalized Few-Shot Intent Detection [22.653220906899612]
Generalized Few-Shot Intent Detection (GFSID) is challenging and realistic because it needs to categorize both seen and novel intents simultaneously.
Previous GFSID methods rely on the episodic learning paradigm.
We propose to convert the GFSID task into the class incremental learning paradigm.
arXiv Detail & Related papers (2023-09-10T09:16:38Z)
- Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
- Dual Semantic Knowledge Composed Multimodal Dialog Systems [114.52730430047589]
We propose a novel multimodal task-oriented dialog system named MDS-S2.
It acquires context-related attribute and relation knowledge from the knowledge base.
We also devise a set of latent query variables to distill the semantic information from the composed response representation.
arXiv Detail & Related papers (2023-05-17T06:33:26Z)
- Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular continual learning benchmarks: Split CIFAR-100, Split SVHN, Split CUB-200, and Split Tiny-ImageNet.
arXiv Detail & Related papers (2023-02-02T04:03:38Z)
- Continual Prompt Tuning for Dialog State Tracking [58.66412648276873]
A desirable dialog system should be able to continually learn new skills without forgetting old ones.
We present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledge transfer between tasks.
arXiv Detail & Related papers (2022-03-13T13:22:41Z)
- Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated Recurrent Memory Network [54.735400754548635]
Aspect-level sentiment classification (ASC) aims to predict the fine-grained sentiment polarity towards a given aspect mentioned in a review.
Despite recent advances in ASC, enabling machines to precisely infer aspect sentiments is still challenging.
This paper tackles two challenges in ASC: (1) due to a lack of aspect knowledge, aspect representations are inadequate for capturing an aspect's exact meaning and property information; (2) prior works capture either local syntactic information or global relational information, and missing either one leads to insufficient syntactic knowledge.
arXiv Detail & Related papers (2021-08-05T03:39:30Z)