Dynamic Prompt Adjustment for Multi-Label Class-Incremental Learning
- URL: http://arxiv.org/abs/2501.00340v2
- Date: Fri, 03 Jan 2025 03:22:32 GMT
- Title: Dynamic Prompt Adjustment for Multi-Label Class-Incremental Learning
- Authors: Haifeng Zhao, Yuguang Jin, Leilei Ma,
- Abstract summary: Visual language models such as CLIP have achieved good results in classification tasks.
We integrate an improved data replay mechanism and prompt loss to curb knowledge forgetting.
Our method has substantially improved the performance of MLCIL tasks across multiple benchmark datasets.
- Score: 0.13654846342364302
- License:
- Abstract: Significant advancements have been made in single label incremental learning (SLCIL),yet the more practical and challenging multi label class incremental learning (MLCIL) remains understudied. Recently,visual language models such as CLIP have achieved good results in classification tasks. However,directly using CLIP to solve MLCIL issue can lead to catastrophic forgetting. To tackle this issue, we integrate an improved data replay mechanism and prompt loss to curb knowledge forgetting. Specifically,our model enhances the prompt information to better adapt to multi-label classification tasks and employs confidence-based replay strategy to select representative samples. Moreover, the prompt loss significantly reduces the model's forgetting of previous knowledge. Experimental results demonstrate that our method has substantially improved the performance of MLCIL tasks across multiple benchmark datasets,validating its effectiveness.
Related papers
- Revitalizing Reconstruction Models for Multi-class Anomaly Detection via Class-Aware Contrastive Learning [19.114941437668705]
We propose a plug-and-play modification by incorporating class-aware contrastive learning (CL)
Experiments across four datasets verify the effectiveness of our approach, yielding significant improvements and superior performance compared to advanced methods.
arXiv Detail & Related papers (2024-12-06T04:31:09Z) - Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation [21.20806568508201]
We show how to leverage class text information to mitigate distribution drifts encountered by vision-language models (VLMs) during test-time inference.
We propose to generate pseudo-labels for the test-time samples by exploiting generic class text embeddings as fixed centroids of a label assignment problem.
Experiments on multiple popular test-time adaptation benchmarks presenting diverse complexity empirically show the superiority of CLIP-OT.
arXiv Detail & Related papers (2024-11-26T00:15:37Z) - Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data [54.934578742209716]
In real-world NLP applications, Large Language Models (LLMs) offer promising solutions due to their extensive training on vast datasets.
LLKD is an adaptive sample selection method that incorporates signals from both the teacher and student.
Our comprehensive experiments show that LLKD achieves superior performance across various datasets with higher data efficiency.
arXiv Detail & Related papers (2024-11-12T18:57:59Z) - SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem from the lens of Seq FT.
Considering that the overly fast representation learning and the biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Alignment (S++) framework.
Our approach involves a Slow Learner to selectively reduce the learning rate of backbone parameters, and a Alignment to align the disjoint classification layers in a post-hoc fashion.
arXiv Detail & Related papers (2024-08-15T17:50:07Z) - Unlearn What You Want to Forget: Efficient Unlearning for LLMs [92.51670143929056]
Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data.
This process might suffer from privacy issues and violations of data protection regulations.
We propose an efficient unlearning framework that could efficiently update LLMs without having to retrain the whole model after data removals.
arXiv Detail & Related papers (2023-10-31T03:35:59Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - SLCA: Slow Learner with Classifier Alignment for Continual Learning on a
Pre-trained Model [73.80068155830708]
We present an extensive analysis for continual learning on a pre-trained model (CLPM)
We propose a simple but extremely effective approach named Slow Learner with Alignment (SLCA)
Across a variety of scenarios, our proposal provides substantial improvements for CLPM.
arXiv Detail & Related papers (2023-03-09T08:57:01Z) - Multimodal Parameter-Efficient Few-Shot Class Incremental Learning [1.9220716793379256]
Few-Shot Class Incremental Learning (FSCIL) is a challenging continual learning task, where limited training examples are available during several learning sessions.
To succeed in this task, it is necessary to avoid over-fitting new classes caused by biased distributions in the few-shot training sets.
CPE-CLIP significantly improves FSCIL performance compared to state-of-the-art proposals while also drastically reducing the number of learnable parameters and training costs.
arXiv Detail & Related papers (2023-03-08T17:34:15Z) - Knowledge Restore and Transfer for Multi-label Class-Incremental
Learning [34.378828633726854]
We propose a knowledge restore and transfer (KRT) framework for multi-label class-incremental learning (MLCIL)
KRT includes a dynamic pseudo-label (DPL) module to restore the old class knowledge and an incremental cross-attention(ICA) module to save session-specific knowledge and transfer old class knowledge to the new model sufficiently.
Experimental results on MS-COCO and PASCAL VOC datasets demonstrate the effectiveness of our method for improving recognition performance and mitigating forgetting.
arXiv Detail & Related papers (2023-02-26T15:34:05Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
Main challenge comes from the "catastrophic forgetting" issue -- the inability to well remember the learnt knowledge while learning the new ones.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.