PromptFusion: Decoupling Stability and Plasticity for Continual Learning
- URL: http://arxiv.org/abs/2303.07223v2
- Date: Wed, 10 Jul 2024 08:23:20 GMT
- Title: PromptFusion: Decoupling Stability and Plasticity for Continual Learning
- Authors: Haoran Chen, Zuxuan Wu, Xintong Han, Menglin Jia, Yu-Gang Jiang
- Abstract summary: We propose a prompt-tuning-based method termed PromptFusion to enable the decoupling of stability and plasticity.
Specifically, PromptFusion consists of a carefully designed Stabilizer module that deals with catastrophic forgetting and a Booster module that learns new knowledge concurrently.
Experiments show that both PromptFusion and PromptFusion-Lite achieve promising results on popular continual learning datasets for class-incremental and domain-incremental settings.
- Score: 83.68586386842105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current research on continual learning mainly focuses on relieving catastrophic forgetting, and much of that success comes at the cost of limiting the performance on newly incoming tasks. Such a trade-off is referred to as the stability-plasticity dilemma and is a more general and challenging problem for continual learning. The inherent conflict between these two concepts makes it seemingly impossible to devise a satisfactory solution to both of them simultaneously. Therefore, we ask, "is it possible to divide them into two separate problems to conquer them independently?". To this end, we propose a prompt-tuning-based method termed PromptFusion to enable the decoupling of stability and plasticity. Specifically, PromptFusion consists of a carefully designed Stabilizer module that deals with catastrophic forgetting and a Booster module that learns new knowledge concurrently. Furthermore, to address the computational overhead brought by the additional architecture, we propose PromptFusion-Lite, which improves PromptFusion by dynamically determining whether to activate both modules for each input image. Extensive experiments show that both PromptFusion and PromptFusion-Lite achieve promising results on popular continual learning datasets for class-incremental and domain-incremental settings. Especially on Split-ImageNet-R, one of the most challenging datasets for class-incremental learning, our method exceeds state-of-the-art prompt-based methods by more than 5% in accuracy, with PromptFusion-Lite using 14.8% less computational resources than PromptFusion.
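A minimal PyTorch sketch of the decoupling idea described in the abstract. The variable names (stab_prompts, boost_prompts), the per-image gate, and the way the gate mixes the two prompt sets are illustrative assumptions rather than the paper's implementation; the real method works by prompt-tuning a frozen pre-trained backbone.

```python
import torch
import torch.nn as nn

class PromptFusionSketch(nn.Module):
    """Toy sketch: decoupled stability/plasticity prompts with a per-image gate."""

    def __init__(self, embed_dim: int = 768, prompt_len: int = 10):
        super().__init__()
        # "Stabilizer"-style prompts: updated conservatively to limit forgetting
        self.stab_prompts = nn.Parameter(torch.zeros(prompt_len, embed_dim))
        # "Booster"-style prompts: updated freely to absorb new-task knowledge
        self.boost_prompts = nn.Parameter(torch.zeros(prompt_len, embed_dim))
        # Lite-style gate (hypothetical): decides per input whether to also use the booster
        self.gate = nn.Linear(embed_dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, embed_dim) pooled features from a frozen pre-trained backbone
        use_boost = torch.sigmoid(self.gate(feats)).unsqueeze(-1)     # (batch, 1, 1)
        prompts = self.stab_prompts + use_boost * self.boost_prompts  # (batch, len, dim)
        # A real implementation would prepend these prompts to the token sequence;
        # returning a pooled combination keeps the sketch self-contained.
        return feats + prompts.mean(dim=1)

feats = torch.randn(4, 768)
print(PromptFusionSketch()(feats).shape)   # torch.Size([4, 768])
```

In this toy version the gate plays the role that PromptFusion-Lite's dynamic activation plays: inputs the gate scores low effectively keep only the stability prompts.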
Related papers
- ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [58.99648692413168]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios.
We propose ControlFusion, which adaptively neutralizes composite degradations.
In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z) - MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion [48.443460251524776]
MathFusion is a novel framework that enhances mathematical reasoning through cross-problem instruction synthesis.
MathFusion achieves substantial improvements in mathematical reasoning while maintaining high data efficiency.
arXiv Detail & Related papers (2025-03-20T15:00:41Z) - TinyFusion: Diffusion Transformers Learned Shallow [52.96232442322824]
Diffusion Transformers have demonstrated remarkable capabilities in image generation but often come with excessive parameterization.
We present TinyFusion, a depth pruning method designed to remove redundant layers from diffusion transformers via end-to-end learning.
Experiments with DiT-XL show that TinyFusion can craft a shallow diffusion transformer at less than 7% of the pre-training cost, achieving a 2x speedup with an FID score of 2.86.
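The blurb above describes learning which transformer layers to drop end-to-end. Below is a hedged sketch of that general idea, not TinyFusion's actual parameterization: each residual block gets a learnable keep-logit, and a straight-through estimator lets the hard keep/drop decision receive gradients.

```python
import torch
import torch.nn as nn

class DepthPrunedStack(nn.Module):
    """Illustrative end-to-end depth pruning: each block has a learnable keep-probability."""

    def __init__(self, num_blocks: int = 12, dim: int = 64):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_blocks))
        self.keep_logits = nn.Parameter(torch.zeros(num_blocks))  # learned with the weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        keep_prob = torch.sigmoid(self.keep_logits)
        for block, p in zip(self.blocks, keep_prob):
            # Hard 0/1 decision in the forward pass, gradient flows through the soft value
            # (straight-through estimator); blocks whose gate collapses to 0 can be removed.
            hard = (p > 0.5).float()
            gate = hard + p - p.detach()
            x = x + gate * block(x)          # residual block, skipped when gate is 0
        return x

x = torch.randn(2, 64)
print(DepthPrunedStack()(x).shape)           # torch.Size([2, 64])
```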
arXiv Detail & Related papers (2024-12-02T07:05:39Z) - DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning [23.878495627964146]
Continual learning aims to equip models with the ability to retain previously learned knowledge like a human.
Existing methods usually overlook the information leakage that arises because the evaluation data may already have been seen by the pre-trained model.
In this paper, we propose a new LoRA-based rehearsal-free method named DESIRE.
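DESIRE is described as LoRA-based and rehearsal-free; the sketch below shows only the generic LoRA building block such methods start from (a frozen linear layer plus a trainable low-rank update). The rank, the scaling, and how adapters from different tasks are consolidated are assumptions left to the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update (generic LoRA)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # pre-trained weights stay fixed
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the low-rank path is updated per task, so no rehearsal buffer is needed;
        # task-specific adapters can later be merged or consolidated.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(4, 768)).shape)      # torch.Size([4, 768])
```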
arXiv Detail & Related papers (2024-11-28T13:54:01Z) - Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models [27.477136474888564]
We introduce OptFusion, a method that automates the learning of fusion, encompassing both the connection learning and the operation selection.
Our experiments are conducted over three large-scale datasets.
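A hedged sketch of what "learning the fusion" can look like: candidate operations are combined with softmax weights over learnable logits, and the best-scoring operation can be retained after search. The candidate set and the continuous relaxation here are assumptions, not OptFusion's exact search space.

```python
import torch
import torch.nn as nn

class FusionOpSelector(nn.Module):
    """Pick among candidate fusion operations via a softmax-weighted relaxation."""

    def __init__(self, dim: int = 32):
        super().__init__()
        self.op_logits = nn.Parameter(torch.zeros(3))   # one logit per candidate op
        self.proj = nn.Linear(2 * dim, dim)             # used by the concat candidate

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        candidates = torch.stack([
            a + b,                                       # additive fusion
            a * b,                                       # elementwise product
            self.proj(torch.cat([a, b], dim=-1)),        # concat + projection
        ], dim=0)
        weights = torch.softmax(self.op_logits, dim=0).view(-1, 1, 1)
        # During search all ops contribute; afterwards the argmax op can be kept.
        return (weights * candidates).sum(dim=0)

a, b = torch.randn(8, 32), torch.randn(8, 32)
print(FusionOpSelector()(a, b).shape)                    # torch.Size([8, 32])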
arXiv Detail & Related papers (2024-11-24T06:21:59Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, only a minimal number of late pre-trained layers is used, which reduces the peak memory overhead.
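A rough sketch of the two-route idea summarized above. The anti-redundancy weighting below (downweighting intermediate features close to their mean) is a stand-in invented for illustration; only the overall shape, consolidating frozen intermediate outputs early and training a small head late, reflects the summary.

```python
import torch
import torch.nn as nn

def consolidate_intermediates(feats: list[torch.Tensor]) -> torch.Tensor:
    """Early route (sketch): merge intermediate layer outputs, downweighting redundancy."""
    stacked = torch.stack(feats, dim=0)                  # (num_layers, batch, dim)
    mean = stacked.mean(dim=0, keepdim=True)
    # Naive anti-redundancy proxy: layers closer to the running mean get smaller weights.
    redundancy = torch.cosine_similarity(stacked, mean, dim=-1)   # (num_layers, batch)
    weights = torch.softmax(-redundancy, dim=0).unsqueeze(-1)
    return (weights * stacked).sum(dim=0)                # (batch, dim)

# Late route (sketch): only a small head on the consolidated features is trained, so
# backpropagation never touches the frozen early layers and peak memory stays low.
head = nn.Linear(256, 10)
feats = [torch.randn(4, 256) for _ in range(12)]         # pretend intermediate outputs
print(head(consolidate_intermediates(feats)).shape)      # torch.Size([4, 10])
```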
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - A streamlined Approach to Multimodal Few-Shot Class Incremental Learning
for Fine-Grained Datasets [23.005760505169803]
Few-shot Class-Incremental Learning (FSCIL) poses the challenge of retaining prior knowledge while learning from limited new data streams.
We propose two components: Session-Specific Prompts (SSP), which enhance the separability of image-text embeddings across sessions, and a hyperbolic distance that compresses representations of image-text pairs within the same class while expanding those from different classes, leading to better representations.
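The hyperbolic component can be illustrated with the standard Poincare-ball distance; whether the paper uses exactly this formulation is an assumption.

```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Distance in the Poincare ball: distances grow rapidly toward the boundary, which is
    what makes hyperbolic embeddings attractive for hierarchy-like class structure."""
    sq_u = (u * u).sum(dim=-1).clamp(max=1 - eps)
    sq_v = (v * v).sum(dim=-1).clamp(max=1 - eps)
    sq_diff = ((u - v) ** 2).sum(dim=-1)
    x = 1 + 2 * sq_diff / ((1 - sq_u) * (1 - sq_v))
    return torch.acosh(x.clamp(min=1 + eps))

u = torch.randn(4, 128) * 0.05        # embeddings assumed to lie inside the unit ball
v = torch.randn(4, 128) * 0.05
print(poincare_distance(u, v).shape)  # torch.Size([4])
```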
arXiv Detail & Related papers (2024-03-10T19:50:03Z) - ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion [22.164620956284466]
Retrieval-based augmentations (RA) incorporating knowledge from an external database into language models have greatly succeeded in various knowledge-intensive (KI) tasks.
Existing works focus on concatenating retrievals with inputs to improve model performance.
This paper proposes a new paradigm of RA named ReFusion, a computation-efficient retrieval representation fusion with bi-level optimization.
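A sketch of fusing retrieval representations into hidden states rather than concatenating retrieved text to the input, which is the contrast the summary draws. The cross-attention fusion site and the learnable gate are assumptions; the paper's bi-level optimization of where and how to fuse is not reproduced.

```python
import torch
import torch.nn as nn

class RetrievalFusion(nn.Module):
    """Sketch: fuse retrieved-passage embeddings into token hidden states, keeping the
    input sequence length fixed instead of growing it with retrieved text."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))   # how much retrieval to mix in

    def forward(self, hidden: torch.Tensor, retrieved: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, dim) token states; retrieved: (batch, k, dim) passage embeddings
        fused, _ = self.attn(hidden, retrieved, retrieved)
        return hidden + torch.tanh(self.gate) * fused

hidden = torch.randn(2, 16, 768)
retrieved = torch.randn(2, 5, 768)
print(RetrievalFusion()(hidden, retrieved).shape)   # torch.Size([2, 16, 768])
```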
arXiv Detail & Related papers (2024-01-04T07:39:26Z) - ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss
via Meta-Learning [17.91346343984845]
We introduce a unified image fusion framework based on meta-learning, named ReFusion.
ReFusion employs a parameterized loss function, dynamically adjusted by the training framework according to the specific scenario and task.
It is capable of adapting to various tasks, including infrared-visible, medical, multi-focus, and multi-exposure image fusion.
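A minimal sketch of a parameterized fusion loss: the weights over loss terms are themselves learnable, so an outer (meta) loop can adapt them per scenario instead of hand-tuning. The two example terms (intensity and gradient consistency) and the softmax weighting are common image-fusion choices assumed here, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParameterizedFusionLoss(nn.Module):
    """Sketch: a fusion loss whose term weights are learnable and meta-adaptable."""

    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))   # weights over intensity / gradient terms

    def forward(self, fused, src_a, src_b):
        w = torch.softmax(self.logits, dim=0)
        intensity = F.l1_loss(fused, torch.maximum(src_a, src_b))     # keep salient intensity
        gradient = F.l1_loss(fused.diff(dim=-1), src_a.diff(dim=-1))  # keep texture detail
        return w[0] * intensity + w[1] * gradient

loss_fn = ParameterizedFusionLoss()
a, b, fused = torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32)
print(loss_fn(fused, a, b).item())
```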
arXiv Detail & Related papers (2023-12-13T07:40:39Z) - Continual Learning through Networks Splitting and Merging with
Dreaming-Meta-Weighted Model Fusion [20.74264925323055]
It is challenging to balance a network's stability and plasticity in continual learning scenarios.
We propose Split2MetaFusion which can achieve better trade-off by employing a two-stage strategy.
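Merge-based continual learners ultimately combine a stability-oriented and a plasticity-oriented set of weights. The sketch below shows only plain parameter-wise interpolation with a single scalar; Split2MetaFusion's dreaming-meta-weighted fusion replaces that scalar with learned, data-driven weights.

```python
import torch
import torch.nn as nn

def merge_state_dicts(stable: dict, plastic: dict, alpha: float = 0.5) -> dict:
    """Sketch: parameter-wise weighted fusion of two checkpoints (one scalar weight
    stands in for the learned weighting used by merge-based continual learners)."""
    return {k: alpha * stable[k] + (1 - alpha) * plastic[k] for k in stable}

old_model, new_model = nn.Linear(8, 8), nn.Linear(8, 8)
merged = nn.Linear(8, 8)
merged.load_state_dict(merge_state_dicts(old_model.state_dict(), new_model.state_dict()))
print(merged.weight.shape)    # torch.Size([8, 8])
```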
arXiv Detail & Related papers (2023-12-12T09:02:56Z) - Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and
Beyond [50.556961575275345]
We build an image fusion module to fuse complementary characteristics and cascade dual task-related modules.
We develop an efficient first-order approximation to compute corresponding gradients and present dynamic weighted aggregation to balance the gradients for fusion learning.
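A toy version of dynamically weighting per-task gradients before aggregation, so that no downstream task dominates the shared fusion network. The inverse-norm weighting is one plausible choice made up for this sketch, not the paper's rule.

```python
import torch

def aggregate_gradients(task_grads: list[torch.Tensor], eps: float = 1e-8) -> torch.Tensor:
    """Sketch: weight per-task gradients (here by inverse norm) before a shared update."""
    norms = torch.stack([g.norm() for g in task_grads])
    weights = 1.0 / (norms + eps)
    weights = weights / weights.sum()
    return sum(w * g for w, g in zip(weights, task_grads))

g_detection = torch.randn(1000)      # flattened gradient from a detection head
g_segmentation = torch.randn(1000)   # flattened gradient from a segmentation head
print(aggregate_gradients([g_detection, g_segmentation]).shape)   # torch.Size([1000])
```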
arXiv Detail & Related papers (2023-05-11T10:55:34Z) - Efficient Multimodal Fusion via Interactive Prompting [62.08292938484994]
Large-scale pre-training has brought unimodal fields such as computer vision and natural language processing to a new era.
We propose an efficient and flexible multimodal fusion method, namely PMF, tailored for fusing unimodally pre-trained transformers.
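A hedged sketch of prompt-based fusion of frozen unimodal encoders: only a handful of fusion prompts and a small head are trained while both backbones stay fixed. PMF's layer-wise interactive prompts are simplified here to a single cross-attention read-out.

```python
import torch
import torch.nn as nn

class PromptedFusionHead(nn.Module):
    """Sketch: learnable fusion prompts attend over frozen vision and language features."""

    def __init__(self, dim: int = 512, num_prompts: int = 4, num_heads: int = 8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, 2)

    def forward(self, vis_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        batch = vis_tokens.shape[0]
        queries = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        context = torch.cat([vis_tokens, txt_tokens], dim=1)   # frozen unimodal features
        fused, _ = self.attn(queries, context, context)
        return self.classifier(fused.mean(dim=1))

vis, txt = torch.randn(2, 50, 512), torch.randn(2, 20, 512)
print(PromptedFusionHead()(vis, txt).shape)    # torch.Size([2, 2])
```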
arXiv Detail & Related papers (2023-04-13T07:31:51Z) - Distilling a Powerful Student Model via Online Knowledge Distillation [158.68873654990895]
Existing online knowledge distillation approaches either adopt the student with the best performance or construct an ensemble model for better holistic performance.
We propose a novel method for online knowledge distillation, termed FFSD, which comprises two key components: Feature Fusion and Self-Distillation.
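The feature-fusion-plus-self-distillation idea can be illustrated with a standard online distillation loss: peer predictions are fused into a soft target and the student is trained against it with temperature-scaled KL divergence. Averaging logits stands in for FFSD's learned feature fusion.

```python
import torch
import torch.nn.functional as F

def fused_distillation_loss(student_logits, peer_logits_list, temperature: float = 2.0):
    """Sketch: fuse peer predictions into a soft target and distill the student online."""
    fused = torch.stack(peer_logits_list, dim=0).mean(dim=0)
    soft_target = F.softmax(fused / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, soft_target, reduction="batchmean") * temperature ** 2

student = torch.randn(8, 10)
peers = [torch.randn(8, 10) for _ in range(3)]
print(fused_distillation_loss(student, peers).item())
```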
arXiv Detail & Related papers (2021-03-26T13:54:24Z) - $P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose
Estimation [69.25492391672064]
We propose an augmented Parallel-Pyramid Net with feature refinement via a dilated bottleneck and an attention module.
A parallel-pyramid structure is adopted to compensate for the information loss introduced by the network.
Our method achieves the best performance on the challenging MSCOCO and MPII datasets.
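A sketch of the kind of refinement block the summary names: a bottleneck with a dilated 3x3 convolution (larger receptive field at unchanged resolution) followed by squeeze-and-excitation-style channel attention. The exact P2Net block design is not reproduced.

```python
import torch
import torch.nn as nn

class DilatedAttentionBottleneck(nn.Module):
    """Sketch: dilated bottleneck followed by channel attention, with a residual connection."""

    def __init__(self, channels: int = 64, dilation: int = 2, reduction: int = 16):
        super().__init__()
        mid = channels // 4
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=dilation, dilation=dilation), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1),
        )
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.body(x)
        return x + out * self.attn(out)     # channel re-weighting, then residual add

x = torch.randn(1, 64, 32, 32)
print(DilatedAttentionBottleneck()(x).shape)    # torch.Size([1, 64, 32, 32])
```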
arXiv Detail & Related papers (2020-10-26T02:10:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.