PromptFusion: Decoupling Stability and Plasticity for Continual Learning
- URL: http://arxiv.org/abs/2303.07223v2
- Date: Wed, 10 Jul 2024 08:23:20 GMT
- Title: PromptFusion: Decoupling Stability and Plasticity for Continual Learning
- Authors: Haoran Chen, Zuxuan Wu, Xintong Han, Menglin Jia, Yu-Gang Jiang
- Abstract summary: We propose a prompt-tuning-based method termed PromptFusion to enable the decoupling of stability and plasticity.
Specifically, PromptFusion consists of a carefully designed Stabilizer module that deals with catastrophic forgetting and a Booster module that learns new knowledge concurrently.
Experiments show that both PromptFusion and PromptFusion-Lite achieve promising results on popular continual learning datasets for class-incremental and domain-incremental settings.
- Score: 83.68586386842105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current research on continual learning mainly focuses on relieving catastrophic forgetting, and much of that success comes at the cost of limiting the performance on newly incoming tasks. Such a trade-off is referred to as the stability-plasticity dilemma and is a more general and challenging problem for continual learning. The inherent conflict between these two concepts makes it seemingly impossible to devise a satisfactory solution to both of them simultaneously. Therefore, we ask, "is it possible to divide them into two separate problems to conquer them independently?". To this end, we propose a prompt-tuning-based method termed PromptFusion to enable the decoupling of stability and plasticity. Specifically, PromptFusion consists of a carefully designed Stabilizer module that deals with catastrophic forgetting and a Booster module that learns new knowledge concurrently. Furthermore, to address the computational overhead brought by the additional architecture, we propose PromptFusion-Lite, which improves PromptFusion by dynamically determining whether to activate both modules for each input image. Extensive experiments show that both PromptFusion and PromptFusion-Lite achieve promising results on popular continual learning datasets for class-incremental and domain-incremental settings. Especially on Split-ImageNet-R, one of the most challenging datasets for class-incremental learning, our method exceeds state-of-the-art prompt-based methods by more than 5% in accuracy, with PromptFusion-Lite using 14.8% less computational resources than PromptFusion.
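A minimal PyTorch sketch of the decoupling idea described in the abstract. The variable names (stab_prompts, boost_prompts), the per-image gate, and the way the gate mixes the two prompt sets are illustrative assumptions rather than the paper's implementation; the real method works by prompt-tuning a frozen pre-trained backbone.

```python
import torch
import torch.nn as nn

class PromptFusionSketch(nn.Module):
    """Toy sketch: decoupled stability/plasticity prompts with a per-image gate."""

    def __init__(self, embed_dim: int = 768, prompt_len: int = 10):
        super().__init__()
        # "Stabilizer"-style prompts: updated conservatively to limit forgetting
        self.stab_prompts = nn.Parameter(torch.zeros(prompt_len, embed_dim))
        # "Booster"-style prompts: updated freely to absorb new-task knowledge
        self.boost_prompts = nn.Parameter(torch.zeros(prompt_len, embed_dim))
        # Lite-style gate (hypothetical): decides per input whether to also use the booster
        self.gate = nn.Linear(embed_dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, embed_dim) pooled features from a frozen pre-trained backbone
        use_boost = torch.sigmoid(self.gate(feats)).unsqueeze(-1)     # (batch, 1, 1)
        prompts = self.stab_prompts + use_boost * self.boost_prompts  # (batch, len, dim)
        # A real implementation would prepend these prompts to the token sequence;
        # returning a pooled combination keeps the sketch self-contained.
        return feats + prompts.mean(dim=1)

feats = torch.randn(4, 768)
print(PromptFusionSketch()(feats).shape)   # torch.Size([4, 768])
```

In this toy version the gate plays the role that PromptFusion-Lite's dynamic activation plays: inputs the gate scores low effectively keep only the stability prompts.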
Related papers
- ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [58.99648692413168]
Current image fusion methods struggle to address the composite degradations encountered in real-world imaging scenarios.
We propose ControlFusion, which adaptively neutralizes composite degradations.
In experiments, ControlFusion outperforms SOTA fusion methods in fusion quality and degradation handling.
arXiv Detail & Related papers (2025-03-30T08:18:53Z) - MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion [48.443460251524776]
MathFusion is a novel framework that enhances mathematical reasoning through cross-problem instruction synthesis.
MathFusion achieves substantial improvements in mathematical reasoning while maintaining high data efficiency.
arXiv Detail & Related papers (2025-03-20T15:00:41Z) - TinyFusion: Diffusion Transformers Learned Shallow [52.96232442322824]
Diffusion Transformers have demonstrated remarkable capabilities in image generation but often come with excessive parameterization.
We present TinyFusion, a depth pruning method designed to remove redundant layers from diffusion transformers via end-to-end learning.
Experiments with DiT-XL show that TinyFusion can craft a shallow diffusion transformer at less than 7% of the pre-training cost, achieving a 2x speedup with an FID score of 2.86.
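The blurb above describes learning which transformer layers to drop end-to-end. Below is a hedged sketch of that general idea, not TinyFusion's actual parameterization: each residual block gets a learnable keep-logit, and a straight-through estimator lets the hard keep/drop decision receive gradients.

```python
import torch
import torch.nn as nn

class DepthPrunedStack(nn.Module):
    """Illustrative end-to-end depth pruning: each block has a learnable keep-probability."""

    def __init__(self, num_blocks: int = 12, dim: int = 64):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_blocks))
        self.keep_logits = nn.Parameter(torch.zeros(num_blocks))  # learned with the weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        keep_prob = torch.sigmoid(self.keep_logits)
        for block, p in zip(self.blocks, keep_prob):
            # Hard 0/1 decision in the forward pass, gradient flows through the soft value
            # (straight-through estimator); blocks whose gate collapses to 0 can be removed.
            hard = (p > 0.5).float()
            gate = hard + p - p.detach()
            x = x + gate * block(x)          # residual block, skipped when gate is 0
        return x

x = torch.randn(2, 64)
print(DepthPrunedStack()(x).shape)           # torch.Size([2, 64])
```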
arXiv Detail & Related papers (2024-12-02T07:05:39Z) - DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning [23.878495627964146]
Continual learning aims to equip models with the ability to retain previously learned knowledge like a human.
Existing methods usually overlook the information leakage that arises because the evaluation data may already have been seen by the pre-trained model.
In this paper, we propose a new LoRA-based rehearsal-free method named DESIRE.
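DESIRE is described as LoRA-based and rehearsal-free; the sketch below shows only the generic LoRA building block such methods start from (a frozen linear layer plus a trainable low-rank update). The rank, the scaling, and how adapters from different tasks are consolidated are assumptions left to the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update (generic LoRA)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # pre-trained weights stay fixed
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the low-rank path is updated per task, so no rehearsal buffer is needed;
        # task-specific adapters can later be merged or consolidated.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(4, 768)).shape)      # torch.Size([4, 768])
```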
arXiv Detail & Related papers (2024-11-28T13:54:01Z) - Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models [27.477136474888564]
We introduce OptFusion, a method that automates the learning of fusion, encompassing both the connection learning and the operation selection.
Our experiments are conducted over three large-scale datasets.
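A hedged sketch of what "learning the fusion" can look like: candidate operations are combined with softmax weights over learnable logits, and the best-scoring operation can be retained after search. The candidate set and the continuous relaxation here are assumptions, not OptFusion's exact search space.

```python
import torch
import torch.nn as nn

class FusionOpSelector(nn.Module):
    """Pick among candidate fusion operations via a softmax-weighted relaxation."""

    def __init__(self, dim: int = 32):
        super().__init__()
        self.op_logits = nn.Parameter(torch.zeros(3))   # one logit per candidate op
        self.proj = nn.Linear(2 * dim, dim)             # used by the concat candidate

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        candidates = torch.stack([
            a + b,                                       # additive fusion
            a * b,                                       # elementwise product
            self.proj(torch.cat([a, b], dim=-1)),        # concat + projection
        ], dim=0)
        weights = torch.softmax(self.op_logits, dim=0).view(-1, 1, 1)
        # During search all ops contribute; afterwards the argmax op can be kept.
        return (weights * candidates).sum(dim=0)

a, b = torch.randn(8, 32), torch.randn(8, 32)
print(FusionOpSelector()(a, b).shape)                    # torch.Size([8, 32])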
arXiv Detail & Related papers (2024-11-24T06:21:59Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, only a minimal number of late pre-trained layers is used, which reduces the peak memory overhead.
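A rough sketch of the two-route idea summarized above. The anti-redundancy weighting below (downweighting intermediate features close to their mean) is a stand-in invented for illustration; only the overall shape, consolidating frozen intermediate outputs early and training a small head late, reflects the summary.

```python
import torch
import torch.nn as nn

def consolidate_intermediates(feats: list[torch.Tensor]) -> torch.Tensor:
    """Early route (sketch): merge intermediate layer outputs, downweighting redundancy."""
    stacked = torch.stack(feats, dim=0)                  # (num_layers, batch, dim)
    mean = stacked.mean(dim=0, keepdim=True)
    # Naive anti-redundancy proxy: layers closer to the running mean get smaller weights.
    redundancy = torch.cosine_similarity(stacked, mean, dim=-1)   # (num_layers, batch)
    weights = torch.softmax(-redundancy, dim=0).unsqueeze(-1)
    return (weights * stacked).sum(dim=0)                # (batch, dim)

# Late route (sketch): only a small head on the consolidated features is trained, so
# backpropagation never touches the frozen early layers and peak memory stays low.
head = nn.Linear(256, 10)
feats = [torch.randn(4, 256) for _ in range(12)]         # pretend intermediate outputs
print(head(consolidate_intermediates(feats)).shape)      # torch.Size([4, 10])
```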
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - A streamlined Approach to Multimodal Few-Shot Class Incremental Learning
for Fine-Grained Datasets [23.005760505169803]
Few-shot Class-Incremental Learning (FSCIL) poses the challenge of retaining prior knowledge while learning from limited new data streams.
We propose two components: Session-Specific Prompts (SSP), which enhance the separability of image-text embeddings across sessions, and a hyperbolic distance that compresses representations of image-text pairs within the same class while expanding those from different classes, leading to better representations.
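The hyperbolic component can be illustrated with the standard Poincare-ball distance; whether the paper uses exactly this formulation is an assumption.

```python
import torch

def poincare_distance(u: torch.Tensor, v: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Distance in the Poincare ball: distances grow rapidly toward the boundary, which is
    what makes hyperbolic embeddings attractive for hierarchy-like class structure."""
    sq_u = (u * u).sum(dim=-1).clamp(max=1 - eps)
    sq_v = (v * v).sum(dim=-1).clamp(max=1 - eps)
    sq_diff = ((u - v) ** 2).sum(dim=-1)
    x = 1 + 2 * sq_diff / ((1 - sq_u) * (1 - sq_v))
    return torch.acosh(x.clamp(min=1 + eps))

u = torch.randn(4, 128) * 0.05        # embeddings assumed to lie inside the unit ball
v = torch.randn(4, 128) * 0.05
print(poincare_distance(u, v).shape)  # torch.Size([4])
```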
arXiv Detail & Related papers (2024-03-10T19:50:03Z) - ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion [22.164620956284466]
Retrieval-based augmentations (RA) incorporating knowledge from an external database into language models have greatly succeeded in various knowledge-intensive (KI) tasks.
Existing works focus on concatenating retrievals with inputs to improve model performance.
This paper proposes a new paradigm of RA named ReFusion, a computation-efficient retrieval representation fusion with bi-level optimization.
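A sketch of fusing retrieval representations into hidden states rather than concatenating retrieved text to the input, which is the contrast the summary draws. The cross-attention fusion site and the learnable gate are assumptions; the paper's bi-level optimization of where and how to fuse is not reproduced.

```python
import torch
import torch.nn as nn

class RetrievalFusion(nn.Module):
    """Sketch: fuse retrieved-passage embeddings into token hidden states, keeping the
    input sequence length fixed instead of growing it with retrieved text."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))   # how much retrieval to mix in

    def forward(self, hidden: torch.Tensor, retrieved: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, dim) token states; retrieved: (batch, k, dim) passage embeddings
        fused, _ = self.attn(hidden, retrieved, retrieved)
        return hidden + torch.tanh(self.gate) * fused

hidden = torch.randn(2, 16, 768)
retrieved = torch.randn(2, 5, 768)
print(RetrievalFusion()(hidden, retrieved).shape)   # torch.Size([2, 16, 768])
```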
arXiv Detail & Related papers (2024-01-04T07:39:26Z) - ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss
via Meta-Learning [17.91346343984845]
We introduce a unified image fusion framework based on meta-learning, named ReFusion.
ReFusion employs a parameterized loss function, dynamically adjusted by the training framework according to the specific scenario and task.
It is capable of adapting to various tasks, including infrared-visible, medical, multi-focus, and multi-exposure image fusion.
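A minimal sketch of a parameterized fusion loss: the weights over loss terms are themselves learnable, so an outer (meta) loop can adapt them per scenario instead of hand-tuning. The two example terms (intensity and gradient consistency) and the softmax weighting are common image-fusion choices assumed here, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParameterizedFusionLoss(nn.Module):
    """Sketch: a fusion loss whose term weights are learnable and meta-adaptable."""

    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))   # weights over intensity / gradient terms

    def forward(self, fused, src_a, src_b):
        w = torch.softmax(self.logits, dim=0)
        intensity = F.l1_loss(fused, torch.maximum(src_a, src_b))     # keep salient intensity
        gradient = F.l1_loss(fused.diff(dim=-1), src_a.diff(dim=-1))  # keep texture detail
        return w[0] * intensity + w[1] * gradient

loss_fn = ParameterizedFusionLoss()
a, b, fused = torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32), torch.rand(1, 1, 32, 32)
print(loss_fn(fused, a, b).item())
```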
arXiv Detail & Related papers (2023-12-13T07:40:39Z) - Continual Learning through Networks Splitting and Merging with
Dreaming-Meta-Weighted Model Fusion [20.74264925323055]
It is challenging to balance a network's stability and plasticity in continual learning scenarios.
We propose Split2MetaFusion which can achieve better trade-off by employing a two-stage strategy.
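Merge-based continual learners ultimately combine a stability-oriented and a plasticity-oriented set of weights. The sketch below shows only plain parameter-wise interpolation with a single scalar; Split2MetaFusion's dreaming-meta-weighted fusion replaces that scalar with learned, data-driven weights.

```python
import torch
import torch.nn as nn

def merge_state_dicts(stable: dict, plastic: dict, alpha: float = 0.5) -> dict:
    """Sketch: parameter-wise weighted fusion of two checkpoints (one scalar weight
    stands in for the learned weighting used by merge-based continual learners)."""
    return {k: alpha * stable[k] + (1 - alpha) * plastic[k] for k in stable}

old_model, new_model = nn.Linear(8, 8), nn.Linear(8, 8)
merged = nn.Linear(8, 8)
merged.load_state_dict(merge_state_dicts(old_model.state_dict(), new_model.state_dict()))
print(merged.weight.shape)    # torch.Size([8, 8])
```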
arXiv Detail & Related papers (2023-12-12T09:02:56Z) - Bi-level Dynamic Learning for Jointly Multi-modality Image Fusion and
Beyond [50.556961575275345]
We build an image fusion module to fuse complementary characteristics and cascade dual task-related modules.
We develop an efficient first-order approximation to compute corresponding gradients and present dynamic weighted aggregation to balance the gradients for fusion learning.
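A toy version of dynamically weighting per-task gradients before aggregation, so that no downstream task dominates the shared fusion network. The inverse-norm weighting is one plausible choice made up for this sketch, not the paper's rule.

```python
import torch

def aggregate_gradients(task_grads: list[torch.Tensor], eps: float = 1e-8) -> torch.Tensor:
    """Sketch: weight per-task gradients (here by inverse norm) before a shared update."""
    norms = torch.stack([g.norm() for g in task_grads])
    weights = 1.0 / (norms + eps)
    weights = weights / weights.sum()
    return sum(w * g for w, g in zip(weights, task_grads))

g_detection = torch.randn(1000)      # flattened gradient from a detection head
g_segmentation = torch.randn(1000)   # flattened gradient from a segmentation head
print(aggregate_gradients([g_detection, g_segmentation]).shape)   # torch.Size([1000])
```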
arXiv Detail & Related papers (2023-05-11T10:55:34Z) - Efficient Multimodal Fusion via Interactive Prompting [62.08292938484994]
Large-scale pre-training has brought unimodal fields such as computer vision and natural language processing to a new era.
We propose an efficient and flexible multimodal fusion method, namely PMF, tailored for fusing unimodally pre-trained transformers.
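A hedged sketch of prompt-based fusion of frozen unimodal encoders: only a handful of fusion prompts and a small head are trained while both backbones stay fixed. PMF's layer-wise interactive prompts are simplified here to a single cross-attention read-out.

```python
import torch
import torch.nn as nn

class PromptedFusionHead(nn.Module):
    """Sketch: learnable fusion prompts attend over frozen vision and language features."""

    def __init__(self, dim: int = 512, num_prompts: int = 4, num_heads: int = 8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, 2)

    def forward(self, vis_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        batch = vis_tokens.shape[0]
        queries = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        context = torch.cat([vis_tokens, txt_tokens], dim=1)   # frozen unimodal features
        fused, _ = self.attn(queries, context, context)
        return self.classifier(fused.mean(dim=1))

vis, txt = torch.randn(2, 50, 512), torch.randn(2, 20, 512)
print(PromptedFusionHead()(vis, txt).shape)    # torch.Size([2, 2])
```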
arXiv Detail & Related papers (2023-04-13T07:31:51Z) - Distilling a Powerful Student Model via Online Knowledge Distillation [158.68873654990895]
Existing online knowledge distillation approaches either adopt the student with the best performance or construct an ensemble model for better holistic performance.
We propose a novel method for online knowledge distillation, termed FFSD, which comprises two key components: Feature Fusion and Self-Distillation.
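The feature-fusion-plus-self-distillation idea can be illustrated with a standard online distillation loss: peer predictions are fused into a soft target and the student is trained against it with temperature-scaled KL divergence. Averaging logits stands in for FFSD's learned feature fusion.

```python
import torch
import torch.nn.functional as F

def fused_distillation_loss(student_logits, peer_logits_list, temperature: float = 2.0):
    """Sketch: fuse peer predictions into a soft target and distill the student online."""
    fused = torch.stack(peer_logits_list, dim=0).mean(dim=0)
    soft_target = F.softmax(fused / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_student, soft_target, reduction="batchmean") * temperature ** 2

student = torch.randn(8, 10)
peers = [torch.randn(8, 10) for _ in range(3)]
print(fused_distillation_loss(student, peers).item())
```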
arXiv Detail & Related papers (2021-03-26T13:54:24Z) - $P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose
Estimation [69.25492391672064]
We propose an augmented Parallel-Pyramid Net with feature refinement via a dilated bottleneck and an attention module.
A parallel-pyramid structure is adopted to compensate for the information loss introduced by the network.
Our method achieves the best performance on the challenging MSCOCO and MPII datasets.
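A sketch of the kind of refinement block the summary names: a bottleneck with a dilated 3x3 convolution (larger receptive field at unchanged resolution) followed by squeeze-and-excitation-style channel attention. The exact P2Net block design is not reproduced.

```python
import torch
import torch.nn as nn

class DilatedAttentionBottleneck(nn.Module):
    """Sketch: dilated bottleneck followed by channel attention, with a residual connection."""

    def __init__(self, channels: int = 64, dilation: int = 2, reduction: int = 16):
        super().__init__()
        mid = channels // 4
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=dilation, dilation=dilation), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1),
        )
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.body(x)
        return x + out * self.attn(out)     # channel re-weighting, then residual add

x = torch.randn(1, 64, 32, 32)
print(DilatedAttentionBottleneck()(x).shape)    # torch.Size([1, 64, 32, 32])
```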
arXiv Detail & Related papers (2020-10-26T02:10:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.