Adaptive Additive Parameter Updates of Vision Transformers for Few-Shot Continual Learning
- URL: http://arxiv.org/abs/2504.08982v1
- Date: Fri, 11 Apr 2025 21:17:30 GMT
- Title: Adaptive Additive Parameter Updates of Vision Transformers for Few-Shot Continual Learning
- Authors: Kyle Stein, Andrew Arash Mahyari, Guillermo Francia III, Eman El-Sheikh
- Abstract summary: Few-shot class incremental learning (FSCIL) addresses catastrophic forgetting by first training a model on a robust dataset of base classes and then incrementally adapting it in successive sessions. This approach is prone to overfitting on the limited new data, which can compromise overall performance and exacerbate forgetting. We propose a novel FSCIL framework that leverages a frozen Vision Transformer (ViT) backbone augmented with parameter-efficient additive updates.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Integrating new class information without losing previously acquired knowledge remains a central challenge in artificial intelligence, often referred to as catastrophic forgetting. Few-shot class incremental learning (FSCIL) addresses this by first training a model on a robust dataset of base classes and then incrementally adapting it in successive sessions using only a few labeled examples per novel class. However, this approach is prone to overfitting on the limited new data, which can compromise overall performance and exacerbate forgetting. In this work, we propose a simple yet effective FSCIL framework that leverages a frozen Vision Transformer (ViT) backbone augmented with parameter-efficient additive updates. Our approach freezes the pre-trained ViT parameters and selectively injects trainable weights into the self-attention modules via an additive update mechanism. This design updates only a small subset of parameters to accommodate new classes without sacrificing the representations learned during the base session. By fine-tuning a limited number of parameters, our method preserves the generalizable features in the frozen ViT while reducing the risk of overfitting. Furthermore, as most parameters remain fixed, the model avoids overwriting previously learned knowledge when small novel data batches are introduced. Extensive experiments on benchmark datasets demonstrate that our approach yields state-of-the-art performance compared to baseline FSCIL methods.
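The abstract gives no code, but the mechanism it describes maps onto a short PyTorch sketch. The snippet below is illustrative rather than the authors' implementation: it assumes a timm-style ViT whose blocks expose a fused `attn.qkv` linear layer, and the rank, initialization, and function names are placeholders.

```python
import torch
import torch.nn as nn

class AdditiveUpdate(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank additive term:
    y = Wx + b + B(Ax); only A and B receive gradients."""
    def __init__(self, frozen_linear: nn.Linear, rank: int = 4):
        super().__init__()
        self.frozen = frozen_linear
        for p in self.frozen.parameters():
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, frozen_linear.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(frozen_linear.out_features, rank))  # zero init: no change at session start

    def forward(self, x):
        return self.frozen(x) + x @ self.A.t() @ self.B.t()

def inject_additive_updates(vit: nn.Module, rank: int = 4) -> nn.Module:
    """Freeze every backbone parameter, then swap each block's fused qkv
    projection for its additively updated counterpart (timm ViT layout assumed)."""
    for p in vit.parameters():
        p.requires_grad = False
    for blk in vit.blocks:
        blk.attn.qkv = AdditiveUpdate(blk.attn.qkv, rank=rank)
    return vit
```

Under these assumptions, an optimizer built over `filter(lambda p: p.requires_grad, vit.parameters())` touches only the additive factors in each incremental session, consistent with the abstract's claim that most of the backbone stays fixed.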
Related papers
- Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning [19.27175827358111]
Continual learning in large language models (LLMs) is prone to catastrophic forgetting, where adapting to new tasks significantly degrades performance on previously learned ones. We propose a novel continual full fine-tuning approach leveraging adaptive singular value decomposition (SVD). We evaluate our approach extensively on standard continual learning benchmarks using both encoder-decoder (T5-Large) and decoder-only (LLaMA-2 7B) models.
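Only the abstract is available, so as a hedged illustration of the general idea (not the paper's exact criterion): one can project each gradient update away from a weight matrix's dominant singular subspace, so directions carrying most of the previously learned signal are left untouched.

```python
import torch

@torch.no_grad()
def project_out_top_subspace(grad: torch.Tensor, weight: torch.Tensor, k: int = 16) -> torch.Tensor:
    """Remove the component of `grad` lying in the span of `weight`'s top-k
    left singular vectors, steering updates away from the directions that
    carry most previously learned signal (illustrative sketch only)."""
    U, _, _ = torch.linalg.svd(weight, full_matrices=False)
    U_k = U[:, :k]                        # dominant left singular directions
    return grad - U_k @ (U_k.t() @ grad)  # projection onto the orthogonal complement
```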
arXiv Detail & Related papers (2025-04-09T17:59:42Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to Aggregate Low Rank Experts. Thanks to this design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
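As a hedged toy sketch (the abstract gives no equations, and all names and shapes below are assumptions): a Kronecker-style aggregation of low-rank experts can be collapsed into a single delta weight, which is consistent with the claim that ALoRE adds negligible parameters and merges into the frozen backbone.

```python
import torch
import torch.nn as nn

class KroneckerLowRankExperts(nn.Module):
    """Toy ALoRE-style update: a sum of Kronecker products between small
    per-expert matrices and low-rank blocks, collapsible into one delta
    weight that can be merged into the frozen layer after training."""
    def __init__(self, out_f: int, in_f: int, n_experts: int = 4, rank: int = 2):
        super().__init__()
        assert out_f % n_experts == 0 and in_f % n_experts == 0
        self.H = nn.Parameter(torch.randn(n_experts, n_experts, n_experts) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, out_f // n_experts, rank))
        self.A = nn.Parameter(torch.randn(n_experts, rank, in_f // n_experts) * 0.01)

    def delta_weight(self) -> torch.Tensor:
        blocks = torch.einsum('eor,eri->eoi', self.B, self.A)  # per-expert low-rank block
        # Aggregate: sum_e kron(H_e, block_e) yields a full (out_f, in_f) update
        return sum(torch.kron(self.H[e], blocks[e]) for e in range(self.H.shape[0]))
```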
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models [26.484208658326857]
Continual learning aims to incrementally acquire new concepts from data streams while resisting the forgetting of previous knowledge.
With the rise of powerful pre-trained models (PTMs), there is a growing interest in training incremental learning systems.
arXiv Detail & Related papers (2024-11-04T15:34:30Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z)
- Parameter-Selective Continual Test-Time Adaptation [3.480626767752489]
Continual Test-Time Adaptation (CTTA) aims to adapt a pretrained model to ever-changing environments during the test time under continuous domain shifts.
The proposed PSMT method is capable of effectively updating the critical parameters within the mean-teacher (MT) network under domain shifts.
arXiv Detail & Related papers (2024-07-02T13:18:15Z)
- Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach [17.678759882763078]
Fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks.
Striking a balance between retaining the generalizable representation capacity of the pre-trained model and acquiring task-specific features is a key challenge.
We propose a Residual-based Low-Rank Rescaling (RLRR) fine-tuning strategy.
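A hedged toy reading of this abstract (the exact RLRR parameterization may well differ): keep the frozen weight as the residual path and learn a rank-1 multiplicative rescaling of it, so training starts exactly at the pre-trained model. All names below are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualRescaledLinear(nn.Module):
    """Toy RLRR-style layer: the frozen weight is the residual term, plus a
    learnable rank-1 multiplicative rescaling of that same weight. Only the
    two scaling vectors train; zero init reproduces the pre-trained model."""
    def __init__(self, frozen_linear: nn.Linear):
        super().__init__()
        self.frozen = frozen_linear
        for p in self.frozen.parameters():
            p.requires_grad = False
        out_f, in_f = frozen_linear.weight.shape
        self.s_out = nn.Parameter(torch.zeros(out_f, 1))
        self.s_in = nn.Parameter(torch.zeros(1, in_f))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W_eff = W + (s_out s_in^T) * W  -- frozen residual plus learned rescaling
        w_eff = self.frozen.weight * (1 + self.s_out @ self.s_in)
        return F.linear(x, w_eff, self.frozen.bias)
```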
arXiv Detail & Related papers (2024-03-28T00:14:53Z)
- Strong Baselines for Parameter Efficient Few-Shot Fine-tuning [50.83426196335385]
Few-shot classification (FSC) entails learning novel classes given only a few examples per class after a pre-training (or meta-training) phase.
Recent works have shown that simply fine-tuning a pre-trained Vision Transformer (ViT) on new test classes is a strong approach for FSC.
Fine-tuning ViTs, however, is expensive in time, compute and storage.
This has motivated the design of parameter efficient fine-tuning (PEFT) methods which fine-tune only a fraction of the Transformer's parameters.
arXiv Detail & Related papers (2023-04-04T16:14:39Z)
- New Insights on Reducing Abrupt Representation Change in Online Continual Learning [69.05515249097208]
We focus on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream.
We show that applying Experience Replay causes the newly added classes' representations to overlap significantly with the previous classes.
We propose a new method which mitigates this issue by shielding the learned representations from drastic adaptation to accommodate new classes.
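This shielding is commonly realized as an asymmetric cross-entropy; the sketch below is one reading of that recipe, not the authors' verbatim code. Incoming samples compete only among classes present in the current batch, while replayed samples use the full classifier head; `present_classes` and the function name are assumptions.

```python
import torch
import torch.nn.functional as F

def asymmetric_replay_loss(logits_new, labels_new, logits_buf, labels_buf, present_classes):
    """Asymmetric cross-entropy sketch: mask incoming-sample logits so they
    compete only among currently observed classes, shielding old-class
    representations from abrupt drift; replay samples use the full head.
    `present_classes` is a LongTensor of class indices in the current batch."""
    mask = torch.full_like(logits_new, float('-inf'))
    mask[:, present_classes] = 0.0           # keep only currently observed classes
    loss_new = F.cross_entropy(logits_new + mask, labels_new)
    loss_buf = F.cross_entropy(logits_buf, labels_buf)
    return loss_new + loss_buf
```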
arXiv Detail & Related papers (2022-03-08T01:37:00Z)
- Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding [60.226644697970116]
Domain classification is the fundamental task in natural language understanding (NLU).
Most existing continual learning approaches suffer from low accuracy and performance fluctuation.
We propose a hyperparameter-free continual learning model for text data that can stably produce high performance under various environments.
arXiv Detail & Related papers (2022-01-05T02:46:16Z)
- ZS-IL: Looking Back on Learned Experiences For Zero-Shot Incremental Learning [9.530976792843495]
We propose an on-call transfer set to provide past experiences whenever a new class arises in the data stream.
ZS-IL demonstrates significantly better performance on the well-known datasets (CIFAR-10, Tiny-ImageNet) in both Task-IL and Class-IL settings.
arXiv Detail & Related papers (2021-03-22T22:43:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.