Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning
- URL: http://arxiv.org/abs/2407.06136v1
- Date: Mon, 8 Jul 2024 17:09:39 GMT
- Title: Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning
- Authors: Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang
- Abstract summary: Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
We develop a class-sensitive selective scan mechanism to guide dynamic adaptation.
Experiments on miniImageNet, CUB-200, and CIFAR-100 demonstrate that our framework outperforms the existing state-of-the-art methods.
- Score: 113.89327264634984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples while preserving the knowledge of previously learned classes. Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially, prone to overfitting to the current session. Existing dynamic strategies require the expansion of the parameter space continually, leading to increased complexity. To address these challenges, we integrate the recently proposed selective state space model (SSM) into FSCIL. Concretely, we propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation. The dual design enables the model to maintain the robust features of base classes, while adaptively learning distinctive feature shifts for novel classes. Additionally, we develop a class-sensitive selective scan mechanism to guide dynamic adaptation. It minimizes the disruption to base-class representations caused by training on novel data, and meanwhile, forces the selective scan to perform in distinct patterns between base and novel classes. Experiments on miniImageNet, CUB-200, and CIFAR-100 demonstrate that our framework outperforms the existing state-of-the-art methods. The code is available at https://github.com/xiaojieli0903/Mamba-FSCIL.
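To make the dual selective-SSM projector concrete, here is a minimal PyTorch sketch of the idea the abstract describes. It is an editorial illustration under stated assumptions, not the authors' released implementation (that lives at the GitHub link above): the names `SelectiveSSMBranch` and `DualSSMProjector`, the diagonal log-parameterization of the state matrix A, and the simple sum of the two branches are illustrative choices, and the class-sensitive selective-scan loss that guides training is omitted.

```python
# Illustrative sketch only (assumed names and shapes); see the authors' repo for the real code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveSSMBranch(nn.Module):
    """One selective-scan branch: Delta, B, C are input-dependent per token."""
    def __init__(self, dim, state_dim=16):
        super().__init__()
        self.state_dim = state_dim
        # Diagonal state matrix A, log-parameterized for stability (S4/Mamba style).
        self.A_log = nn.Parameter(torch.log(torch.arange(1, state_dim + 1).float()))
        self.to_delta = nn.Linear(dim, dim)    # per-channel step size (the "selectivity")
        self.to_B = nn.Linear(dim, state_dim)  # input -> state projection
        self.to_C = nn.Linear(dim, state_dim)  # state -> output projection

    def forward(self, x):                      # x: (batch, seq_len, dim)
        b, L, d = x.shape
        A = -torch.exp(self.A_log)             # negative diagonal -> decaying dynamics
        delta = F.softplus(self.to_delta(x))   # (b, L, d), positive step sizes
        B, C = self.to_B(x), self.to_C(x)      # each (b, L, state)
        h = x.new_zeros(b, d, self.state_dim)  # hidden state, one per channel
        ys = []
        for t in range(L):                     # sequential scan, written for clarity
            dA = torch.exp(delta[:, t, :, None] * A)       # (b, d, state)
            dB = delta[:, t, :, None] * B[:, t, None, :]   # (b, d, state)
            h = dA * h + dB * x[:, t, :, None]             # state update
            ys.append((h * C[:, t, None, :]).sum(-1))      # readout: (b, d)
        return torch.stack(ys, dim=1)          # (b, L, d)

class DualSSMProjector(nn.Module):
    """Dual design: a base branch kept stable plus a branch for novel-class shifts."""
    def __init__(self, dim):
        super().__init__()
        self.base_branch = SelectiveSSMBranch(dim)
        self.novel_branch = SelectiveSSMBranch(dim)

    def freeze_base(self):
        for p in self.base_branch.parameters():
            p.requires_grad_(False)

    def forward(self, feats):
        # Base branch preserves base-class features; novel branch learns feature shifts.
        return self.base_branch(feats) + self.novel_branch(feats)
```

For example, `DualSSMProjector(dim=512)` applied to a backbone's 7x7 feature map flattened into a 49-token sequence (`torch.randn(2, 49, 512)`) returns a tensor of the same shape; in incremental sessions one would call `freeze_base()` so that only the novel branch (and the classifier) adapts.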
Related papers
- Dynamic Feature Learning and Matching for Class-Incremental Learning [20.432575325147894]
Class-incremental learning (CIL) has emerged as a means to learn new classes without catastrophic forgetting of previous classes.
We propose the Dynamic Feature Learning and Matching (DFLM) model in this paper.
Our proposed model achieves significant performance improvements over existing methods.
arXiv Detail & Related papers (2024-05-14T12:17:19Z) - Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning [65.57123249246358]
We propose ExpAndable Subspace Ensemble (EASE) for PTM-based CIL.
We train a distinct lightweight adapter module for each new task, aiming to create task-specific subspaces.
Our prototype complement strategy synthesizes new features for old classes without using any old-class instances.
arXiv Detail & Related papers (2024-03-18T17:58:13Z) - Demystifying the Base and Novel Performances for Few-shot Class-incremental Learning [15.762281194023462]
Few-shot class-incremental learning (FSCIL) has addressed challenging real-world scenarios where unseen novel classes continually arrive with few samples.
It is required to develop a model that recognizes the novel classes without forgetting prior knowledge.
It is shown that our straightforward method performs comparably to sophisticated state-of-the-art algorithms.
arXiv Detail & Related papers (2022-06-18T00:39:47Z) - Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC).
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks [59.12108527904171]
A model should recognize new classes and maintain discriminability over old classes.
The task of recognizing few-shot new classes without forgetting old classes is called few-shot class-incremental learning (FSCIL).
We propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT).
arXiv Detail & Related papers (2022-03-31T13:46:41Z) - Incremental Few-Shot Learning via Implanting and Compressing [13.122771115838523]
Incremental Few-Shot Learning requires a model to continually learn novel classes from only a few examples.
We propose a two-step learning strategy referred to as Implanting and Compressing.
Specifically, in the Implanting step, we propose to mimic the data distribution of novel classes with the assistance of the data-abundant base set.
In the Compressing step, we adapt the feature extractor to precisely represent each novel class, enhancing intra-class compactness.
arXiv Detail & Related papers (2022-03-19T11:04:43Z) - Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures "certain fairness" across different data samples; a generic form of such an objective is sketched after this entry.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)
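As an editorial aside, the min-max formulation in the last entry can be written in a generic form. The following is a sketch of a standard fairness-style min-max objective over T data episodes, not the paper's exact equation: \(\mathcal{L}_t\) denotes the loss on data from episode t, and \(\Delta_T\) is the probability simplex over episodes.

```latex
% Generic min-max fairness objective (illustrative; not the paper's exact formulation):
% the adversarial weights \lambda upweight the worst-served episodes, so minimizing
% the objective cannot sacrifice any one episode's performance -- the "certain fairness".
\min_{\theta} \max_{\lambda \in \Delta_T} \sum_{t=1}^{T} \lambda_t \,\mathcal{L}_t(\theta),
\qquad
\Delta_T = \Big\{ \lambda \in \mathbb{R}_{\ge 0}^{T} \;:\; \sum_{t=1}^{T} \lambda_t = 1 \Big\}.
```

Because the inner maximization concentrates weight on the episodes with the highest loss, the outer minimization must keep every episode's loss low, which is one common reading of "fairness across data samples".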
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.