DRL: Discriminative Representation Learning with Parallel Adapters for Class Incremental Learning
- URL: http://arxiv.org/abs/2510.12107v1
- Date: Tue, 14 Oct 2025 03:19:15 GMT
- Title: DRL: Discriminative Representation Learning with Parallel Adapters for Class Incremental Learning
- Authors: Jiawei Zhan, Jun Liu, Jinlong Peng, Xiaochen Chen, Bin-Bin Gao, Yong Liu, Chengjie Wang
- Abstract summary: We propose the Discriminative Representation Learning (DRL) framework to specifically address these challenges. To conduct incremental learning effectively yet efficiently, the DRL's network is built upon a PTM. Our DRL consistently outperforms other state-of-the-art methods throughout the entire CIL period.
- Score: 63.65467569295623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the excellent representation capabilities of Pre-Trained Models (PTMs), remarkable progress has been made in non-rehearsal Class-Incremental Learning (CIL) research. However, it remains an extremely challenging task due to three conundrums: increasingly large model complexity, non-smooth representation shift during incremental learning, and inconsistency between stage-wise sub-problem optimization and global inference. In this work, we propose the Discriminative Representation Learning (DRL) framework to specifically address these challenges. To conduct incremental learning effectively yet efficiently, the DRL's network, called the Incremental Parallel Adapter (IPA) network, is built upon a PTM and incrementally augments the model by learning a lightweight adapter with little parameter overhead at each incremental stage. The adapter is responsible for adapting the model to new classes; through a parallel connection controlled by a transfer gate, it inherits and propagates the representation capability of the current model. As a result, this design guarantees a smooth representation shift between incremental stages. Furthermore, to alleviate the inconsistency and enable comparable feature representations across incremental stages, we design Decoupled Anchor Supervision (DAS). It decouples the constraints on positive and negative samples by comparing each against a virtual anchor. This decoupling promotes discriminative representation learning and aligns the feature spaces learned at different stages, thereby narrowing the gap between stage-wise local optimization over a subset of data and global inference across all classes. Extensive experiments on six benchmarks show that DRL consistently outperforms other state-of-the-art methods throughout the entire CIL period while maintaining high efficiency in both training and inference.
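To make the parallel-adapter design concrete, below is a minimal PyTorch sketch of how a frozen PTM block might be augmented with per-stage adapters blended through a sigmoid transfer gate. The module names (`ParallelAdapter`, `IPABlock`), the bottleneck sizes, and the gating form are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn

class ParallelAdapter(nn.Module):
    """Bottleneck adapter running in parallel with a frozen PTM block.
    A sigmoid transfer gate controls how much new-stage adaptation is
    mixed into the inherited representation. (Illustrative sketch.)"""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)            # new stage starts as identity,
        nn.init.zeros_(self.up.bias)              # so the shift begins smoothly
        self.gate = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5: balanced mix

    def forward(self, x: torch.Tensor, inherited: torch.Tensor) -> torch.Tensor:
        delta = self.up(self.act(self.down(x)))   # stage-specific adaptation
        g = torch.sigmoid(self.gate)              # transfer gate in (0, 1)
        return inherited + g * delta              # inherit, then gently shift


class IPABlock(nn.Module):
    """One frozen backbone block plus a growing list of per-stage
    parallel adapters; only the newest adapter is trained per stage."""

    def __init__(self, frozen_block: nn.Module):
        super().__init__()
        self.block = frozen_block
        for p in self.block.parameters():
            p.requires_grad_(False)               # the PTM stays frozen
        self.adapters = nn.ModuleList()           # grows at each incremental stage

    def add_stage(self, dim: int, bottleneck: int = 64):
        self.adapters.append(ParallelAdapter(dim, bottleneck))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.block(x)                         # inherited representation
        for adapter in self.adapters:             # parallel connections
            h = adapter(x, h)
        return h
```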
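Similarly, the following is a hedged sketch of what a decoupled anchor-style objective could look like: positives and negatives are never compared with each other, only against per-class anchor vectors, each under its own margin. The cosine-similarity formulation, the `pos_margin`/`neg_margin` values, and the `anchors` tensor are assumptions for illustration, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def das_style_loss(features: torch.Tensor,
                   labels: torch.Tensor,
                   anchors: torch.Tensor,
                   pos_margin: float = 0.9,
                   neg_margin: float = 0.3) -> torch.Tensor:
    """Decoupled anchor-style loss (illustrative sketch).

    Each sample is compared only with class anchors, never with other
    samples: its similarity to the ground-truth anchor is pushed above
    pos_margin, and its similarity to every other anchor is pushed
    below neg_margin. Because both constraints reference the same fixed
    anchors, features from different incremental stages stay comparable."""
    feats = F.normalize(features, dim=1)           # (B, D) unit features
    protos = F.normalize(anchors, dim=1)           # (C, D) unit anchors
    sims = feats @ protos.t()                      # cosine similarities (B, C)

    pos_mask = F.one_hot(labels, num_classes=protos.size(0)).bool()
    pos_sim = sims[pos_mask]                       # similarity to own anchor
    neg_sim = sims[~pos_mask]                      # similarities to other anchors

    pos_loss = F.relu(pos_margin - pos_sim).mean() # pull positives above margin
    neg_loss = F.relu(neg_sim - neg_margin).mean() # push negatives below margin
    return pos_loss + neg_loss
```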
Related papers
- Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models [71.9060068259379]
We propose cascaded domain-wise reinforcement learning to build general-purpose reasoning models. Our 14B model, after RL, outperforms its SFT teacher, DeepSeek-R1-0528, on LiveCodeBench v5/v6 Pro and achieves silver-medal performance at the 2025 International Olympiad in Informatics (IOI).
arXiv Detail & Related papers (2025-12-15T18:02:35Z)
- Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL [19.094835780362775]
Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples. Current FSCIL methods often struggle with generalization due to their reliance on limited datasets. This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and an FSCIL classifier.
arXiv Detail & Related papers (2025-10-04T01:48:52Z)
- MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models [4.828668077793944]
Multi-Modal Representation Learning (MMRL) learns shared-space tokens that are projected into both the text and image encoders as representation tokens. MMRL++ is a parameter-efficient and interaction-aware extension that significantly reduces trainable parameters. Experiments on 15 datasets demonstrate that MMRL and MMRL++ consistently outperform state-of-the-art methods.
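As a rough illustration of the representation-token idea summarized above, the sketch below keeps a set of learnable shared-space tokens and projects them into the text and image token widths; all names and dimensions are assumptions, not MMRL's actual code.

```python
import torch
import torch.nn as nn

class SharedRepresentationTokens(nn.Module):
    """Learnable tokens in a shared space, projected into the token
    streams of both the text and image encoders (illustrative sketch)."""

    def __init__(self, num_tokens: int, shared_dim: int,
                 text_dim: int, image_dim: int):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(num_tokens, shared_dim) * 0.02)
        self.to_text = nn.Linear(shared_dim, text_dim)    # project into text encoder
        self.to_image = nn.Linear(shared_dim, image_dim)  # project into image encoder

    def forward(self):
        # Returns per-modality token sequences to prepend/insert into
        # each encoder's input tokens.
        return self.to_text(self.tokens), self.to_image(self.tokens)
```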
arXiv Detail & Related papers (2025-05-15T08:43:53Z)
- Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
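A minimal sketch of the Mixture-of-Experts adapter idea described above, assuming soft routing over small bottleneck experts attached residually to a frozen backbone; the routing scheme and dimensions are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MoEAdapter(nn.Module):
    """Mixture-of-Experts adapter layer (illustrative sketch). Each
    expert is a small bottleneck adapter; a router produces soft
    weights over experts, and new experts can be appended for new
    tasks without touching the frozen backbone."""

    def __init__(self, dim: int, num_experts: int = 4, bottleneck: int = 32):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(),
                          nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.router(x), dim=-1)                 # (B, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, D, E)
        mixed = (expert_out * weights.unsqueeze(-2)).sum(-1)            # (B, D)
        return x + mixed                                                # residual adapter
```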
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- Neural Architecture for Online Ensemble Continual Learning [6.241435193861262]
We present a fully differentiable ensemble method that allows us to efficiently train an ensemble of neural networks in the end-to-end regime.
The proposed technique achieves SOTA results without a memory buffer and clearly outperforms the reference methods.
arXiv Detail & Related papers (2022-11-27T23:17:08Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning [141.35105358670316]
We study the difference between a naïvely trained initial-phase model and the oracle model.
We propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly.
Our CwD is simple to implement and easy to plug into existing methods.
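The CwD idea is concrete enough to sketch: for each class in a batch, penalize the off-diagonal energy of the features' correlation matrix so that the class's representation scatters more uniformly across dimensions. The normalization details and weighting below are assumptions, not the paper's exact regularizer.

```python
import torch

def cwd_style_loss(features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Class-wise decorrelation regularizer (illustrative sketch).
    For each class present in the batch, push the correlation matrix
    of its features toward zero off the diagonal."""
    loss, count = features.new_zeros(()), 0
    for c in labels.unique():
        z = features[labels == c]                       # (n_c, D) class features
        if z.size(0) < 2:
            continue                                    # need >1 sample to correlate
        z = z - z.mean(dim=0, keepdim=True)             # center per dimension
        z = z / (z.std(dim=0, keepdim=True) + 1e-6)     # unit variance per dimension
        corr = (z.t() @ z) / z.size(0)                  # (D, D) correlation matrix
        off_diag = corr - torch.diag(torch.diagonal(corr))
        loss = loss + (off_diag ** 2).mean()            # penalize off-diagonal energy
        count += 1
    return loss / max(count, 1)
```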
arXiv Detail & Related papers (2021-12-09T07:20:32Z)
- Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) for both seen and unseen classes using a Conditional Variational Autoencoder (CVAE).
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)
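As a hedged illustration of sampling an over-complete distribution from a conditional decoder, the sketch below widens the latent sampling so synthetic features also cover hard, boundary regions; the decoder architecture and the `jitter` heuristic are assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

class CVAEDecoder(nn.Module):
    """Minimal conditional decoder (illustrative; architecture assumed)."""

    def __init__(self, latent_dim: int, attr_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + attr_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, z: torch.Tensor, attrs: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, attrs], dim=1))

def sample_over_complete(decoder: CVAEDecoder, attrs: torch.Tensor,
                         n_per_class: int = 64, latent_dim: int = 32,
                         jitter: float = 0.5):
    """Draw synthetic features per class by decoding latents sampled
    wider than the unit prior, so generated points also reach hard,
    boundary regions of each class (the 'over-complete' part)."""
    feats, labels = [], []
    for c in range(attrs.size(0)):
        z = torch.randn(n_per_class, latent_dim) * (1.0 + jitter)
        a = attrs[c].expand(n_per_class, -1)        # class-attribute conditioning
        feats.append(decoder(z, a))
        labels.append(torch.full((n_per_class,), c, dtype=torch.long))
    return torch.cat(feats), torch.cat(labels)
```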