Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer
- URL: http://arxiv.org/abs/2505.08327v1
- Date: Tue, 13 May 2025 08:07:40 GMT
- Title: Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer
- Authors: Zhenrong Liu, Janne M. J. Huttunen, Mikko Honkala
- Abstract summary: Continual learning (CL) aims to train models that can learn a sequence of tasks without forgetting previously acquired knowledge. Recently, large pre-trained models have been widely adopted in CL for their ability to support both stability and plasticity. We propose two efficient frameworks tailored for class-incremental learning.
- Score: 5.079602839359523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning (CL) aims to train models that can learn a sequence of tasks without forgetting previously acquired knowledge. A core challenge in CL is balancing stability -- preserving performance on old tasks -- and plasticity -- adapting to new ones. Recently, large pre-trained models have been widely adopted in CL for their ability to support both, offering strong generalization for new tasks and resilience against forgetting. However, their high computational cost at inference time limits their practicality in real-world applications, especially those requiring low latency or energy efficiency. To address this issue, we explore model compression techniques, including pruning and knowledge distillation (KD), and propose two efficient frameworks tailored for class-incremental learning (CIL), a challenging CL setting where task identities are unavailable during inference. The pruning-based framework includes pre- and post-pruning strategies that apply compression at different training stages. The KD-based framework adopts a teacher-student architecture, where a large pre-trained teacher transfers downstream-relevant knowledge to a compact student. Extensive experiments on multiple CIL benchmarks demonstrate that the proposed frameworks achieve a better trade-off between accuracy and inference complexity, consistently outperforming strong baselines. We further analyze the trade-offs between the two frameworks in terms of accuracy and efficiency, offering insights into their use across different scenarios.
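The abstract describes the two frameworks only at a high level, so the following are minimal, hedged sketches of the generic techniques they build on, not the paper's actual implementation. The hyperparameters (sparsity level, temperature T, loss weight alpha) and the helper names `prune_linear_layers` and `distillation_step` are illustrative assumptions.

A magnitude-pruning sketch in PyTorch, standing in for the pre-/post-pruning strategies (the paper's pruning criterion and schedule are not given in the abstract):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_linear_layers(model: nn.Module, amount: float = 0.5) -> nn.Module:
    """Globally remove the smallest-magnitude weights of all Linear layers.
    The 50% sparsity target is an arbitrary placeholder."""
    targets = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]
    prune.global_unstructured(targets, pruning_method=prune.L1Unstructured, amount=amount)
    for module, name in targets:
        prune.remove(module, name)  # bake the pruning mask into the weights permanently
    return model
```

A teacher-student distillation step, illustrating how a frozen, large pre-trained teacher could transfer downstream-relevant knowledge to a compact student (standard soft-label KD; the paper's exact loss is not specified in the abstract):

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, x, y, optimizer, T: float = 2.0, alpha: float = 0.5):
    """One training step: cross-entropy on the current task's labels (plasticity)
    plus a KL term that keeps the student close to the teacher (stability)."""
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(x)                      # teacher stays frozen
    s_logits = student(x)

    ce = F.cross_entropy(s_logits, y)
    kd = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                    # standard temperature rescaling

    loss = alpha * ce + (1 - alpha) * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a pre-pruning setup the compression would presumably run before the task sequence is learned, and in post-pruning after it; in the KD route only the compact student is kept at inference time, which is where the reduced inference complexity comes from.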
Related papers
- Continual Learning Beyond Experience Rehearsal and Full Model Surrogates [17.236861687708096]
Continual learning has remained a significant challenge for deep neural networks. Existing solutions often rely on experience rehearsal or full model surrogates to mitigate catastrophic forgetting. We propose a scalable CL approach that eliminates the need for experience rehearsal and full-model surrogates.
arXiv Detail & Related papers (2025-05-28T03:52:34Z) - A Unified Gradient-based Framework for Task-agnostic Continual Learning-Unlearning [30.2773429357068]
Recent advancements in deep models have highlighted the need for intelligent systems that combine continual learning (CL) for knowledge acquisition with machine unlearning (MU) for data removal. We reveal their intrinsic connection through a unified optimization framework based on Kullback-Leibler divergence minimization. Experiments demonstrate that the proposed UG-CLU framework effectively coordinates incremental learning, precise unlearning, and knowledge stability across multiple datasets and model architectures.
arXiv Detail & Related papers (2025-05-21T06:49:05Z) - Continual Learning Should Move Beyond Incremental Classification [51.23416308775444]
Continual learning (CL) is the sub-field of machine learning concerned with accumulating knowledge in dynamic environments. Here, we argue that maintaining such a focus limits both theoretical development and practical applicability of CL methods. We identify three fundamental challenges: (C1) the nature of continuity in learning problems, (C2) the choice of appropriate spaces and metrics for measuring similarity, and (C3) the role of learning objectives beyond classification.
arXiv Detail & Related papers (2025-02-17T15:40:13Z) - DATA: Decomposed Attention-based Task Adaptation for Rehearsal-Free Continual Learning [22.386864304549285]
Continual learning (CL) is essential for Large Language Models (LLMs) to adapt to evolving real-world demands. Recent rehearsal-free methods employ model-based and regularization-based strategies to address this issue. We propose Decomposed Attention-based Task Adaptation (DATA). DATA explicitly decouples and learns both task-specific and task-shared knowledge using high-rank and low-rank task adapters.
arXiv Detail & Related papers (2025-02-17T06:35:42Z) - Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA [19.982853959240497]
Pre-trained vision-language embedding models such as CLIP have been widely adopted and validated in Continual Learning (CL). Existing CL methods primarily focus on continual downstream adaptation using components isolated from the pre-trained model (PTM). We propose a universal and efficient CL approach for CLIP based on Dynamic Rank-Selective LoRA (CoDyRA).
arXiv Detail & Related papers (2024-12-01T23:41:42Z) - Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z) - Temporal-Difference Variational Continual Learning [89.32940051152782]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially. Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks. However, such methods lack theoretical guarantees, making them prone to unexpected failures. We aim to bridge this gap by designing a simple CL method that is theoretically sound and highly performant.
arXiv Detail & Related papers (2024-10-01T12:58:37Z) - Theory on Mixture-of-Experts in Continual Learning [72.42497633220547]
Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks. The mixture-of-experts (MoE) model has recently been shown to effectively mitigate catastrophic forgetting in CL by employing a gating network.
arXiv Detail & Related papers (2024-06-24T08:29:58Z) - Density Distribution-based Learning Framework for Addressing Online Continual Learning Challenges [4.715630709185073]
We introduce a density distribution-based learning framework for online Continual Learning.
Our framework achieves superior average accuracy and time-space efficiency.
Our method outperforms popular CL approaches by a significant margin.
arXiv Detail & Related papers (2023-11-22T09:21:28Z) - Continual Learners are Incremental Model Generalizers [70.34479702177988]
This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers.
We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance.
We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representation during solving downstream tasks.
arXiv Detail & Related papers (2023-06-21T05:26:28Z)