CBPNet: A Continual Backpropagation Prompt Network for Alleviating Plasticity Loss on Edge Devices
- URL: http://arxiv.org/abs/2509.15785v1
- Date: Fri, 19 Sep 2025 09:16:54 GMT
- Title: CBPNet: A Continual Backpropagation Prompt Network for Alleviating Plasticity Loss on Edge Devices
- Authors: Runjie Shao, Boyu Diao, Zijia An, Ruiqi Liu, Yongjun Xu
- Abstract summary: We argue that the reduction in plasticity stems from a lack of update vitality in underutilized parameters during the training process. We propose the Continual Backpropagation Prompt Network (CBPNet), an effective and parameter-efficient framework designed to restore the model's learning vitality.
- Score: 16.318540474216416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To meet the demands of applications like robotics and autonomous driving that require real-time responses to dynamic environments, efficient continual learning methods suitable for edge devices have attracted increasing attention. In this transition, using frozen pretrained models with prompts has become a mainstream strategy to combat catastrophic forgetting. However, this approach introduces a new critical bottleneck: plasticity loss, where the model's ability to learn new knowledge diminishes due to the frozen backbone and the limited capacity of prompt parameters. We argue that the reduction in plasticity stems from a lack of update vitality in underutilized parameters during the training process. To this end, we propose the Continual Backpropagation Prompt Network (CBPNet), an effective and parameter-efficient framework designed to restore the model's learning vitality. We innovatively integrate an Efficient CBP Block that counteracts plasticity decay by adaptively reinitializing these underutilized parameters. Experimental results on edge devices demonstrate CBPNet's effectiveness across multiple benchmarks. On Split CIFAR-100, it improves average accuracy by over 1% against a strong baseline, and on the more challenging Split ImageNet-R, it achieves a state-of-the-art accuracy of 69.41%. This is accomplished by training additional parameters that constitute less than 0.2% of the backbone's size, validating our approach.
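The abstract describes the Efficient CBP Block only at a high level; below is a minimal PyTorch sketch of the continual-backpropagation idea it builds on, namely scoring hidden units by how much they contribute and re-drawing the weights of the least useful ones. The utility measure, the `replacement_rate`, and the two-layer layout are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reinit_low_utility_units(fc_in: nn.Linear, fc_out: nn.Linear,
                             hidden_acts: torch.Tensor,
                             replacement_rate: float = 0.01) -> None:
    """Reinitialize the least-used hidden units of an fc_in -> act -> fc_out block.

    `hidden_acts` holds the activations produced by fc_in for a recent batch
    (shape [batch, hidden]).  Utility is scored as mean |activation| times the
    magnitude of a unit's outgoing weights -- an illustrative proxy for how
    much that unit contributes to the block's output.
    """
    contribution = hidden_acts.abs().mean(dim=0)          # [hidden]
    outgoing = fc_out.weight.abs().sum(dim=0)             # [hidden]
    utility = contribution * outgoing

    n_replace = max(1, int(replacement_rate * utility.numel()))
    idx = torch.topk(utility, n_replace, largest=False).indices

    # Re-draw incoming weights of the selected units; zero their outgoing
    # weights so the reset does not disturb the block's current predictions.
    fresh = torch.empty_like(fc_in.weight[idx])
    nn.init.kaiming_uniform_(fresh, a=5 ** 0.5)
    fc_in.weight[idx] = fresh
    if fc_in.bias is not None:
        fc_in.bias[idx] = 0.0
    fc_out.weight[:, idx] = 0.0
```

In a prompt-based continual learner, a reset of this kind would be applied periodically to the small pool of trainable prompt or adapter parameters while the pretrained backbone stays frozen.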
Related papers
- Preserving Plasticity in Continual Learning with Adaptive Linearity Injection [10.641213440191551]
Loss of plasticity in deep neural networks is the gradual reduction in a model's capacity to learn incrementally. Recent work has shown that deep linear networks tend to be resilient to loss of plasticity. We propose Adaptive Linearization (AdaLin), a general approach that dynamically adapts each neuron's activation function to mitigate plasticity loss.
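The summary gives only the high-level idea of adapting each neuron's activation; one plausible reading is a per-neuron learnable blend between the identity and a base nonlinearity, sketched below. The gating form, its initialization, and the class name are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveLinearUnit(nn.Module):
    """Per-neuron blend of a base nonlinearity and the identity.

    alpha -> 1 makes the unit purely linear (gradient-preserving),
    alpha -> 0 recovers the base activation.  Sketch only; the actual
    AdaLin rule may differ.
    """

    def __init__(self, num_features: int, base_act: nn.Module = nn.GELU()):
        super().__init__()
        self.base_act = base_act
        # Start mostly nonlinear; the gate is learned per feature.
        self.alpha_logit = nn.Parameter(torch.full((num_features,), -2.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_logit)   # broadcasts over the last dim
        return alpha * x + (1.0 - alpha) * self.base_act(x)
```

Dropping such a unit in place of a fixed GELU lets saturated neurons recover a linear response when their gate moves toward one.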
arXiv Detail & Related papers (2025-05-14T15:36:51Z)
- WECAR: An End-Edge Collaborative Inference and Training Framework for WiFi-Based Continuous Human Activity Recognition [23.374051991346633]
We propose WECAR, an end-edge collaborative inference and training framework for WiFi-based continuous HAR. We implement WECAR on heterogeneous hardware, using Jetson Nano as edge devices and ESP32 as end devices. Our experiments across three public WiFi datasets reveal that WECAR not only outperforms several state-of-the-art methods in performance and parameter efficiency, but also achieves a substantial reduction in the model's parameter count after optimization.
arXiv Detail & Related papers (2025-03-09T03:40:27Z)
- Fishing For Cheap And Efficient Pruners At Initialization [4.433137726540548]
Pruning offers a promising solution to mitigate the associated costs and environmental impact of deploying large deep neural networks (DNNs). We introduce Fisher-Taylor Sensitivity (FTS), a computationally cheap and efficient pruning criterion based on the diagonal of the empirical Fisher Information Matrix (FIM). Our method achieves competitive performance against state-of-the-art techniques for one-shot PBT, even under extreme sparsity conditions.
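For context, a common way to turn the empirical Fisher diagonal into a pruning saliency is to combine it with a second-order Taylor term, roughly F_ii * w_i^2; the sketch below follows that generic recipe. The exact FTS formula, loss, and normalization in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fisher_taylor_scores(model: nn.Module, loader, device: str = "cpu") -> dict:
    """Per-parameter saliency from the empirical Fisher diagonal.

    The Fisher diagonal is estimated as the mean squared gradient of the
    loss over a few batches; each weight's saliency is F_ii * w_i**2, a
    Taylor-style proxy for the loss increase when the weight is removed.
    Generic sketch, not the paper's exact criterion.
    """
    model.eval()
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: (fisher[n] / max(n_batches, 1)) * p.detach() ** 2
            for n, p in model.named_parameters()}
```

Weights with the smallest scores would be the first candidates for removal in a one-shot pruning pass.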
arXiv Detail & Related papers (2025-02-17T05:22:23Z)
- Advancing Weight and Channel Sparsification with Enhanced Saliency [27.89287351110155]
Pruning aims to accelerate and compress models by removing redundant parameters. This removal is irreversible, often leading to subpar performance in pruned models. We introduce an efficient, innovative paradigm to enhance a given importance criterion for either unstructured or structured sparsity.
arXiv Detail & Related papers (2025-02-05T22:56:55Z)
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by the Kronecker product to Aggregate Low Rank Experts. Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
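To make the Kronecker-product construction concrete, the sketch below parameterizes a mergeable adapter update as a sum of Kronecker products whose right-hand factors are themselves low rank. Factor shapes, initialization, and the aggregation rule are illustrative assumptions, not ALoRE's exact design.

```python
import torch
import torch.nn as nn

class KroneckerLowRankAdapter(nn.Module):
    """Adapter whose update is a sum of Kronecker products of small factors."""

    def __init__(self, dim: int, n_experts: int = 4, block: int = 8, rank: int = 1):
        super().__init__()
        assert dim % block == 0
        sub = dim // block
        self.A = nn.Parameter(torch.randn(n_experts, block, block) * 0.01)
        self.U = nn.Parameter(torch.randn(n_experts, sub, rank) * 0.01)
        self.V = nn.Parameter(torch.zeros(n_experts, rank, sub))  # zero init: delta starts at 0

    def delta_weight(self) -> torch.Tensor:
        # Each "expert" contributes kron(A_n, U_n @ V_n); the sum has shape [dim, dim].
        return sum(torch.kron(self.A[n], self.U[n] @ self.V[n])
                   for n in range(self.A.shape[0]))

    def forward(self, x: torch.Tensor, frozen_weight: torch.Tensor) -> torch.Tensor:
        # Effective weight = frozen backbone weight + mergeable adapter delta.
        return x @ (frozen_weight + self.delta_weight()).t()
```

Because the delta is an explicit weight matrix, it can be folded into the frozen weight after training, leaving inference cost unchanged.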
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
- Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning [32.918269107547616]
Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks. Recent sparse learning methods have shown promising performance up to moderate sparsity levels such as 95% and 98%. We propose a collection of techniques that enable the continuous learning of networks without accuracy collapse even at extreme sparsities.
arXiv Detail & Related papers (2024-11-20T18:54:53Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose FOSTER, a novel two-stage learning paradigm that empowers the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
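Re-parameterization methods of this kind rely on the fact that parallel linear branches collapse exactly into a single convolution. The sketch below folds a generic 3x3 + 1x1 branch pair; it illustrates the identity OREPA exploits rather than its specific training-time block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def merge_parallel_convs(conv3x3: nn.Conv2d, conv1x1: nn.Conv2d) -> nn.Conv2d:
    """Fold a parallel 3x3 + 1x1 branch into one 3x3 convolution (stride 1)."""
    assert conv3x3.kernel_size == (3, 3) and conv1x1.kernel_size == (1, 1)
    merged = nn.Conv2d(conv3x3.in_channels, conv3x3.out_channels,
                       kernel_size=3, padding=1, bias=True)
    # Zero-pad the 1x1 kernel to 3x3 and add it to the 3x3 kernel.
    merged.weight.copy_(conv3x3.weight + F.pad(conv1x1.weight, [1, 1, 1, 1]))
    bias = torch.zeros(conv3x3.out_channels)
    if conv3x3.bias is not None:
        bias += conv3x3.bias
    if conv1x1.bias is not None:
        bias += conv1x1.bias
    merged.bias.copy_(bias)
    return merged
```

Because the fold is exact for stride-1, same-channel branches, the merged layer reproduces the summed branch outputs with a single convolution.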
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
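The reparameterisation referred to above can be written as w = θ·|θ|^(α−1), with θ trained in place of w; the layer below is a minimal sketch of that idea (the class name and the default α are illustrative choices).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PowerpropLinear(nn.Module):
    """Linear layer with a Powerpropagation-style weight parameterisation.

    Effective weight: w = theta * |theta|**(alpha - 1)  (alpha = 1 recovers
    a standard layer).  Gradients w.r.t. theta carry a |theta|**(alpha - 1)
    factor, so low-magnitude weights receive ever-smaller updates and the
    trained weights pile up near zero, making magnitude pruning safer.
    """

    def __init__(self, in_features: int, out_features: int, alpha: float = 2.0):
        super().__init__()
        self.alpha = alpha
        self.theta = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.kaiming_uniform_(self.theta)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.theta * self.theta.abs().pow(self.alpha - 1.0)
        return F.linear(x, weight, self.bias)
```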
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Layer Pruning on Demand with Intermediate CTC [50.509073206630994]
We present a training and pruning method for ASR based on connectionist temporal classification (CTC).
We show that a Transformer-CTC model can be pruned to various depths on demand, improving the real-time factor from 0.005 to 0.002 on GPU.
arXiv Detail & Related papers (2021-06-17T02:40:18Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)