Continual Deep Learning on the Edge via Stochastic Local Competition among Subnetworks
- URL: http://arxiv.org/abs/2407.10758v1
- Date: Mon, 15 Jul 2024 14:36:05 GMT
- Title: Continual Deep Learning on the Edge via Stochastic Local Competition among Subnetworks
- Authors: Theodoros Christophides, Kyriakos Tolias, Sotirios Chatzis
- Abstract summary: Continual learning on edge devices poses unique challenges due to stringent resource constraints.
This paper introduces a novel method that leverages competition principles to promote sparsity.
It significantly reduces deep network memory footprint and computational demand.
- Score: 6.367254849444475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning on edge devices poses unique challenges due to stringent resource constraints. This paper introduces a novel method that leverages stochastic competition principles to promote sparsity, significantly reducing deep network memory footprint and computational demand. Specifically, we propose deep networks that comprise blocks of units that compete locally to win the representation of each arising new task; competition takes place in a stochastic manner. This type of network organization results in sparse task-specific representations from each network layer; the sparsity pattern is obtained during training and is different among tasks. Crucially, our method sparsifies both the weights and the weight gradients, thus facilitating training on edge devices. This is performed on the grounds of winning probability for each unit in a block. During inference, the network retains only the winning unit and zeroes-out all weights pertaining to non-winning units for the task at hand. Thus, our approach is specifically tailored for deployment on edge devices, providing an efficient and scalable solution for continual learning in resource-limited environments.
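The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that each unit's winning probability comes from a softmax over its block's linear activations and that the retained unit at inference is the one with the highest average winning probability; the abstract does not specify either choice. The names `block_win_probs`, `stochastic_lwta`, `prune_for_inference`, and the `temperature` parameter are hypothetical.

```python
import numpy as np

def block_win_probs(h, block_size, temperature=1.0):
    """Softmax over each block of competing units -> (batch, n_blocks, block_size)."""
    blocks = h.reshape(h.shape[0], -1, block_size) / temperature
    blocks = blocks - blocks.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(blocks)
    return e / e.sum(axis=-1, keepdims=True)

def stochastic_lwta(x, W, block_size, rng, temperature=1.0):
    """Linear layer followed by stochastic local winner-takes-all (assumed form)."""
    h = x @ W                                            # (batch, units)
    probs = block_win_probs(h, block_size, temperature)
    # Sample one winner per block by inverse-CDF sampling.
    u = rng.random(probs.shape[:-1] + (1,))
    winners = (probs.cumsum(axis=-1) < u).sum(axis=-1)
    winners = np.minimum(winners, block_size - 1)        # guard against rounding
    mask = np.eye(block_size)[winners]                   # one-hot per block
    out = h.reshape(probs.shape) * mask                  # losers are zeroed
    return out.reshape(h.shape), probs

def prune_for_inference(W, mean_probs):
    """Keep the most probable unit per block; zero the other columns of W."""
    block_size = mean_probs.shape[-1]
    winners = mean_probs.argmax(axis=-1)                 # (n_blocks,)
    keep = np.eye(block_size)[winners].reshape(-1)       # (units,)
    return W * keep                                      # broadcasts over rows

# Toy usage: 8 inputs, 6 units arranged as 3 blocks of 2 competing units.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 6))
out, probs = stochastic_lwta(x, W, block_size=2, rng=rng)
W_task = prune_for_inference(W, probs.mean(axis=0))
```

In this toy setup only one unit per block carries a nonzero activation in `out`, and `W_task` keeps one weight column per block, which is the kind of task-specific sparsity pattern the abstract refers to.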
Related papers
- Competing Mutual Information Constraints with Stochastic Competition-based Activations for Learning Diversified Representations [5.981521556433909]
This work aims to address the long-established problem of learning diversified representations.
We combine information-theoretic arguments with competition-based activations.
As we experimentally show, the resulting networks yield significant discriminative representation learning abilities.
arXiv Detail & Related papers (2022-01-10T20:12:13Z)
- Stochastic Local Winner-Takes-All Networks Enable Profound Adversarial Robustness [9.017401570529135]
This work explores the potency of competition-based activations, namely Local Winner-Takes-All (LWTA).
We replace the conventional ReLU-based nonlinearities with blocks comprising locally and stochastically competing linear units.
As we experimentally show, the arising networks yield state-of-the-art robustness against powerful adversarial attacks.
arXiv Detail & Related papers (2021-12-05T20:00:10Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
arXiv Detail & Related papers (2021-03-02T03:09:03Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks to fit mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- HYDRA: Pruning Adversarially Robust Neural Networks [58.061681100058316]
Deep learning faces two key challenges: lack of robustness against adversarial attacks and large neural network size.
We propose to make pruning techniques aware of the robust training objective and let the training objective guide the search for which connections to prune.
We demonstrate that our approach, titled HYDRA, achieves compressed networks with state-of-the-art benign and robust accuracy, simultaneously.
arXiv Detail & Related papers (2020-02-24T19:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.