Update Compression for Deep Neural Networks on the Edge
- URL: http://arxiv.org/abs/2203.04516v1
- Date: Wed, 9 Mar 2022 04:20:43 GMT
- Title: Update Compression for Deep Neural Networks on the Edge
- Authors: Bo Chen, Ali Bakhshi, Gustavo Batista, Brian Ng, Tat-Jun Chin
- Abstract summary: An increasing number of AI applications involve the execution of deep neural networks (DNNs) on edge devices.
Many practical reasons motivate the need to update the DNN model on the edge device post-deployment.
We develop a simple approach based on matrix factorisation to compress the model update.
- Score: 33.57905298104467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An increasing number of artificial intelligence (AI) applications involve the
execution of deep neural networks (DNNs) on edge devices. Many practical
reasons motivate the need to update the DNN model on the edge device
post-deployment, such as refining the model, concept drift, or outright change
in the learning task. In this paper, we consider the scenario where retraining
can be done on the server side based on a copy of the DNN model, with only the
necessary data transmitted to the edge to update the deployed model. However,
due to bandwidth constraints, we want to minimise the transmission required to
achieve the update. We develop a simple approach based on matrix factorisation
to compress the model update -- this differs from compressing the model itself.
The key idea is to preserve existing knowledge in the current model and
optimise only small additional parameters for the update which can be used to
reconstitute the model on the edge. We compared our method to similar
techniques used in federated learning; our method usually requires less than
half of the update size of existing methods to achieve the same accuracy.
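The reparameterisation idea in the abstract (keep the deployed weights fixed, optimise only a small number of additional parameters, transmit those, and reconstitute the model on the edge) can be illustrated with a short sketch. The NumPy snippet below is a minimal illustration only, assuming an additive low-rank factorisation of the weight update; the function names (`server_update`, `apply_update`) and the SVD-based fitting step are illustrative assumptions, not the paper's actual procedure.

```python
# Minimal sketch of low-rank update compression (illustrative assumptions only).
# The edge keeps its current weight W0; the server fits small factors (A, B);
# only those factors are transmitted, and the edge reconstitutes W0 + A @ B.
import numpy as np

rng = np.random.default_rng(0)

def server_update(W0: np.ndarray, W_target: np.ndarray, rank: int):
    """Fit small factors A, B so that W0 + A @ B approximates the retrained weight.

    Here the fit is a truncated SVD of the weight difference; in practice the
    small factors would be optimised directly against the task loss on the server.
    """
    U, s, Vt = np.linalg.svd(W_target - W0, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (d, rank)
    B = Vt[:rank, :]             # shape (rank, k)
    return A, B                  # only these are sent to the edge

def apply_update(W0: np.ndarray, A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Edge side: reconstitute the updated weight from the received factors."""
    return W0 + A @ B

# Toy example: a 512x512 layer updated with a rank-8 correction.
d = k = 512
W0 = rng.standard_normal((d, k))
W_target = W0 + 0.01 * rng.standard_normal((d, k))  # stand-in for the retrained weight
A, B = server_update(W0, W_target, rank=8)

full_update = W_target.size            # parameters if the raw update were sent
compressed = A.size + B.size           # parameters actually transmitted
print(f"update size: {compressed} vs {full_update} "
      f"({compressed / full_update:.1%} of a full transmission)")
print("reconstruction error:", np.linalg.norm(apply_update(W0, A, B) - W_target))
```

The sketch shows where the bandwidth saving comes from: a rank-r correction costs r(d + k) parameters instead of dk, so for small r the transmitted update is a small fraction of the full weight matrix.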
Related papers
- Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation [56.79064699832383]
We establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models only need to perform forward propagation.
In our CEMA, to reduce the communication burden, we devise two criteria to exclude unnecessary samples from uploading to the cloud.
arXiv Detail & Related papers (2024-02-27T08:47:19Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST)
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Boundary Unlearning [5.132489421775161]
We propose Boundary Unlearning, a rapid yet effective way to unlearn an entire class from a trained machine learning model.
We extensively evaluate Boundary Unlearning on image classification and face recognition tasks, with an expected speed-up of $17\times$ and $19\times$, respectively.
arXiv Detail & Related papers (2023-03-21T03:33:18Z) - Adversarial Learning Networks: Source-free Unsupervised Domain
Incremental Learning [0.0]
In a non-stationary environment, updating a DNN model requires parameter re-training or model fine-tuning.
We propose an unsupervised source-free method to update DNN classification models.
Unlike existing methods, our approach can update a DNN model incrementally for non-stationary source and target tasks without storing past training data.
arXiv Detail & Related papers (2023-01-28T02:16:13Z) - Paoding: Supervised Robustness-preserving Data-free Neural Network
Pruning [3.6953655494795776]
We study neural network pruning in the data-free context.
We replace the traditional aggressive one-shot strategy with a conservative one that treats pruning as a progressive process.
Our method is implemented as a Python package named Paoding and evaluated with a series of experiments on diverse neural network models.
arXiv Detail & Related papers (2022-04-02T07:09:17Z) - LCS: Learning Compressible Subspaces for Adaptive Network Compression at
Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z) - Developing RNN-T Models Surpassing High-Performance Hybrid Models with
Customization Capability [46.73349163361723]
Recurrent neural network transducer (RNN-T) is a promising end-to-end (E2E) model that may replace the popular hybrid model for automatic speech recognition.
We describe our recent development of RNN-T models with reduced GPU memory consumption during training.
We study how to customize RNN-T models to a new domain, which is important for deploying E2E models to practical scenarios.
arXiv Detail & Related papers (2020-07-30T02:35:20Z) - Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z) - Conditional Neural Architecture Search [5.466990830092397]
It is often the case that a well-trained ML model does not fit the constraints of the target edge deployment platform.
We propose a conditional neural architecture search method using a GAN, which produces feasible ML models for different platforms.
arXiv Detail & Related papers (2020-06-06T20:39:33Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)