Now that I can see, I can improve: Enabling data-driven finetuning of
CNNs on the edge
- URL: http://arxiv.org/abs/2006.08554v1
- Date: Mon, 15 Jun 2020 17:16:45 GMT
- Title: Now that I can see, I can improve: Enabling data-driven finetuning of
CNNs on the edge
- Authors: Aditya Rajagopal, Christos-Savvas Bouganis
- Abstract summary: This paper provides a first step towards enabling CNN finetuning on an edge device based on structured pruning.
It explores the performance gains and costs of doing so and presents an open-source framework that allows the deployment of such approaches.
- Score: 11.789983276366987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In today's world, a vast amount of data is being generated by edge devices
that can be used as valuable training data to improve the performance of
machine learning algorithms in terms of the achieved accuracy or to reduce the
compute requirements of the model. However, due to user data privacy concerns
as well as storage and communication bandwidth limitations, this data cannot be
moved from the device to the data centre for further improvement of the model
and subsequent deployment. As such there is a need for increased edge
intelligence, where the deployed models can be fine-tuned on the edge, leading
to improved accuracy and/or reducing the model's workload as well as its memory
and power footprint. In the case of Convolutional Neural Networks (CNNs), both
the weights of the network as well as its topology can be tuned to adapt to the
data that it processes. This paper provides a first step towards enabling CNN
finetuning on an edge device based on structured pruning. It explores the
performance gains and costs of doing so and presents an extensible open-source
framework that allows the deployment of such approaches on a wide range of
network architectures and devices. The results show that on average, data-aware
pruning with retraining can provide 10.2pp increased accuracy over a wide range
of subsets, networks and pruning levels with a maximum improvement of 42.0pp
over pruning and retraining in a manner agnostic to the data being processed by
the network.
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns [1.1965844936801797]
Convolutional neural networks (ConvNets) exert severe demands on local device resources.
This brief presents work toward utilizing static convolutional filters to design efficient ConvNet architectures.
arXiv Detail & Related papers (2024-07-20T10:18:42Z) - A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z) - Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z) - Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on
Edge Devices [19.335535517714703]
The prediction accuracy of the deep neural networks (DNNs) after deployment at the edge can suffer with time due to shifts in the distribution of the new data.
Recent prediction-time unsupervised DNN adaptation techniques have been introduced that improve prediction accuracy of the models for noisy data by re-tuning the batch normalization parameters.
This paper, for the first time, performs a comprehensive measurement study of such techniques to quantify their performance and energy on various edge devices.
arXiv Detail & Related papers (2022-03-21T19:10:40Z) - EffCNet: An Efficient CondenseNet for Image Classification on NXP
BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z) - perf4sight: A toolflow to model CNN training performance on Edge GPUs [16.61258138725983]
This work proposes perf4sight, an automated methodology for developing accurate models that predict CNN training memory footprint and latency.
With PyTorch as the framework and NVIDIA Jetson TX2 as the target device, the developed models predict training memory footprint and latency with 95% and 91% accuracy respectively.
arXiv Detail & Related papers (2021-08-12T07:55:37Z) - Training Robust Graph Neural Networks with Topology Adaptive Edge
Dropping [116.26579152942162]
Graph neural networks (GNNs) are processing architectures that exploit graph structural information to model representations from network data.
Despite their success, GNNs suffer from sub-optimal generalization performance given limited training data.
This paper proposes Topology Adaptive Edge Dropping to improve generalization performance and learn robust GNN models.
arXiv Detail & Related papers (2021-06-05T13:20:36Z) - Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive
Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z) - A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration
Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.