On the fly Deep Neural Network Optimization Control for Low-Power
Computer Vision
- URL: http://arxiv.org/abs/2309.01824v1
- Date: Mon, 4 Sep 2023 21:26:26 GMT
- Title: On the fly Deep Neural Network Optimization Control for Low-Power
Computer Vision
- Authors: Ishmeet Kaur, Adwaita Janardhan Jadhav
- Abstract summary: State-of-the-art computer vision techniques rely on large Deep Neural Networks (DNNs) that are usually too power-hungry to be deployed on resource-constrained edge devices.
This paper presents a novel technique to allow DNNs to adapt their accuracy and energy consumption during run-time, without the need for any re-training.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Processing visual data on mobile devices has many applications, e.g.,
emergency response and tracking. State-of-the-art computer vision techniques
rely on large Deep Neural Networks (DNNs) that are usually too power-hungry to
be deployed on resource-constrained edge devices. Many techniques improve the
efficiency of DNNs by using sparsity or quantization. However, the accuracy and
efficiency of these techniques cannot be adapted for diverse edge applications
with different hardware constraints and accuracy requirements. This paper
presents a novel technique to allow DNNs to adapt their accuracy and energy
consumption during run-time, without the need for any re-training. Our
technique called AdaptiveActivation introduces a hyper-parameter that controls
the output range of the DNNs' activation function to dynamically adjust the
sparsity and precision in the DNN. AdaptiveActivation can be applied to any
existing pre-trained DNN to improve their deployability in diverse edge
environments. We conduct experiments on popular edge devices and show that the
accuracy is within 1.5% of the baseline. We also show that our approach
requires 10%-38% less memory than the baseline techniques, leading to more
accuracy-efficiency tradeoff options.
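The listing gives no code, so here is a minimal PyTorch sketch of one plausible reading of AdaptiveActivation: a single run-time hyper-parameter caps the activation's output range, and that bounded range is fake-quantized, so tightening the cap or the bit-width trades accuracy for sparsity and precision without re-training. The class name, the parameter `alpha`, and the uniform quantization step are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveActivation(nn.Module):
    """Illustrative stand-in for the paper's activation control.

    `alpha` bounds the activation's output range at run time and
    `bits` is the fixed-point precision used inside that range;
    both can be changed between inferences without re-training.
    """

    def __init__(self, alpha: float = 6.0, bits: int = 8):
        super().__init__()
        self.alpha = alpha  # output-range cap, tunable at run time
        self.bits = bits    # precision within [0, alpha]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The floor at 0 keeps activations sparse (ReLU-like); the cap
        # at `alpha` bounds the dynamic range so fewer bits suffice.
        y = torch.clamp(x, min=0.0, max=self.alpha)
        # Fake-quantize to 2**bits uniform levels over [0, alpha].
        scale = self.alpha / (2 ** self.bits - 1)
        return torch.round(y / scale) * scale

# Replacing each ReLU in a pre-trained model with this module is one
# way such a knob could be applied without any re-training:
layer = nn.Sequential(nn.Linear(64, 32), AdaptiveActivation(alpha=4.0, bits=6))
```

Because only `alpha` and `bits` change at run time, the adjustment can be made on an already-deployed model, which is the property the abstract emphasizes.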
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource limits of IoVT devices by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- DCP: Learning Accelerator Dataflow for Neural Network via Propagation [52.06154296196845]
This work proposes an efficient data-centric approach, named Dataflow Code Propagation (DCP), to automatically find the optimal dataflow for DNN layers in seconds without human effort.
DCP learns a neural predictor to efficiently update the dataflow codes towards the desired gradient directions to minimize various optimization objectives.
For example, without using additional training data, DCP surpasses the GAMMA method that performs a full search using thousands of samples.
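As a hedged sketch of the propagation idea described above: a differentiable predictor maps a continuously relaxed dataflow code to a predicted cost, and the code is updated along the predictor's gradient rather than by exhaustive search. The 16-dimensional encoding and the MLP predictor below are assumptions for illustration, not DCP's actual design.

```python
import torch
import torch.nn as nn

# Hypothetical: a dataflow code relaxed into a continuous 16-dim vector.
code = torch.randn(16, requires_grad=True)

# Assumed predictor: maps a code to a scalar cost such as latency.
predictor = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

opt = torch.optim.Adam([code], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    cost = predictor(code)  # predicted objective, not a measurement
    cost.backward()         # gradient of the cost w.r.t. the code itself
    opt.step()              # move the code toward lower predicted cost

# A real system would project `code` back to a legal dataflow and
# periodically re-fit the predictor on measured costs.
```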
arXiv Detail & Related papers (2024-10-09T05:16:44Z)
- Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves 39% average energy reduction at a 1.7% average accuracy loss and significantly outperforms state-of-the-art approaches.
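The framework itself is automated and hardware-aware; the sketch below only illustrates the two primitives it jointly employs, magnitude pruning and per-layer fake quantization, with hypothetical per-layer settings.

```python
import torch

def prune_and_quantize(w: torch.Tensor, sparsity: float, bits: int) -> torch.Tensor:
    """Magnitude-prune a weight tensor, then uniformly fake-quantize it."""
    # Pruning: zero the smallest `sparsity` fraction of weights by magnitude.
    k = int(sparsity * w.numel())
    if k > 0:
        threshold = w.abs().flatten().kthvalue(k).values
        w = torch.where(w.abs() > threshold, w, torch.zeros_like(w))
    # Quantization: symmetric uniform levels; each layer may use its own bits.
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(w / scale) * scale if scale > 0 else w

# Hypothetical per-layer (sparsity, bit-width) settings a search might emit:
layer_cfg = {"conv1": (0.5, 8), "conv2": (0.7, 4)}
pruned = prune_and_quantize(torch.randn(32, 32), *layer_cfg["conv2"])
```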
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
- Survey on Computer Vision Techniques for Internet-of-Things Devices [0.0]
Deep neural networks (DNNs) are state-of-the-art techniques for solving computer vision problems.
DNNs require billions of parameters and operations to achieve state-of-the-art results.
This requirement makes DNNs extremely compute-, memory-, and energy-hungry, and consequently difficult to deploy on small battery-powered Internet-of-Things (IoT) devices with limited computing resources.
arXiv Detail & Related papers (2023-08-02T03:41:24Z)
- Designing and Training of Lightweight Neural Networks on Edge Devices using Early Halting in Knowledge Distillation [16.74710649245842]
This paper presents a novel approach for designing and training lightweight Deep Neural Networks (DNNs) on edge devices.
The approach considers the available storage, processing speed, and allowable maximum processing time.
We introduce a novel early halting technique, which preserves network resources.
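The summary does not define the halting rule, so the following is only a schematic: a standard knowledge-distillation loop that halts early once a (here, hypothetical) time budget is spent, standing in for the paper's resource-preserving criterion.

```python
import time
import torch
import torch.nn.functional as F

def distill(student, teacher, loader, budget_s: float, T: float = 4.0):
    """Knowledge distillation that halts once a time budget is exhausted."""
    opt = torch.optim.SGD(student.parameters(), lr=0.01)
    start = time.time()
    for x, _ in loader:
        if time.time() - start > budget_s:
            break  # early halting: stop consuming device resources
        with torch.no_grad():
            t_logits = teacher(x)  # teacher provides soft targets
        s_logits = student(x)
        # Classic soft-target KD loss; temperature T softens both logit sets.
        loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                        F.softmax(t_logits / T, dim=1),
                        reduction="batchmean") * T * T
        opt.zero_grad()
        loss.backward()
        opt.step()
```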
arXiv Detail & Related papers (2022-09-30T16:18:24Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling scheme that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
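The scheduler itself is NPU- and queue-specific; the sketch below shows only the early-exit mechanism it builds on: an intermediate classifier lets an input leave the network early once its prediction is confident enough, and a preemptive scheduler can force that exit. Layer shapes and the confidence threshold are illustrative.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Two-stage backbone whose second stage can be skipped or preempted."""

    def __init__(self, num_classes: int = 10, threshold: float = 0.9):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.exit1 = nn.Linear(64, num_classes)  # cheap intermediate exit
        self.stage2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.exit2 = nn.Linear(64, num_classes)  # final classifier
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.stage1(x)
        p1 = torch.softmax(self.exit1(h), dim=1)
        # Leave early when every sample in the batch is confident enough;
        # a preemptive scheduler could likewise force this exit point.
        if p1.max(dim=1).values.min() >= self.threshold:
            return p1
        return torch.softmax(self.exit2(self.stage2(h)), dim=1)
```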
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices [19.335535517714703]
The prediction accuracy of deep neural networks (DNNs) deployed at the edge can degrade over time due to shifts in the distribution of new data.
Recent prediction-time unsupervised DNN adaptation techniques improve the accuracy of such models on noisy data by re-tuning the batch normalization parameters.
This paper, for the first time, performs a comprehensive measurement study of such techniques to quantify their performance and energy on various edge devices.
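The re-tuning these benchmarks study can be sketched directly in PyTorch: batch-normalization statistics are re-estimated from unlabeled test batches while all other weights stay frozen. The momentum value is an illustrative choice.

```python
import torch
import torch.nn as nn

def adapt_batchnorm(model: nn.Module, test_batches, momentum: float = 0.1):
    """Re-estimate BN running statistics on unlabeled test data."""
    model.train()  # BN layers only update running stats in train mode
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.momentum = momentum  # how fast stats track the new data
    with torch.no_grad():  # no labels and no gradient updates needed
        for x in test_batches:
            model(x)  # each forward pass refreshes the BN statistics
    model.eval()
    return model
```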
arXiv Detail & Related papers (2022-03-21T19:10:40Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a sensitivity-informed deep neural network (SIDNN) to solve the AC optimal power flow (AC-OPF) problem.
The proposed SIDNN is compatible with a broad range of OPF schemes.
It can be seamlessly integrated into other learning-to-OPF schemes.
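A hedged reading of "sensitivity-informed": in addition to matching OPF solutions, training penalizes mismatch between the network's input-output Jacobian and the true solution sensitivities. The input/output sizes and the weight `lam` below are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

# Assumed sizes: 8 load inputs, 4 OPF decision outputs.
net = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 4))

def sidnn_loss(x, y, sens, lam: float = 0.1):
    """x: inputs (8,), y: OPF solution (4,), sens: dy/dx sensitivities (4, 8)."""
    fit = nn.functional.mse_loss(net(x), y)
    # Match the network's input-output Jacobian to the true sensitivities.
    jac = torch.autograd.functional.jacobian(net, x, create_graph=True)
    return fit + lam * nn.functional.mse_loss(jac, sens)

loss = sidnn_loss(torch.randn(8), torch.randn(4), torch.randn(4, 8))
loss.backward()  # trains on solutions and sensitivities jointly
```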
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training [81.85361544720885]
We propose FracTrain, which integrates progressive fractional quantization to gradually increase the precision of activations, weights, and gradients during training.
FracTrain reduces the computational cost and hardware-quantified energy/latency of DNN training while achieving comparable or better (-0.12% to +1.87%) accuracy.
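A schematic of the progressive part only: a bit-width schedule starts training at low precision and raises it over epochs, with a generic uniform fake-quantizer standing in for FracTrain's fractional quantization. The bounds `low` and `high` are illustrative.

```python
import torch

def fake_quant(t: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric fake quantization of a tensor to `bits` bits."""
    scale = t.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(t / scale) * scale if scale > 0 else t

def bits_schedule(epoch: int, total_epochs: int, low: int = 3, high: int = 8) -> int:
    """Raise precision as training proceeds: cheap early, accurate late."""
    return low + (high - low) * epoch // max(total_epochs - 1, 1)

# During training, weights, activations, and gradients would each pass
# through fake_quant(t, bits_schedule(epoch, total_epochs)) every step.
for epoch in range(5):
    print(epoch, bits_schedule(epoch, 5))  # 3, 4, 5, 6, 8 (integer steps)
```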
arXiv Detail & Related papers (2020-12-24T05:24:10Z)
- FSpiNN: An Optimization Framework for Memory- and Energy-Efficient Spiking Neural Networks [14.916996986290902]
Spiking Neural Networks (SNNs) offer unsupervised learning capability due to the spike-timing-dependent plasticity (STDP) rule.
However, state-of-the-art SNNs require a large memory footprint to achieve high accuracy.
We propose FSpiNN, an optimization framework for obtaining memory- and energy-efficient SNNs for training and inference processing.
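For context on the STDP rule mentioned above, a minimal pair-based update; the amplitudes and time constant are generic textbook values, not FSpiNN's.

```python
import math

def stdp_dw(dt_ms: float, a_plus: float = 0.01, a_minus: float = 0.012,
            tau_ms: float = 20.0) -> float:
    """Weight change for a spike pair, dt = t_post - t_pre in milliseconds.

    Pre-before-post (dt > 0) potentiates; post-before-pre depresses.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)   # potentiation
    return -a_minus * math.exp(dt_ms / tau_ms)      # depression

# A presynaptic spike 5 ms before the postsynaptic one strengthens the synapse:
print(stdp_dw(5.0))  # small positive increment
```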
arXiv Detail & Related papers (2020-07-17T09:40:26Z)