A Survey of Methods for Low-Power Deep Learning and Computer Vision
- URL: http://arxiv.org/abs/2003.11066v1
- Date: Tue, 24 Mar 2020 18:47:24 GMT
- Title: A Survey of Methods for Low-Power Deep Learning and Computer Vision
- Authors: Abhinav Goel, Caleb Tung, Yung-Hsiang Lu, and George K. Thiruvathukal
- Abstract summary: Deep neural networks (DNNs) are successful in many computer vision tasks.
The most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive.
Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy.
- Score: 0.4234843176066353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are successful in many computer vision tasks.
However, the most accurate DNNs require millions of parameters and operations,
making them energy-, computation-, and memory-intensive. This impedes the
deployment of large DNNs in low-power devices with limited compute resources.
Recent research improves DNN models by reducing the memory requirement, energy
consumption, and number of operations without significantly decreasing the
accuracy. This paper surveys the progress of low-power deep learning and
computer vision, specifically with regard to inference, and discusses the
methods for compacting and accelerating DNN models. The techniques can be
divided into four major categories: (1) parameter quantization and pruning, (2)
compressed convolutional filters and matrix factorization, (3) network
architecture search, and (4) knowledge distillation. We analyze the accuracy,
advantages, disadvantages, and potential solutions to the problems with the
techniques in each category. We also discuss new evaluation metrics as a
guideline for future research.
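As a concrete illustration of the first category (parameter quantization and pruning), the sketch below performs magnitude-based weight pruning on a single layer's weight matrix. The function name, threshold scheme, and sparsity level are illustrative assumptions, not code from the surveyed papers.

```python
# Minimal sketch of magnitude-based weight pruning (illustrative only).
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 90% of a random fully connected layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128))
w_pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {np.count_nonzero(w_pruned) / w_pruned.size:.2f}")
```

The surviving nonzero weights can then be stored in a sparse format, which is where the memory and operation savings come from.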
Related papers
- Survey on Computer Vision Techniques for Internet-of-Things Devices [0.0]
Deep neural networks (DNNs) are state-of-the-art techniques for solving computer vision problems.
DNNs require billions of parameters and operations to achieve state-of-the-art results.
This requirement makes DNNs extremely compute, memory, and energy-hungry, and consequently difficult to deploy on small battery-powered Internet-of-Things (IoT) devices with limited computing resources.
arXiv Detail & Related papers (2023-08-02T03:41:24Z)
- To Spike or Not To Spike: A Digital Hardware Perspective on Deep Learning Acceleration [4.712922151067433]
As deep learning models scale, they become increasingly competitive across domains spanning computer vision to natural language processing.
The biological brain outperforms any large-scale deep learning (DL) model in power efficiency.
Neuromorphic computing tries to mimic the brain's operation to improve the efficiency of DL models.
arXiv Detail & Related papers (2023-06-27T19:04:00Z)
- Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey [4.856755747052137]
Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML).
Recent advancements in DNNs have achieved beyond-human accuracy in many tasks, but at the cost of high computational complexity.
This article provides a comprehensive survey and analysis of hardware approximation techniques for DNN accelerators.
arXiv Detail & Related papers (2022-03-16T16:33:13Z)
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising energy-efficient approach to realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- A Survey of Quantization Methods for Efficient Neural Network Inference [75.55159744950859]
Quantization is the problem of distributing continuous real-valued numbers over a fixed discrete set of numbers to minimize the number of bits required.
It has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas.
Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x.
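To make the float-to-integer mapping concrete, here is a minimal sketch of uniform affine quantization to a configurable bit width; the helper names and rounding scheme are illustrative assumptions rather than a specific method from the survey.

```python
# Minimal sketch of uniform affine quantization to b-bit unsigned integers
# (illustrative assumptions, not a specific surveyed algorithm).
import numpy as np

def quantize(x: np.ndarray, num_bits: int = 4):
    """Map float values onto 2**num_bits evenly spaced integer levels."""
    qmax = 2 ** num_bits - 1
    scale = (x.max() - x.min()) / qmax           # step size between levels
    zero_point = np.round(-x.min() / scale)      # integer offset for x.min()
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, s, z = quantize(x, num_bits=4)
x_hat = dequantize(q, s, z)
print("max abs error:", np.abs(x - x_hat).max())
```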
arXiv Detail & Related papers (2021-03-25T06:57:11Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
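The one-shot search strategy mentioned above can be illustrated with a small weight-sharing supernet: it is trained once, and candidate sub-networks are then scored with the shared weights instead of being trained from scratch. The module and candidate operations below are illustrative assumptions, not the MS-RANAS implementation.

```python
# Minimal sketch of one-shot (weight-sharing) architecture search (illustrative).
import torch
import torch.nn as nn

class MixedConv(nn.Module):
    """One searchable layer: candidate 3x3 and 5x5 convs share the supernet."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
        ])

    def forward(self, x, choice: int):
        return self.ops[choice](x)

supernet = nn.ModuleList([MixedConv(8) for _ in range(3)])
x = torch.randn(1, 8, 32, 32)

# After supernet training (omitted), each architecture is just a tuple of
# per-layer choices and can be evaluated directly with the shared weights.
def run(arch):
    h = x
    for layer, choice in zip(supernet, arch):
        h = layer(h, choice)
    return h

for arch in [(0, 0, 0), (1, 0, 1)]:
    print(arch, run(arch).shape)
```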
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- FSpiNN: An Optimization Framework for Memory- and Energy-Efficient Spiking Neural Networks [14.916996986290902]
Spiking Neural Networks (SNNs) offer unsupervised learning capability due to the spike-timing-dependent plasticity (STDP) rule.
However, state-of-the-art SNNs require a large memory footprint to achieve high accuracy.
We propose FSpiNN, an optimization framework for obtaining memory- and energy-efficient SNNs for training and inference processing.
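Since this summary hinges on the STDP rule, the sketch below shows a pair-based STDP weight update; the amplitudes and time constants are illustrative assumptions, not values from FSpiNN.

```python
# Minimal sketch of a pair-based STDP synaptic update (illustrative constants).
import numpy as np

A_PLUS, A_MINUS = 0.01, 0.012      # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # trace decay constants (ms)

def stdp_update(w: float, t_pre: float, t_post: float) -> float:
    """Update one synapse from a single pre/post spike-time pair."""
    dt = t_post - t_pre
    if dt > 0:    # pre fires before post -> strengthen (LTP)
        dw = A_PLUS * np.exp(-dt / TAU_PLUS)
    else:         # post fires before pre -> weaken (LTD)
        dw = -A_MINUS * np.exp(dt / TAU_MINUS)
    return float(np.clip(w + dw, 0.0, 1.0))

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=14.0)   # causal pair: weight increases
w = stdp_update(w, t_pre=30.0, t_post=22.0)   # anti-causal pair: weight decreases
print(round(w, 4))
```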
arXiv Detail & Related papers (2020-07-17T09:40:26Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
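One common way to cope with the discontinuity of binarization mentioned above is a straight-through estimator (STE), sketched below; the module and initialization are illustrative assumptions rather than code from any surveyed method.

```python
# Minimal sketch of weight binarization with a straight-through estimator (STE).
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)                      # forward: {-1, +1} weights

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()  # STE: pass gradient where |w| <= 1

class BinaryLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.1)

    def forward(self, x):
        return x @ BinarizeSTE.apply(self.weight).t()

layer = BinaryLinear(16, 4)
y = layer(torch.randn(2, 16))
y.sum().backward()                                # gradients reach the real-valued weights
print(y.shape, layer.weight.grad.shape)
```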
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely low computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)