Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and
Implementation-Level Techniques
- URL: http://arxiv.org/abs/2112.15131v1
- Date: Thu, 30 Dec 2021 17:00:06 GMT
- Title: Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and
Implementation-Level Techniques
- Authors: JunKyu Lee, Lev Mukhanov, Amir Sabbagh Molahosseini, Umar Minhas, Yang
Hua, Jesus Martinez del Rincon, Kiril Dichev, Cheol-Ho Hong, Hans
Vandierendonck
- Abstract summary: Deep learning is pervasive in our daily life, powering self-driving cars, virtual assistants, social network services, healthcare services, and face recognition.
Deep neural networks demand substantial compute resources during training and inference.
This article provides a survey on resource-efficient deep learning techniques in terms of model-, arithmetic-, and implementation-level techniques.
- Score: 10.715525749057495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is pervasive in our daily life, powering
self-driving cars, virtual assistants, social network services, healthcare
services, and face recognition. However, deep neural networks demand substantial compute
resources during training and inference. The machine learning community has
mainly focused on model-level optimizations such as architectural compression
of deep learning models, while the system community has focused on
implementation-level optimization. In between, various arithmetic-level
optimization techniques have been proposed in the arithmetic community. This
article provides a survey on resource-efficient deep learning techniques in
terms of model-, arithmetic-, and implementation-level techniques and
identifies the research gaps across these three levels. Our survey clarifies
how higher-level techniques influence lower-level ones, based on our
definition of a resource-efficiency metric, and discusses future trends in
resource-efficient deep learning research. (An illustrative sketch of one
arithmetic-level technique follows below.)
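
To make the taxonomy concrete: below is a minimal sketch of one arithmetic-level technique the survey covers, uniform post-training quantization of a weight tensor to 8-bit integers. This is an illustrative NumPy example under assumed choices (per-tensor symmetric scaling, int8); it is not code from the paper.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    """Map a float32 tensor to int8 plus one float scale factor.

    Arithmetic-level optimization: low-precision storage and integer
    arithmetic in place of float32, at the cost of rounding error.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8
    scale = np.abs(w).max() / qmax              # per-tensor symmetric scale
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_uniform(w)
print("max abs rounding error:", np.abs(w - dequantize(q, s)).max())
```

With this scheme the worst-case rounding error is half a quantization step (scale / 2), typically small enough for inference; the arithmetic-level techniques surveyed in the paper cover far more refined schemes.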
Related papers
- Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances using generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models [33.50873478562128]
Large Language Models (LLMs) pose challenges through their high consumption of computational, memory, energy, and financial resources.
This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs.
arXiv Detail & Related papers (2024-01-01T01:12:42Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to prohibitive power consumption, latency, or carbon emissions in real-world scenarios.
A growing research focus is therefore computation-efficient deep learning, which strives to achieve satisfactory performance while minimizing computational cost during inference (one classic building block is sketched after this entry).
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
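
The entry above is about cutting inference cost in vision models. One classic building block in that literature is the depthwise-separable convolution, which factorizes a standard convolution into a per-channel spatial filter plus a 1x1 channel mixer. The back-of-the-envelope sketch below (plain Python, illustrative layer sizes) counts multiply-accumulate operations (MACs) to show where the savings come from; it is not code from the surveyed paper.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs of a standard k x k convolution over an h x w feature map."""
    return h * w * c_in * c_out * k * k

def separable_macs(h, w, c_in, c_out, k):
    """Depthwise k x k conv (one filter per channel) + 1 x 1 pointwise conv."""
    return h * w * c_in * k * k + h * w * c_in * c_out

std = conv_macs(56, 56, 128, 128, 3)        # example layer: 56x56 map, 128 -> 128
sep = separable_macs(56, 56, 128, 128, 3)
print(f"standard:  {std:,} MACs")
print(f"separable: {sep:,} MACs ({std / sep:.1f}x fewer)")
```

The reduction factor is roughly 1 / (1/c_out + 1/k^2), about 8.4x for this example, which is why the factorization anchors many efficient vision architectures in this literature.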
- Deep Active Learning with Structured Neural Depth Search [18.180995603975422]
Active-iNAS trains several models and selects the model with the best generalization performance for querying the subsequent samples after each active learning cycle.
We propose a novel active strategy using a method called structured variational inference (SVI), or structured neural depth search (SNDS).
We also theoretically demonstrate that current VI-based methods relying on the mean-field assumption can lead to poor performance (the generic query step such strategies build on is sketched after this entry).
arXiv Detail & Related papers (2023-06-05T12:00:12Z)
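
The SVI/SNDS method above is not specified in this summary, so no attempt is made to reproduce it; the sketch below shows only the generic query step that deep active learning strategies share: score the unlabeled pool by predictive uncertainty and request labels for the most uncertain samples. The function names, the entropy criterion, and the toy pool are all assumptions for illustration.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Per-sample entropy of class probabilities (higher = more uncertain)."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_queries(probs_unlabeled, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    return np.argsort(predictive_entropy(probs_unlabeled))[-budget:]

# Toy pool: softmax outputs of some model on 1000 unlabeled samples, 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print("query these indices next:", select_queries(probs, budget=5))
```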
- On Efficient Training of Large-Scale Deep Learning Models: A Literature Review [90.87691246153612]
The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.
The use of large-scale models trained on vast amounts of data holds immense promise for practical applications.
As demands on computational capacity grow, a comprehensive summary of techniques for accelerating the training of deep learning models remains much needed (one common technique, mixed-precision training, is sketched after this entry).
arXiv Detail & Related papers (2023-04-07T11:13:23Z)
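
As one concrete example of the training-acceleration techniques such a review covers: mixed-precision training, which runs most of the forward pass in float16 while the master weights stay in float32. The sketch below uses PyTorch's torch.cuda.amp API and assumes a CUDA GPU and a toy linear model; it is an illustrative sketch, not a technique attributed to this particular paper.

```python
import torch

model = torch.nn.Linear(512, 10).cuda()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler()      # scales the loss so fp16 grads don't underflow

x = torch.randn(64, 512, device="cuda")
y = torch.randint(0, 10, (64,), device="cuda")

opt.zero_grad()
with torch.cuda.amp.autocast():           # run eligible ops in float16
    loss = torch.nn.functional.cross_entropy(model(x), y)
scaler.scale(loss).backward()             # backward pass on the scaled loss
scaler.step(opt)                          # unscales grads, skips step on inf/nan
scaler.update()                           # adapts the scale factor for next step
```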
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches (one canonical hybrid, algorithm unrolling, is sketched after this entry).
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
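
A canonical instance of the hybrid techniques this survey is about is algorithm unrolling: the iterations of a classical solver become the layers of a network, and some of their parameters are learned from data. The NumPy sketch below unrolls gradient descent on a least-squares objective; the per-layer step sizes are fixed by hand here, whereas a model-based network would train them (and often the operators too). It is an illustrative sketch, not the paper's formulation.

```python
import numpy as np

def unrolled_least_squares(A, y, step_sizes):
    """K unrolled gradient steps on ||Ax - y||^2; each step acts as one 'layer'."""
    x = np.zeros(A.shape[1])
    for eta in step_sizes:                    # model-based part: the known operator A
        x = x - eta * (A.T @ (A @ x - y))     # learnable part: the step size eta
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 10))
y = A @ rng.normal(size=10)                   # synthetic consistent measurements
x_hat = unrolled_least_squares(A, y, step_sizes=[0.01] * 20)
print("residual norm:", np.linalg.norm(A @ x_hat - y))
```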
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level (a minimal spiking-neuron model is sketched after this entry).
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
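
To make the "neuron and synapse operational principles" above concrete, here is a minimal discrete-time leaky integrate-and-fire (LIF) neuron, the model most SNN hardware realizes in some form. The constants (leak factor, threshold) and the input are assumptions for the sketch; the point is the event-driven behavior: the neuron stays silent unless its membrane potential crosses threshold.

```python
import numpy as np

def lif_neuron(input_current, leak=0.9, threshold=1.0):
    """Discrete-time leaky integrate-and-fire neuron; returns a binary spike train."""
    v, spikes = 0.0, []
    for i in input_current:
        v = leak * v + i              # membrane potential leaks, then integrates input
        if v >= threshold:
            spikes.append(1)          # emit a spike (the only 'event' downstream sees)
            v = 0.0                   # reset after firing
        else:
            spikes.append(0)
    return np.array(spikes)

rng = np.random.default_rng(0)
train = lif_neuron(rng.uniform(0.0, 0.4, size=60))
print("spike train:", "".join(map(str, train)))
```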
- Hyper-Parameter Optimization: A Review of Algorithms and Applications [14.524227656147968]
This paper provides a review of the most essential topics in automated hyperparameter optimization (HPO).
The review focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks.
It concludes with open problems in applying HPO to deep learning, a comparison of optimization algorithms, and prominent approaches for model evaluation under limited computational resources (the simplest baseline, random search, is sketched after this entry).
arXiv Detail & Related papers (2020-03-12T10:12:22Z)
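
As a baseline for the HPO algorithms the review compares: random search, the simplest method that still works surprisingly well in practice. The objective below is a hypothetical stand-in for a real validation run; the search space and all names are assumptions for illustration.

```python
import numpy as np

def random_search(objective, space, n_trials, seed=0):
    """Sample configurations uniformly from `space`; return the best one found."""
    rng = np.random.default_rng(seed)
    best_cfg, best_val = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        val = objective(cfg)          # in practice: train a model, measure val loss
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

def fake_val_loss(cfg):               # hypothetical stand-in for a training run
    return (cfg["log_lr"] + 3.0) ** 2 + 0.1 * cfg["dropout"]

space = {"log_lr": (-6.0, -1.0), "dropout": (0.0, 0.5)}
best, loss = random_search(fake_val_loss, space, n_trials=100)
print(f"best config: {best}, val loss: {loss:.4f}")
```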
- Resource-Efficient Neural Networks for Embedded Systems [23.532396005466627]
We provide an overview of the current state of the art of machine learning techniques.
We focus on resource-efficient inference based on deep neural networks (DNNs), the predominant machine learning models of the past decade.
We substantiate our discussion with experiments on well-known benchmark data sets using compression techniques (one standard technique, magnitude pruning, is sketched after this entry).
arXiv Detail & Related papers (2020-01-07T14:17:09Z)
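
One standard compression technique in such experiments is magnitude pruning: zero out the smallest-magnitude weights so the survivors can be stored in a sparse format and the skipped multiplications saved at inference time. A minimal NumPy sketch, assuming unstructured per-tensor pruning (illustrative, not this paper's exact setup):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero the `sparsity` fraction of weights with the smallest magnitudes."""
    k = int(sparsity * w.size)
    threshold = np.partition(np.abs(w), k, axis=None)[k]   # k-th smallest |w|
    mask = np.abs(w) >= threshold
    return w * mask, mask

w = np.random.randn(512, 512).astype(np.float32)
pruned, mask = magnitude_prune(w, sparsity=0.9)
print(f"kept {mask.mean():.1%} of weights")               # roughly 10%
```

In practice pruning is followed by fine-tuning to recover accuracy, and structured variants (whole channels or blocks) map better onto embedded hardware than the unstructured sketch above.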