Resource-Efficient Neural Networks for Embedded Systems
- URL: http://arxiv.org/abs/2001.03048v3
- Date: Sun, 7 Apr 2024 21:04:52 GMT
- Title: Resource-Efficient Neural Networks for Embedded Systems
- Authors: Wolfgang Roth, Günther Schindler, Bernhard Klein, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani
- Abstract summary: We provide an overview of the current state of the art of machine learning techniques for resource-efficient inference.
We focus on resource-efficient inference based on deep neural networks (DNNs), the predominant machine learning models of the past decade.
We substantiate our discussion with experiments on well-known benchmark data sets using compression techniques.
- Score: 23.532396005466627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While machine learning is traditionally a resource-intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches. These approaches aim for a carefully chosen trade-off between performance and resource consumption in terms of computation and energy. The development of such approaches is among the major challenges in current machine learning research and key to ensuring a smooth transition of machine learning technology from a scientific environment with virtually unlimited computing resources into everyday applications. In this article, we provide an overview of the current state of the art of machine learning techniques facilitating these real-world requirements. In particular, we focus on resource-efficient inference based on deep neural networks (DNNs), the predominant machine learning models of the past decade. We give a comprehensive overview of the vast literature that can be mainly split into three non-mutually exclusive categories: (i) quantized neural networks, (ii) network pruning, and (iii) structural efficiency. These techniques can be applied during training or as post-processing, and they are widely used to reduce the computational demands in terms of memory footprint, inference speed, and energy efficiency. We also briefly discuss different concepts of embedded hardware for DNNs and their compatibility with machine learning techniques, as well as their potential for energy and latency reduction. We substantiate our discussion with experiments on well-known benchmark data sets using compression techniques (quantization, pruning) for a set of resource-constrained embedded systems, such as CPUs, GPUs, and FPGAs. The obtained results highlight the difficulty of finding good trade-offs between resource efficiency and prediction quality.
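Two of the abstract's three categories, quantization and pruning, can be illustrated with a minimal NumPy sketch. This is a toy illustration under simplifying assumptions, not the paper's actual implementation: `quantize_uniform` maps weights onto a signed 8-bit grid (post-training uniform quantization), and `prune_magnitude` zeroes the smallest-magnitude weights (unstructured magnitude pruning). All function names and parameters here are hypothetical.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    """Uniformly quantize a weight tensor to a signed num_bits integer grid."""
    qmax = 2 ** (num_bits - 1) - 1           # e.g. 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax         # map the largest weight to qmax
    q = np.round(w / scale).astype(np.int8)  # integer codes stored on device
    return q, scale                          # dequantize with q * scale

def prune_magnitude(w, sparsity=0.9):
    """Zero out the smallest-magnitude fraction of weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

q, scale = quantize_uniform(w)
w_hat = q.astype(np.float32) * scale         # reconstruction used at inference
pruned = prune_magnitude(w)

print("max quantization error:", np.max(np.abs(w - w_hat)))
print("fraction of zeros after pruning:", np.mean(pruned == 0.0))
```

At 8 bits the reconstruction error per weight is bounded by half the quantization step; in practice these techniques are combined with retraining or calibration to recover accuracy, which this sketch omits.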
Related papers
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- Temporal Patience: Efficient Adaptive Deep Learning for Embedded Radar Data Processing [4.359030177348051]
This paper presents novel techniques that leverage the temporal correlation present in streaming radar data to enhance the efficiency of Early Exit Neural Networks for Deep Learning inference on embedded devices.
Our results demonstrate that our techniques save up to 26% of operations per inference over a Single Exit Network and 12% over a confidence-based Early Exit version.
Such efficiency gains enable real-time radar data processing on resource-constrained platforms, allowing for new applications in the context of smart homes, Internet-of-Things, and human-computer interaction.
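The confidence-based early-exit baseline this paper improves upon can be sketched as follows. This is a hypothetical stand-in, not the paper's method: `exit_heads` represents classifiers attached to intermediate layers, here faked with random linear heads, and the threshold value is illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_inference(x, exit_heads, threshold=0.9):
    """Run exit heads in order; stop at the first confident prediction.

    `exit_heads` is a list of callables mapping features to class logits;
    in a real network each head sits after an intermediate layer.
    """
    for depth, head in enumerate(exit_heads):
        probs = softmax(head(x))
        if probs.max() >= threshold:         # confident enough: exit early
            return int(np.argmax(probs)), depth
    return int(np.argmax(probs)), depth      # fall through to the last exit

# Toy stand-ins for trained exit classifiers (random linear heads).
rng = np.random.default_rng(1)
heads = [lambda x, W=rng.normal(size=(4, 8)): W @ x for _ in range(3)]
x = rng.normal(size=8)
label, exit_used = early_exit_inference(x, heads)
print("predicted class:", label, "exited at head:", exit_used)
```

The temporal-patience idea in the paper goes further by exploiting correlation between consecutive radar frames to decide when a deeper exit is unnecessary; that logic is not shown here.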
arXiv Detail & Related papers (2023-09-11T12:38:01Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computation-efficient deep learning, which strives to achieve satisfactory performance while minimizing the computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- Learnability with Time-Sharing Computational Resource Concerns [65.268245109828]
We present a theoretical framework that takes into account the influence of computational resources in learning theory.
This framework can be naturally applied to stream learning where the incoming data streams can be potentially endless.
It may also provide a theoretical perspective for the design of intelligent supercomputing operating systems.
arXiv Detail & Related papers (2023-05-03T15:54:23Z)
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications [46.97774949613859]
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI).
However, their superior performance comes at the considerable cost of computational complexity.
This paper provides an overview of efficient deep learning methods, systems and applications.
arXiv Detail & Related papers (2022-04-25T16:52:48Z)
- Benchmarking Resource Usage for Efficient Distributed Deep Learning [10.869092085691687]
We conduct over 3,400 experiments training an array of deep networks representing various domains/tasks.
We fit power law models that describe how training time scales with available compute resources and energy constraints.
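Fitting such a power-law model of training time versus available compute can be done by linear regression in log-log space. The measurements below are invented for illustration; the fitted exponent is not from the paper.

```python
import numpy as np

# Hypothetical measurements: training time (hours) vs. number of GPUs.
gpus = np.array([1, 2, 4, 8, 16], dtype=float)
hours = np.array([100.0, 55.0, 30.0, 17.0, 10.0])

# Fit t = a * n^b by linear regression in log-log space:
# log t = log a + b * log n
b, log_a = np.polyfit(np.log(gpus), np.log(hours), deg=1)
a = np.exp(log_a)
print(f"t ~ {a:.1f} * n^{b:.2f}")  # b near -1 would indicate ideal scaling
```

An exponent closer to zero than -1 reflects the diminishing returns of adding compute, which is the kind of trade-off these benchmarking studies quantify.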
arXiv Detail & Related papers (2022-01-28T21:24:15Z)
- Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques [10.715525749057495]
Deep learning is pervasive in our daily life, including self-driving cars, virtual assistants, social network services, healthcare services, face recognition, etc.
Deep neural networks demand substantial compute resources during training and inference.
This article provides a survey on resource-efficient deep learning techniques in terms of model-, arithmetic-, and implementation-level techniques.
arXiv Detail & Related papers (2021-12-30T17:00:06Z)
- Machine Learning for Massive Industrial Internet of Things [69.52379407906017]
The Industrial Internet of Things (IIoT) is set to revolutionize future manufacturing facilities by integrating Internet of Things technologies into industrial settings.
With the deployment of massive numbers of IIoT devices, it is difficult for the wireless network to support ubiquitous connections with diverse quality-of-service (QoS) requirements.
We first summarize the requirements of typical massive non-critical and critical IIoT use cases. We then identify unique characteristics of the massive IIoT scenario and the corresponding machine learning solutions, along with their limitations and potential research directions.
arXiv Detail & Related papers (2021-03-10T20:10:53Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms mimicking the operational principles of neurons and synapses.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address the open problems in this area, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.