Communication-Computation Trade-Off in Resource-Constrained Edge
Inference
- URL: http://arxiv.org/abs/2006.02166v2
- Date: Wed, 14 Oct 2020 11:54:28 GMT
- Title: Communication-Computation Trade-Off in Resource-Constrained Edge
Inference
- Authors: Jiawei Shao, Jun Zhang
- Abstract summary: This article presents effective methods for edge inference at resource-constrained devices.
It focuses on device-edge co-inference, assisted by an edge computing server.
A three-step framework is proposed for the effective inference.
- Score: 5.635540684037595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent breakthrough in artificial intelligence (AI), especially deep
neural networks (DNNs), has affected every branch of science and technology.
Particularly, edge AI has been envisioned as a major application scenario to
provide DNN-based services at edge devices. This article presents effective
methods for edge inference at resource-constrained devices. It focuses on
device-edge co-inference, assisted by an edge computing server, and
investigates a critical trade-off among the computation cost of the on-device
model and the communication cost of forwarding the intermediate feature to the
edge server. A three-step framework is proposed for the effective inference:
(1) model split point selection to determine the on-device model, (2)
communication-aware model compression to reduce the on-device computation and
the resulting communication overhead simultaneously, and (3) task-oriented
encoding of the intermediate feature to further reduce the communication
overhead. Experiments demonstrate that our proposed framework achieves a better
trade-off and significantly lower inference latency than baseline methods.
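The trade-off behind step (1) can be sketched in a few lines: profile, for each candidate split point, the on-device compute of the layers kept locally and the size of the intermediate feature that would be transmitted, then pick the split minimizing total latency. The layer profiles, device speed, and uplink rate below are illustrative assumptions, not numbers from the paper.

```python
# Illustrative sketch of split-point selection for device-edge
# co-inference: run layers[:k] on-device, then transmit the intermediate
# feature of layer k (or the raw input if k == 0) to the edge server.
# All profiles (FLOPs, feature sizes, device speed, uplink rate) are
# hypothetical.

def best_split(input_bytes, layers, device_flops_per_s, uplink_bits_per_s):
    """Return (k, latency_s) minimizing on-device compute + uplink time."""
    best_k, best_latency = None, float("inf")
    for k in range(len(layers) + 1):
        compute_s = sum(l["flops"] for l in layers[:k]) / device_flops_per_s
        tx_bytes = layers[k - 1]["out_bytes"] if k > 0 else input_bytes
        comm_s = 8 * tx_bytes / uplink_bits_per_s
        if compute_s + comm_s < best_latency:
            best_k, best_latency = k, compute_s + comm_s
    return best_k, best_latency

# Hypothetical per-layer profile: early layers are cheap but emit large
# features; deeper layers shrink the feature at a higher compute cost.
layers = [
    {"flops": 1e8, "out_bytes": 500_000},
    {"flops": 1e8, "out_bytes": 50_000},
    {"flops": 5e8, "out_bytes": 10_000},
]
k, latency = best_split(1_000_000, layers, device_flops_per_s=1e9,
                        uplink_bits_per_s=10e6)
```

With these hypothetical numbers, splitting after the second layer wins: it pays 0.2 s of on-device compute to shrink the transmitted feature by 20x, which is exactly the balance steps (2) and (3) then push further via compression and task-oriented encoding.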
Related papers
- Heterogeneity-Aware Resource Allocation and Topology Design for Hierarchical Federated Edge Learning [9.900317349372383]
Federated Learning (FL) provides a privacy-preserving framework for training machine learning models on mobile edge devices.
Traditional FL algorithms, e.g., FedAvg, impose a heavy communication workload on these devices.
We propose a two-tier HFEL system, where edge devices are connected to edge servers and edge servers are interconnected through peer-to-peer (P2P) edge backhauls.
Our goal is to enhance the training efficiency of the HFEL system through strategic resource allocation and topology design.
arXiv Detail & Related papers (2024-09-29T01:48:04Z) - Robust Communication and Computation using Deep Learning via Joint Uncertainty Injection [15.684142238738797]
The convergence of communication and computation, along with the integration of machine learning and artificial intelligence, stands as a key empowering pillar for sixth-generation (6G) communication systems.
This paper considers a network of one base station serving a number of devices simultaneously using spatial multiplexing.
The paper then presents an innovative deep learning-based approach to simultaneously manage the transmit and computing powers, alongside allocation, amidst uncertainties in both channel and computing states information.
arXiv Detail & Related papers (2024-06-05T18:00:09Z) - Adaptive Early Exiting for Collaborative Inference over Noisy Wireless
Channels [17.890390892890057]
Collaborative inference systems are one of the emerging solutions for deploying deep neural networks (DNNs) at the wireless network edge.
In this work, we study early exiting in the context of collaborative inference, which allows obtaining inference results at the edge device for certain samples.
The central part of our system is the transmission-decision (TD) mechanism, which decides whether to keep the early exit prediction or transmit the data to the edge server for further processing.
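A simple instantiation of such a transmission-decision rule is a confidence threshold on the early-exit classifier's softmax output: confident samples keep the local prediction, uncertain ones are offloaded. The threshold and logits below are illustrative assumptions, not the paper's actual mechanism.

```python
# Hedged sketch of a confidence-based transmission-decision (TD) rule
# for collaborative inference with early exits. The 0.9 threshold is an
# illustrative value, not taken from the paper.
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def transmission_decision(early_exit_logits, threshold=0.9):
    """Return ('local', label) if the early exit is confident enough,
    otherwise ('offload', None) to send the sample to the edge server."""
    probs = softmax(early_exit_logits)
    conf = max(probs)
    if conf >= threshold:
        return "local", probs.index(conf)
    return "offload", None
```

In this sketch only low-confidence samples pay the cost of the noisy wireless channel, which is the communication saving the early-exit design targets.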
arXiv Detail & Related papers (2023-11-29T21:31:59Z) - Task-Oriented Sensing, Computation, and Communication Integration for
Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits AI model split inference and integrated sensing and communication (ISAC).
We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete actions (SAC-d), which generates the exit point, offloading point, and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a scheme can adapt well to complex environments, such as dynamic wireless channels and arbitrary processing loads, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Communication-Efficient Separable Neural Network for Distributed
Inference on Edge Devices [2.28438857884398]
We propose a novel method of exploiting model parallelism to separate a neural network for distributed inferences.
Under proper specifications of devices and configurations of models, our experiments show that the inference of large neural networks on edge clusters can be distributed and accelerated.
arXiv Detail & Related papers (2021-11-03T19:30:28Z) - Computational Intelligence and Deep Learning for Next-Generation
Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z) - Communication-Computation Efficient Device-Edge Co-Inference via AutoML [4.06604174802643]
Device-edge co-inference partitions a deep neural network between a resource-constrained mobile device and an edge server.
On-device model sparsity level and intermediate feature compression ratio have direct impacts on workload and communication overhead.
We propose a novel automated machine learning (AutoML) framework based on deep reinforcement learning (DRL).
arXiv Detail & Related papers (2021-08-30T06:36:30Z) - Accelerating Federated Edge Learning via Optimized Probabilistic Device
Scheduling [57.271494741212166]
This paper formulates and solves the communication time minimization problem.
It is found that the optimized policy gradually turns its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves.
The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.
arXiv Detail & Related papers (2021-07-24T11:39:17Z) - Towards AIOps in Edge Computing Environments [60.27785717687999]
This paper describes the system design of an AIOps platform which is applicable in heterogeneous, distributed environments.
It is feasible to collect metrics with a high frequency and simultaneously run specific anomaly detection algorithms directly on edge devices.
arXiv Detail & Related papers (2021-02-12T09:33:00Z) - Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with
Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS).
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.