Adaptive Early Exiting for Collaborative Inference over Noisy Wireless Channels
- URL: http://arxiv.org/abs/2311.18098v1
- Date: Wed, 29 Nov 2023 21:31:59 GMT
- Title: Adaptive Early Exiting for Collaborative Inference over Noisy Wireless Channels
- Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk
- Abstract summary: Collaborative inference systems are one of the emerging solutions for deploying deep neural networks (DNNs) at the wireless network edge.
In this work, we study early exiting in the context of collaborative inference, which allows inference results to be obtained at the edge device for certain samples.
The central part of our system is the transmission-decision (TD) mechanism, which decides whether to keep the early exit prediction or transmit the data to the edge server for further processing.
- Score: 17.890390892890057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collaborative inference systems are one of the emerging solutions for
deploying deep neural networks (DNNs) at the wireless network edge. Their main
idea is to divide a DNN into two parts, where the first is shallow enough to be
reliably executed at edge devices of limited computational power, while the
second part is executed at an edge server with higher computational
capabilities. The main advantage of such systems is that the input of the DNN
gets compressed as the subsequent layers of the shallow part extract only the
information necessary for the task. As a result, significant communication
savings can be achieved compared to transmitting raw input samples. In this
work, we study early exiting in the context of collaborative inference, which
allows inference results to be obtained at the edge device for certain samples,
without transmitting the partially processed data to the edge server at all,
leading to further communication savings. The central part of our system is the
transmission-decision (TD) mechanism, which, given the information from the
early exit and the wireless channel conditions, decides whether to keep the
early exit prediction or transmit the data to the edge server for further
processing. In this paper, we evaluate various TD mechanisms and show
experimentally that, for an image classification task over the wireless edge,
proper utilization of early exits can provide both performance gains and
significant communication savings.
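
The abstract does not fix a particular decision rule, but the pipeline it describes (a shallow device-side network with an early-exit classifier, a TD step, and a deeper server-side network reached over a noisy channel) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names (`device_model`, `exit_head`, `server_model`), the maximum-softmax confidence measure, the SNR threshold, and the additive-Gaussian channel model are all assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def collaborative_inference(x, device_model, exit_head, server_model,
                            channel_snr_db, confidence_threshold=0.9,
                            snr_threshold_db=5.0):
    """Illustrative split inference with an early exit and a simple TD rule.

    device_model -- shallow, device-side part of the DNN (feature extractor)
    exit_head    -- lightweight classifier attached to the device-side output
    server_model -- deeper, server-side part of the DNN
    """
    # Device-side computation: a compressed feature representation of x.
    features = device_model(x)

    # Early-exit prediction and its confidence (maximum softmax probability).
    exit_logits = exit_head(features)
    confidence = F.softmax(exit_logits, dim=-1).max(dim=-1).values

    # TD rule (illustrative): if the channel is too poor to justify
    # transmitting the features, keep every early-exit prediction.
    if channel_snr_db < snr_threshold_db:
        return exit_logits.argmax(dim=-1)

    # Otherwise keep only the confident early-exit predictions locally.
    keep_local = confidence >= confidence_threshold

    # Transmit the features over the noisy channel; additive Gaussian noise
    # stands in for the wireless link here (a real system would apply joint
    # source-channel coding and transmit only the undecided samples).
    noise_std = 10.0 ** (-channel_snr_db / 20.0)
    server_logits = server_model(features + noise_std * torch.randn_like(features))

    # Final prediction: early exit where it was kept, server output elsewhere.
    return torch.where(keep_local,
                       exit_logits.argmax(dim=-1),
                       server_logits.argmax(dim=-1))
```

Under such a rule, raising the confidence threshold sends more samples to the server (typically higher accuracy at a higher communication cost), while the SNR threshold keeps predictions local when transmission is unlikely to help.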
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- Edge-device Collaborative Computing for Multi-view Classification [9.047284788663776]
We explore collaborative inference at the edge, in which edge nodes and end devices share correlated data and the inference computational burden.
We introduce selective schemes that decrease bandwidth resource consumption by effectively reducing data redundancy.
Experimental results highlight that selective collaborative schemes can achieve different trade-offs between the above performance metrics.
arXiv Detail & Related papers (2024-09-24T11:07:33Z)
- Neuromorphic Split Computing with Wake-Up Radios: Architecture and Design via Digital Twinning [97.99077847606624]
This work proposes a novel architecture that integrates a wake-up radio mechanism within a split computing system consisting of remote, wirelessly connected, NPUs.
A key challenge in the design of a wake-up radio-based neuromorphic split computing system is the selection of thresholds for sensing, wake-up signal detection, and decision making.
arXiv Detail & Related papers (2024-04-02T10:19:04Z)
- Neuromorphic Wireless Device-Edge Co-Inference via the Directed Information Bottleneck [40.44060856946713]
In device-edge co-inference, a semantic task is partitioned between a device and an edge server.
We introduce a new system solution, termed neuromorphic wireless device-edge co-inference.
The proposed system is designed using a transmitter-centric information-theoretic criterion that targets a reduction of the communication overhead.
arXiv Detail & Related papers (2024-04-02T10:06:21Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- Bandwidth-efficient distributed neural network architectures with application to body sensor networks [73.02174868813475]
This paper describes a conceptual design methodology to design distributed neural network architectures.
We show that the proposed framework enables up to a factor 20 in bandwidth reduction with minimal loss.
While the application focus of this paper is on wearable brain-computer interfaces, the proposed methodology can be applied in other sensor network-like applications as well.
arXiv Detail & Related papers (2022-10-14T12:35:32Z)
- Dynamic Split Computing for Efficient Deep Edge Intelligence [78.4233915447056]
We introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel.
We show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time. A toy latency-based split-selection rule is sketched after this list.
arXiv Detail & Related papers (2022-05-23T12:35:18Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks [8.291242737118482]
We focus on edge computing supporting remote object detection by means of Deep Neural Networks (DNNs).
We develop a framework to reduce the amount of data transmitted over the wireless link.
The proposed technique represents an effective intermediate option between local and edge computing in a parameter region.
arXiv Detail & Related papers (2020-07-31T03:11:46Z)
- Communication-Computation Trade-Off in Resource-Constrained Edge Inference [5.635540684037595]
This article presents effective methods for edge inference at resource-constrained devices.
It focuses on device-edge co-inference, assisted by an edge computing server.
A three-step framework is proposed for the effective inference.
arXiv Detail & Related papers (2020-06-03T11:00:32Z)
- Joint Device-Edge Inference over Wireless Links with Pruning [20.45405359815043]
We propose a joint feature compression and transmission scheme for efficient inference at the wireless network edge.
This is the first work that combines DeepJSCC with network pruning, and applies it to image classification over the wireless edge.
arXiv Detail & Related papers (2020-03-04T12:06:11Z)
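
As noted in the "Dynamic Split Computing" entry above, the split point can be chosen adaptively from the channel state. A toy version of such a selection rule is sketched below; the latency model and every parameter name are illustrative assumptions, not the formulation used in that paper.

```python
def select_split_point(device_time_s, server_time_s, feature_bits,
                       data_rate_bps, server_load_factor=1.0):
    """Toy dynamic split selection: pick the candidate split whose estimated
    end-to-end latency (device compute + feature upload + loaded server
    compute) is lowest. All inputs are per-candidate lists / scalars and are
    illustrative, not quantities defined by the cited paper.
    """
    best_idx, best_latency = None, float("inf")
    for i, (t_dev, t_srv, bits) in enumerate(
            zip(device_time_s, server_time_s, feature_bits)):
        latency = t_dev + bits / data_rate_bps + t_srv * server_load_factor
        if latency < best_latency:
            best_idx, best_latency = i, latency
    return best_idx, best_latency


# Example: three candidate splits, a 2 Mbit/s link, and a lightly loaded server.
idx, latency = select_split_point(
    [0.002, 0.005, 0.010],   # device-side compute time per split (s)
    [0.020, 0.012, 0.004],   # server-side compute time per split (s)
    [4e6, 1e6, 2e5],         # feature size per split (bits)
    data_rate_bps=2e6,
    server_load_factor=1.2)  # -> picks the deepest split for this slow link
```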