Attention-based Feature Compression for CNN Inference Offloading in Edge
Computing
- URL: http://arxiv.org/abs/2211.13745v1
- Date: Thu, 24 Nov 2022 18:10:01 GMT
- Title: Attention-based Feature Compression for CNN Inference Offloading in Edge
Computing
- Authors: Nan Li, Alexandros Iosifidis, Qi Zhang
- Abstract summary: This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
- Score: 93.67044879636093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the computational offloading of CNN inference in
device-edge co-inference systems. Inspired by the emerging paradigm of semantic
communication, we propose a novel autoencoder-based CNN architecture (AECNN)
for effective feature extraction at the end device. We design a feature
compression module based on the channel attention mechanism in CNNs to compress
the intermediate data by selecting the most important features. To further
reduce communication overhead, entropy encoding can be applied to remove the
statistical redundancy in the compressed data. At the receiver, we design a
lightweight decoder that reconstructs the intermediate data by learning from
the received compressed data, which improves accuracy. To speed up convergence,
we use a step-by-step approach to train the neural networks derived from the
ResNet-50 architecture. Experimental results show that AECNN can compress the
intermediate data by more than 256x with only about 4% accuracy loss,
outperforming the state-of-the-art work BottleNet++. Compared to offloading the
inference task directly to the edge server, AECNN completes the inference task
earlier, particularly under poor wireless channel conditions, which highlights
its effectiveness in guaranteeing higher accuracy within a time constraint.
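As a rough illustration of the pipeline the abstract describes, the PyTorch sketch below pairs an SE-style channel-attention module, which keeps only the highest-scoring channels of an intermediate feature map, with a lightweight 1x1-convolution decoder at the edge. This is a minimal sketch, not the authors' implementation: the gating layout, the fixed keep ratio, and the module names (ChannelAttentionCompressor, LightweightDecoder) are assumptions made for the example.

```python
# Minimal sketch of attention-based feature compression for co-inference.
# Assumptions (not from the paper): SE-style gating, a fixed keep ratio,
# and a single 1x1-conv decoder; all names are illustrative only.
import torch
import torch.nn as nn


class ChannelAttentionCompressor(nn.Module):
    """Scores the channels of an intermediate CNN feature map and keeps
    only the top-k most important ones before transmission."""

    def __init__(self, num_channels: int, keep_ratio: float = 0.25):
        super().__init__()
        self.keep = max(1, int(num_channels * keep_ratio))
        # Squeeze-and-excitation style gating learns per-channel importance.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(num_channels, num_channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(num_channels // 4, num_channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor):
        # x: (B, C, H, W) feature map produced by the on-device layers.
        scores = self.gate(x)                         # (B, C) channel importance
        kept = scores.topk(self.keep, dim=1).indices  # indices of kept channels
        idx = kept[:, :, None, None].expand(-1, -1, x.size(2), x.size(3))
        compressed = torch.gather(x * scores[:, :, None, None], 1, idx)
        return compressed, kept                       # both are sent to the edge


class LightweightDecoder(nn.Module):
    """Expands the received channels back to the full-width feature map."""

    def __init__(self, kept_channels: int, num_channels: int):
        super().__init__()
        self.expand = nn.Conv2d(kept_channels, num_channels, kernel_size=1)

    def forward(self, compressed: torch.Tensor) -> torch.Tensor:
        return self.expand(compressed)


if __name__ == "__main__":
    feat = torch.randn(1, 256, 28, 28)           # e.g. a ResNet-50 stage output
    enc = ChannelAttentionCompressor(256, keep_ratio=0.25)
    dec = LightweightDecoder(enc.keep, 256)
    z, kept = enc(feat)                          # (1, 64, 28, 28) to transmit
    recon = dec(z)                               # (1, 256, 28, 28) at the edge
    print(z.shape, recon.shape)
```

In such a setup the device would run the early ResNet-50 layers plus the compressor, transmit the reduced tensor, and the edge server would reconstruct the full-width feature map before executing the remaining layers; the 4x channel reduction shown here is only a small fraction of the more-than-256x overall compression reported in the paper.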
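The abstract also notes that entropy encoding can remove statistical redundancy from the compressed features. The exact coder is not specified, so the sketch below simply quantizes the transmitted tensor to 8 bits and applies zlib (DEFLATE, i.e. LZ77 plus Huffman coding) to illustrate the idea; the function name and bit width are assumptions for the example.

```python
# Illustration only: uniform 8-bit quantization followed by a general-purpose
# entropy coder (zlib/DEFLATE) stands in for whatever coder the paper uses.
import zlib
import torch


def quantize_and_pack(z: torch.Tensor, bits: int = 8) -> bytes:
    """Uniformly quantize a feature tensor and entropy-code the bytes."""
    z_min, z_max = z.min(), z.max()
    scale = (z_max - z_min) / (2 ** bits - 1) + 1e-8
    q = ((z - z_min) / scale).round().clamp(0, 2 ** bits - 1).to(torch.uint8)
    return zlib.compress(q.numpy().tobytes(), level=9)


if __name__ == "__main__":
    z = torch.relu(torch.randn(1, 64, 28, 28))   # ReLU makes the features sparse
    payload = quantize_and_pack(z)
    print(f"float32 size: {z.numel() * 4} B, coded size: {len(payload)} B")
```

Sparse, ReLU-activated feature maps contain many repeated byte values after quantization, which is the kind of statistical redundancy such a coder removes.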
Related papers
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge
Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
arXiv Detail & Related papers (2024-01-19T15:19:47Z) - Spatiotemporal Attention-based Semantic Compression for Real-time Video
Recognition [117.98023585449808]
We propose a spatiotemporal attention-based autoencoder (STAE) architecture to evaluate the importance of frames and of pixels within each frame.
We develop a lightweight decoder that leverages a combined 3D-2D CNN to reconstruct the missing information.
Experimental results show that ViT_STAE can compress the HMDB51 video dataset by 104x with only 5% accuracy loss.
arXiv Detail & Related papers (2023-05-22T07:47:27Z) - Semantic Communication Enabling Robust Edge Intelligence for
Time-Critical IoT Applications [87.05763097471487]
This paper aims to design robust Edge Intelligence using semantic communication for time-critical IoT applications.
We analyze the effect of image DCT coefficients on inference accuracy and propose a channel-agnostic effectiveness encoding for offloading.
arXiv Detail & Related papers (2022-11-24T20:13:17Z) - Receptive Field-based Segmentation for Distributed CNN Inference
Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing scheme that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point, partition point, and compressing bits through soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation framework adapts well to complex environments with dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded
Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z) - EffCNet: An Efficient CondenseNet for Image Classification on NXP
BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware, and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z) - Toward Compact Parameter Representations for Architecture-Agnostic
Neural Network Compression [26.501979992447605]
This paper investigates compression from the perspective of compactly representing and storing trained parameters.
We leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters.
We conduct experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks.
arXiv Detail & Related papers (2021-11-19T17:03:11Z)