Feature Compression for Rate Constrained Object Detection on the Edge
- URL: http://arxiv.org/abs/2204.07314v1
- Date: Fri, 15 Apr 2022 03:39:30 GMT
- Title: Feature Compression for Rate Constrained Object Detection on the Edge
- Authors: Zhongzheng Yuan, Samyak Rawlekar, Siddharth Garg, Elza Erkip, Yao Wang
- Abstract summary: An emerging approach to solve this problem is to offload the computation of neural networks to computing resources at an edge server.
In this work, we consider a "split computation" system to offload a part of the computation of the YOLO object detection model.
We train the feature compression and decompression module together with the YOLO model to optimize the object detection accuracy under a rate constraint.
- Score: 20.18227104333772
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in computer vision have led to a growth of interest in
deploying visual analytics models on mobile devices. However, most mobile
devices have limited computing power, which prohibits them from running large
scale visual analytics neural networks. An emerging approach to solve this
problem is to offload the computation of these neural networks to computing
resources at an edge server. Efficient computation offloading requires
optimizing the trade-off between multiple objectives including compressed data
rate, analytics performance, and computation speed. In this work, we consider a
"split computation" system to offload a part of the computation of the YOLO
object detection model. We propose a learnable feature compression approach to
compress the intermediate YOLO features with light-weight computation. We train
the feature compression and decompression module together with the YOLO model
to optimize the object detection accuracy under a rate constraint. Compared to
baseline methods that apply either standard image compression or learned image
compression at the mobile and perform image decompression and YOLO at the edge,
the proposed system achieves higher detection accuracy at the low to medium
rate range. Furthermore, the proposed system requires substantially lower
computation time on the mobile device with CPU only.
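The abstract does not state the training objective. One common way to express "accuracy under a rate constraint" is a Lagrangian that adds a weighted bit-rate term to the task loss. The sketch below illustrates that idea only: the uniform quantizer, the entropy-based rate estimate, and the names `quantize`, `empirical_rate_bits`, `rate_constrained_loss`, and `lam` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def quantize(features, step):
    """Uniformly quantize floats to integer symbols (hypothetical compressor stage)."""
    return [round(x / step) for x in features]


def dequantize(symbols, step):
    """Map integer symbols back to approximate feature values (edge-side stage)."""
    return [s * step for s in symbols]


def empirical_rate_bits(symbols):
    """Estimate the total bits needed, using the empirical entropy of the symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy * n


def rate_constrained_loss(task_loss, rate_bits, lam):
    """Assumed Lagrangian form: task loss plus lam times the estimated rate."""
    return task_loss + lam * rate_bits
```

Under this assumed form, a larger `lam` penalizes bits more heavily and pushes training toward the low-rate regime, while a smaller `lam` favors detection accuracy.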
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Dynamic Semantic Compression for CNN Inference in Multi-access Edge
Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, autoencoder-based CNN architecture (AECNN) for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
arXiv Detail & Related papers (2024-01-19T15:19:47Z) - Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource
Constrained IoT Systems [12.427821850039448]
We propose a novel split computing approach based on slimmable ensemble encoders.
The key advantage of our design is the ability to adapt computational load and transmitted data size in real-time with minimal overhead and time.
Our model outperforms existing solutions in terms of compression efficacy and execution time, especially in the context of weak mobile devices.
arXiv Detail & Related papers (2023-06-22T06:33:12Z) - Spatiotemporal Attention-based Semantic Compression for Real-time Video
Recognition [117.98023585449808]
We propose a spatiotemporal attention-based autoencoder (STAE) architecture to evaluate the importance of frames and pixels in each frame.
We develop a lightweight decoder that leverages a combined 3D-2D CNN to reconstruct missing information.
Experimental results show that ViT_STAE can compress the video dataset HMDB51 by 104x with only 5% accuracy loss.
arXiv Detail & Related papers (2023-05-22T07:47:27Z) - Pushing the Limits of Asynchronous Graph-based Object Detection with
Event Cameras [62.70541164894224]
We introduce several architecture choices which allow us to scale the depth and complexity of such models while maintaining low computation.
Our method runs 3.7 times faster than a dense graph neural network, taking only 8.4 ms per forward pass.
arXiv Detail & Related papers (2022-11-22T15:14:20Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have a large number of parameters and require heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z) - Accelerating Deep Learning Applications in Space [0.0]
We investigate the performance of CNN-based object detectors on constrained devices.
We take a closer look at the Single Shot MultiBox Detector (SSD) and Region-based Fully Convolutional Network (R-FCN).
The performance is measured in terms of inference time, memory consumption, and accuracy.
arXiv Detail & Related papers (2020-07-21T21:06:30Z) - CNN Acceleration by Low-rank Approximation with Quantized Factors [9.654865591431593]
Although modern convolutional neural networks achieve great results in solving complex computer vision tasks, they still cannot be used effectively on mobile and embedded devices.
To solve this problem, a novel approach is proposed that combines two known methods: low-rank tensor approximation in Tucker format and quantization of weights and feature maps (activations).
The efficiency of our method is demonstrated for ResNet18 and ResNet34 on CIFAR-10, CIFAR-100 and ImageNet classification tasks.
arXiv Detail & Related papers (2020-06-16T02:28:05Z)