Streaming Video Analytics On The Edge With Asynchronous Cloud Support
- URL: http://arxiv.org/abs/2210.01402v1
- Date: Tue, 4 Oct 2022 06:22:13 GMT
- Authors: Anurag Ghosh, Srinivasan Iyengar, Stephen Lee, Anuj Rathore, Venkat N Padmanabhan
- Abstract summary: We propose a novel edge-cloud fusion algorithm that fuses edge and cloud predictions, achieving low latency and high accuracy.
We focus on object detection in videos (applicable in many video analytics scenarios) and show that the fused edge-cloud predictions can outperform the accuracy of edge-only and cloud-only scenarios by as much as 50%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Emerging Internet of Things (IoT) and mobile computing applications are
expected to support latency-sensitive deep neural network (DNN) workloads. To
realize this vision, the Internet is evolving towards an edge-computing
architecture, where computing infrastructure is located closer to the end
device to help achieve low latency. However, edge computing may have limited
resources compared to cloud environments and thus, cannot run large DNN models
that often have high accuracy. In this work, we develop REACT, a framework that
leverages cloud resources to execute large DNN models with higher accuracy to
improve the accuracy of models running on edge devices. To do so, we propose a
novel edge-cloud fusion algorithm that fuses edge and cloud predictions,
achieving low latency and high accuracy. We extensively evaluate our approach
and show that it can significantly improve accuracy compared to
baseline approaches. We focus specifically on object detection in videos
(applicable in many video analytics scenarios) and show that the fused
edge-cloud predictions can outperform the accuracy of edge-only and cloud-only
scenarios by as much as 50%. We also show that REACT can achieve good
performance across tradeoff points by choosing a wide range of system
parameters to satisfy use-case specific constraints, such as limited network
bandwidth or GPU cycles.
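The abstract does not spell out REACT's fusion algorithm, so the sketch below is only a rough illustration of the idea: fresh edge detections are merged with (possibly stale but more accurate) cloud detections by IoU matching, keeping the higher-confidence prediction where boxes overlap. The matching strategy, thresholds, and detection format here are assumptions, not REACT's actual method.

```python
# Hypothetical edge-cloud detection fusion. A detection is
# (box, score, label) with box = (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def fuse(edge_dets, cloud_dets, iou_thresh=0.5):
    """Merge fresh edge detections with cloud ones. Where boxes
    overlap, keep the edge box (fresh position) but the
    higher-confidence score and label; unmatched detections from
    either side are kept as-is."""
    fused, matched_cloud = [], set()
    for e_box, e_score, e_label in edge_dets:
        best_j, best_iou = None, iou_thresh
        for j, (c_box, _, _) in enumerate(cloud_dets):
            if j in matched_cloud:
                continue
            o = iou(e_box, c_box)
            if o >= best_iou:
                best_j, best_iou = j, o
        if best_j is not None:
            matched_cloud.add(best_j)
            _, c_score, c_label = cloud_dets[best_j]
            if c_score > e_score:
                fused.append((e_box, c_score, c_label))
            else:
                fused.append((e_box, e_score, e_label))
        else:
            fused.append((e_box, e_score, e_label))
    fused += [d for j, d in enumerate(cloud_dets) if j not in matched_cloud]
    return fused
```

Note how the cloud contributes in two ways: it can correct the label/score of an object the edge already sees, and it can add objects the edge model missed entirely.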
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses resource constraints by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Distributed Inference on Mobile Edge and Cloud: An Early Exit based Clustering Approach [5.402030962296633]
Deep Neural Networks (DNNs) have demonstrated outstanding performance across various domains.
A distributed inference setup can be used where a small-sized DNN can be deployed on mobile, a bigger version on the edge, and the full-fledged model on the cloud.
We develop a novel approach that utilizes Early Exit (EE) strategies developed to minimize inference latency in DNNs.
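The mobile-edge-cloud hierarchy with early exits can be pictured as a confidence-gated cascade: each tier answers only when it is confident enough, otherwise the input escalates. The sketch below is a deliberate simplification; the model interfaces and thresholds are invented for illustration, not taken from the paper.

```python
# Hypothetical early-exit cascade across the device hierarchy.
# Each model is a callable returning (label, confidence).

def cascade_predict(x, mobile_model, edge_model, cloud_model,
                    mobile_thresh=0.8, edge_thresh=0.9):
    """Return (prediction, tier) from the cheapest sufficiently
    confident model in the mobile -> edge -> cloud hierarchy."""
    label, conf = mobile_model(x)
    if conf >= mobile_thresh:       # exit early on-device
        return label, "mobile"
    label, conf = edge_model(x)
    if conf >= edge_thresh:         # exit at the edge
        return label, "edge"
    label, _ = cloud_model(x)       # cloud model is the final authority
    return label, "cloud"
```

Easy inputs never leave the device, which is where the latency savings come from; only the hard tail of the input distribution pays the network round-trip.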
arXiv Detail & Related papers (2024-10-06T20:14:27Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
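The channel-group splitting behind SDTA can be sketched in a few lines: the token matrix is split along channels into groups, and attention is computed across the channel dimension ("transposed" attention) within each group, which keeps the attention matrices small. This numpy sketch is a simplification of the idea, not a faithful reimplementation of the paper's encoder.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def split_transpose_attention(x, num_groups):
    """x: (tokens, channels). Split channels into num_groups groups
    and apply channel-wise (transposed) self-attention per group,
    so each attention matrix is only (c/g, c/g)."""
    n, c = x.shape
    assert c % num_groups == 0
    outs = []
    for g in np.split(x, num_groups, axis=1):   # (n, c // num_groups)
        gt = g.T                                # channels attend over tokens
        attn = softmax(gt @ gt.T / np.sqrt(n))  # channel-affinity matrix
        outs.append((attn @ gt).T)              # back to (n, c // num_groups)
    return np.concatenate(outs, axis=1)
```

The cost of each attention map scales with the group's channel count rather than the token count, which is the property that makes this attractive on mobile hardware.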
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that unlike MAPLE which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
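The core idea of describing a device by a handful of CPU performance counters and regressing latency against them can be illustrated with a toy linear model. The counter names, synthetic data, and linear form below are made up for illustration; MAPLE-Edge's actual predictor is considerably more elaborate.

```python
import numpy as np

# Toy latency prediction from CPU performance-counter features,
# e.g. (instructions retired, cache misses, branch misses).

def fit_latency_model(counters, latencies):
    """Least-squares linear model: latency ~= counters @ w + b."""
    X = np.hstack([counters, np.ones((len(counters), 1))])  # bias column
    coef, *_ = np.linalg.lstsq(X, latencies, rcond=None)
    return coef

def predict_latency(coef, counters):
    X = np.hstack([counters, np.ones((len(counters), 1))])
    return X @ coef

# Synthetic profiling runs: 4 measurements, 3 counters each.
X = np.array([[1.0, 0.2, 0.1],
              [2.0, 0.4, 0.3],
              [3.0, 0.5, 0.2],
              [4.0, 0.9, 0.4]])
y = X @ np.array([5.0, 2.0, 1.0]) + 0.5   # latency consistent with a linear model
coef = fit_latency_model(X, y)
```

The appeal of counter-based features is that they are cheap to collect on a new device, so a predictor trained on a pool of devices can be adapted with only a few profiling runs.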
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
With a latency- and accuracy-aware reward design, such a computation adapts well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
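A latency- and accuracy-aware reward of the kind such DRL controllers optimize can be stated very compactly: reward accuracy, and penalize latency past a deadline. The deadline and penalty weight below are illustrative assumptions, not the paper's actual reward.

```python
# Hypothetical reward for a co-inference controller: accuracy minus a
# linear penalty for latency that overshoots the deadline.

def reward(accuracy, latency_ms, deadline_ms=100.0, penalty=0.01):
    """Higher accuracy raises the reward; latency past the deadline
    is penalized linearly, latency under it is free."""
    overshoot = max(0.0, latency_ms - deadline_ms)
    return accuracy - penalty * overshoot
```

Under such a reward, the policy is pushed toward aggressive exits and compression exactly when the channel degrades, since that is when the latency penalty starts to dominate.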
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox [0.0]
Edge devices offer limited processing power due to their inexpensive hardware and limited cooling and computational resources.
We propose a novel deep convolutional neural network architecture called EffCNet for edge devices.
arXiv Detail & Related papers (2021-11-28T21:32:31Z)
- POEM: 1-bit Point-wise Operations based on Expectation-Maximization for Efficient Point Cloud Processing [53.74076015905961]
We introduce point-wise operations based on Expectation-Maximization into BNNs for efficient point cloud processing.
Our POEM surpasses existing state-of-the-art binary point cloud networks by a significant margin of up to 6.7%.
arXiv Detail & Related papers (2021-11-26T09:45:01Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
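The essence of DNN splitting is a small search problem: for each candidate layer boundary, end-to-end latency is the edge compute up to the split, plus the cost of shipping the intermediate activation, plus the cloud compute for the rest. The sketch below is a back-of-the-envelope version of that search; all per-layer numbers are invented, and real systems like Auto-Split also account for memory, quantization, and accuracy constraints.

```python
# Hypothetical split-point search. Splitting at index i runs layers
# [0, i) on the edge and [i, n) in the cloud; act_bytes[i] is the size
# of the tensor crossing that boundary.

def best_split(edge_ms, cloud_ms, act_bytes, bandwidth_bps):
    """Return (split_index, latency_ms) minimizing end-to-end latency."""
    n = len(edge_ms)
    best = None
    for i in range(n + 1):
        transfer_ms = act_bytes[i] * 8 / bandwidth_bps * 1000
        total = sum(edge_ms[:i]) + transfer_ms + sum(cloud_ms[i:])
        if best is None or total < best[1]:
            best = (i, total)
    return best

# Toy 3-layer network: edge is slow but local, cloud is fast but
# reachable only over a 10 Mbps link with a large early activation.
edge = [10.0, 20.0, 30.0]     # per-layer latency on the edge device (ms)
cloud = [1.0, 2.0, 3.0]       # same layers on the cloud GPU (ms)
acts = [4e6, 1e6, 2e5, 1e3]   # bytes crossing each candidate boundary
split, latency = best_split(edge, cloud, acts, bandwidth_bps=10e6)
```

With these made-up numbers the large early activations make offloading expensive, so the search keeps everything on the edge; raise the bandwidth and the optimum shifts toward the cloud.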
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices [1.6873748786804317]
We argue that splitting CNN execution between an edge device and the cloud is equivalent to solving a resource-constrained optimization problem.
Experiments on real-world edge devices show that LMOS ensures feasible execution of different CNN models at the edge.
arXiv Detail & Related papers (2021-07-19T19:39:56Z)
- Towards Unsupervised Fine-Tuning for Edge Video Analytics [1.1091582432763736]
We propose a method for improving the accuracy of edge models without any extra compute cost by means of automatic model specialization.
Results show that our method can automatically improve the accuracy of pre-trained models by an average of 21%.
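Automatic model specialization is commonly realized as pseudo-labeling: a large "teacher" model labels unlabeled frames from the deployment scene, and the small edge "student" is fine-tuned on the confident labels. The sketch below illustrates that loop with stand-in callables; the interfaces and confidence threshold are assumptions, not necessarily this paper's procedure.

```python
# Hypothetical model-specialization loop via teacher pseudo-labels.
# teacher(frame) -> (label, confidence); student_update abstracts one
# fine-tuning step on a (frame, label) pair.

def specialize(student_update, teacher, frames, conf_thresh=0.7):
    """Fine-tune the student on confident teacher pseudo-labels.
    Returns the number of frames actually used for training."""
    used = 0
    for frame in frames:
        label, conf = teacher(frame)
        if conf >= conf_thresh:           # keep only confident labels
            student_update(frame, label)  # one gradient step, abstracted
            used += 1
    return used
```

Because the teacher runs offline (e.g. overnight in the cloud), the edge model gains scene-specific accuracy without any extra cost at inference time.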
arXiv Detail & Related papers (2021-04-14T12:57:40Z)
- A Serverless Cloud-Fog Platform for DNN-Based Video Analytics with Incremental Learning [31.712746462418693]
This paper presents the first serverless system that takes full advantage of client-fog-cloud synergy to better serve DNN-based video analytics.
To this end, we implement a holistic cloud-fog system referred to as VPaaS (Video-Platform-as-a-Service).
The evaluation demonstrates that VPaaS is superior to several SOTA systems: it maintains high accuracy while reducing bandwidth usage by up to 21%, RTT by up to 62.5%, and cloud monetary cost by up to 50%.
arXiv Detail & Related papers (2021-02-05T05:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.