Shoggoth: Towards Efficient Edge-Cloud Collaborative Real-Time Video
Inference via Adaptive Online Learning
- URL: http://arxiv.org/abs/2306.15333v1
- Date: Tue, 27 Jun 2023 09:39:42 GMT
- Title: Shoggoth: Towards Efficient Edge-Cloud Collaborative Real-Time Video
Inference via Adaptive Online Learning
- Authors: Liang Wang and Kai Lu and Nan Zhang and Xiaoyang Qu and Jianzong Wang
and Jiguang Wan and Guokuan Li and Jing Xiao
- Abstract summary: Shoggoth is an efficient edge-cloud collaborative architecture for boosting inference performance on real-time video of changing scenes.
Online knowledge distillation improves the accuracy of models suffering from data drift and offloads the labeling process to the cloud.
At the edge, we design adaptive training using small batches to adapt models under limited computing power.
- Score: 33.16911236522438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes Shoggoth, an efficient edge-cloud collaborative
architecture, for boosting inference performance on real-time video of changing
scenes. Shoggoth uses online knowledge distillation to improve the accuracy of
models suffering from data drift and offloads the labeling process to the
cloud, alleviating constrained resources of edge devices. At the edge, we
design adaptive training using small batches to adapt models under limited
computing power, and adaptive sampling of training frames for robustness and
reducing bandwidth. The evaluations on a realistic dataset show a 15%-20% model
accuracy improvement over the edge-only strategy and lower network costs than
the cloud-only strategy.
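The abstract describes a loop of adaptive frame sampling at the edge, offloaded labeling in the cloud, and small-batch adaptation of the edge model. A minimal sketch of that loop, where all names (`adaptive_sample`, `ToyEdgeModel`, the difference threshold) are hypothetical and the distillation step is abstracted away:

```python
class ToyEdgeModel:
    """Stand-in for a lightweight edge model (hypothetical)."""
    def __init__(self):
        self.updates = 0

    def adapt(self, batch):
        # In the paper this would be an online distillation/fine-tuning step.
        self.updates += 1

def adaptive_sample(frames, diff_threshold=0.3):
    """Adaptive-sampling sketch: keep a frame only when it differs enough
    from the last kept frame, reducing uplink bandwidth to the cloud."""
    kept, last = [], None
    for f in frames:
        if last is None or abs(f - last) > diff_threshold:
            kept.append(f)
            last = f
    return kept

def edge_cloud_round(frames, edge_model, cloud_label, batch_size=4):
    """One collaboration round: sample frames, get cloud pseudo-labels,
    adapt the edge model on small batches under limited compute."""
    samples = adaptive_sample(frames)
    labels = [cloud_label(s) for s in samples]    # labeling offloaded to cloud
    for i in range(0, len(samples), batch_size):  # small-batch adaptation
        batch = list(zip(samples[i:i + batch_size], labels[i:i + batch_size]))
        edge_model.adapt(batch)
    return len(samples)
```

Frames are reduced here to scalar "scene intensities" purely to keep the sketch self-contained; a real pipeline would operate on image tensors.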
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the limitations of cloud-centric processing by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- EdgeSync: Faster Edge-model Updating via Adaptive Continuous Learning for Video Data Drift [7.165359653719119]
Real-time video analytics systems typically place models with fewer weights on edge devices to reduce latency.
The distribution of video content features may change over time, leading to accuracy degradation of existing models.
Recent work proposes a framework that uses a remote server to continually train and adapt the lightweight model at the edge with the help of a complex model.
arXiv Detail & Related papers (2024-06-05T07:06:26Z)
- Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation [56.79064699832383]
We establish a Cloud-Edge Elastic Model Adaptation (CEMA) paradigm in which the edge models only need to perform forward propagation.
In our CEMA, to reduce the communication burden, we devise two criteria to exclude unnecessary samples from uploading to the cloud.
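The CEMA entry mentions criteria for excluding samples from upload. A toy sketch of entropy-based sample filtering in that spirit, with the thresholds and function names (`entropy`, `should_upload`) purely illustrative, not the paper's actual criteria:

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a softmax output."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_upload(probs, low=0.2, high=1.0):
    """Filtering sketch: skip samples the edge model is already confident
    about (low entropy) and samples too ambiguous to yield a useful signal
    (very high entropy); upload only the informative middle band."""
    h = entropy(probs)
    return low < h < high
```

The design intuition is that both extremes waste uplink bandwidth: confident predictions need no cloud help, and near-uniform ones give the cloud little to distill.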
arXiv Detail & Related papers (2024-02-27T08:47:19Z)
- Small Dataset, Big Gains: Enhancing Reinforcement Learning by Offline Pre-Training with Model Based Augmentation [59.899714450049494]
Offline pre-training can produce sub-optimal policies and lead to degraded online reinforcement learning performance.
We propose a model-based data augmentation strategy to maximize the benefits of offline reinforcement learning pre-training and reduce the scale of data needed to be effective.
arXiv Detail & Related papers (2023-12-15T14:49:41Z)
- Adaptive-Labeling for Enhancing Remote Sensing Cloud Understanding [40.572147431473034]
We introduce an innovative model-agnostic Cloud Adaptive-Labeling (CAL) approach, which operates iteratively to enhance the quality of training data annotations.
Our methodology commences by training a cloud segmentation model using the original annotations.
It introduces a trainable pixel intensity threshold for adaptively labeling the cloud training images on the fly.
The newly generated labels are then employed to fine-tune the model.
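The CAL entry describes an adaptive pixel-intensity threshold for relabeling cloud images. A toy stand-in for that idea, where `refine_labels` nudges a fixed threshold toward a target cloud fraction as a crude proxy for the paper's learned, model-driven threshold (all names and the update rule are hypothetical):

```python
def adaptive_label(image, threshold):
    """Label each pixel as cloud (1) when its intensity exceeds the
    current threshold; the paper learns this threshold, here it is given."""
    return [[1 if px > threshold else 0 for px in row] for row in image]

def refine_labels(image, init_threshold, target_cloud_frac, steps=20, lr=0.05):
    """Iterative relabeling sketch: adjust the threshold so the predicted
    cloud fraction matches a target, then emit the refreshed labels."""
    t = init_threshold
    for _ in range(steps):
        labels = adaptive_label(image, t)
        frac = sum(map(sum, labels)) / (len(image) * len(image[0]))
        t += lr * (frac - target_cloud_frac)  # too many cloud pixels -> raise t
    return t, adaptive_label(image, t)
```

In the actual approach the new labels would then fine-tune the segmentation model, closing the loop the summary describes.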
arXiv Detail & Related papers (2023-11-09T08:23:45Z)
- Towards Compute-Optimal Transfer Learning [82.88829463290041]
We argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance.
Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
arXiv Detail & Related papers (2023-04-25T21:49:09Z)
- An Efficient Split Fine-tuning Framework for Edge and Cloud Collaborative Learning [20.118073642453034]
We design an efficient split fine-tuning framework for edge and cloud collaborative learning.
We compress the intermediate output of a neural network to reduce the communication volume between the edge device and the cloud server.
Our framework can reduce the communication traffic by 96 times with little impact on the model accuracy.
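This entry compresses intermediate activations before sending them to the cloud. A sketch of the simplest such scheme, uniform 8-bit quantization (names `quantize`/`dequantize` are hypothetical); note that plain quantization of 32-bit floats gives only about 4x savings, so the reported 96x reduction implies a stronger scheme (e.g. combined with sparsification):

```python
def quantize(acts, bits=8):
    """Uniformly quantize a list of float activations to integer codes
    plus (offset, scale) metadata for the uplink."""
    lo, hi = min(acts), max(acts)
    scale = (hi - lo) / (2 ** bits - 1) or 1.0  # guard against constant input
    q = [round((a - lo) / scale) for a in acts]
    return q, lo, scale

def dequantize(q, lo, scale):
    """Reconstruct approximate activations on the cloud side."""
    return [lo + v * scale for v in q]
```

The round trip loses at most half a quantization step per value, which is usually tolerable for intermediate features in split learning.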
arXiv Detail & Related papers (2022-11-30T02:55:21Z)
- LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples from a trained model upon deletion requests.
We present a novel network, namely LegoNet, which adopts the framework of "fixed encoder + multiple adapters".
We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, outperforming unlearning baselines.
arXiv Detail & Related papers (2022-10-28T09:53:05Z)
- Streaming Video Analytics On The Edge With Asynchronous Cloud Support [2.7456483236562437]
We propose a novel edge-cloud fusion algorithm that fuses edge and cloud predictions, achieving low latency and high accuracy.
We focus on object detection in videos (applicable in many video analytics scenarios) and show that the fused edge-cloud predictions can outperform the accuracy of edge-only and cloud-only scenarios by as much as 50%.
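The entry above fuses edge and cloud detections. A toy fusion sketch for object detection: match boxes by IoU and blend confidences, with unmatched boxes passed through. The matching rule, `cloud_weight`, and function names are illustrative; the paper's asynchronous fusion algorithm is more involved:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def fuse(edge_dets, cloud_dets, iou_thr=0.5, cloud_weight=0.7):
    """Blend fresh-but-weak edge detections with stale-but-strong cloud
    detections: matched boxes get a confidence weighted toward the cloud
    score; unmatched boxes from either side pass through."""
    fused, used = [], set()
    for box_e, score_e in edge_dets:
        match = None
        for j, (box_c, _) in enumerate(cloud_dets):
            if j not in used and iou(box_e, box_c) >= iou_thr:
                match = j
                break
        if match is not None:
            used.add(match)
            score_c = cloud_dets[match][1]
            fused.append((box_e, cloud_weight * score_c + (1 - cloud_weight) * score_e))
        else:
            fused.append((box_e, score_e))
    for j, det in enumerate(cloud_dets):
        if j not in used:
            fused.append(det)
    return fused
```

Keeping the edge box geometry while borrowing the cloud confidence reflects the asynchrony: the edge sees the current frame, the cloud result may lag by several frames.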
arXiv Detail & Related papers (2022-10-04T06:22:13Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present Online Convolutional Re-parameterization (OREPA), a two-stage pipeline that aims to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
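The core trick behind structural re-parameterization is that a linear block (e.g. conv followed by BatchNorm) can be folded into a single convolution. A scalar sketch of the standard conv+BN folding identity, which OREPA builds on (the specific numbers below are arbitrary test values, not from the paper):

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding conv weight/bias so the
    conv+BN pair becomes one conv. Shown on scalars; with tensors the same
    scaling applies per output channel."""
    s = gamma / math.sqrt(var + eps)
    return w * s, (b - mean) * s + beta

# Check: conv-then-BN equals the folded single conv.
w, b = 2.0, 0.5
gamma, beta, mean, var = 1.5, 0.1, 0.2, 4.0
x = 3.0
y_ref = gamma * ((w * x + b) - mean) / math.sqrt(var + 1e-5) + beta
wf, bf = fold_bn(w, b, gamma, beta, mean, var)
y_fold = wf * x + bf
```

Folding at inference time is standard practice; OREPA's contribution is doing an analogous squeeze during training to cut memory and time.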
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.