Real-Time Video Inference on Edge Devices via Adaptive Model Streaming
- URL: http://arxiv.org/abs/2006.06628v2
- Date: Mon, 5 Apr 2021 23:29:53 GMT
- Title: Real-Time Video Inference on Edge Devices via Adaptive Model Streaming
- Authors: Mehrdad Khani, Pouya Hamadanian, Arash Nasr-Esfahany, Mohammad
Alizadeh
- Abstract summary: Real-time video inference on edge devices like mobile phones and drones is challenging due to the high computation cost of Deep Neural Networks.
We present Adaptive Model Streaming (AMS), a new approach to improving the performance of efficient lightweight models for video inference on edge devices.
- Score: 9.101956442584251
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time video inference on edge devices like mobile phones and drones is
challenging due to the high computation cost of Deep Neural Networks. We
present Adaptive Model Streaming (AMS), a new approach to improving the
performance of efficient lightweight models for video inference on edge
devices. AMS uses a
remote server to continually train and adapt a small model running on the edge
device, boosting its performance on the live video using online knowledge
distillation from a large, state-of-the-art model. We discuss the challenges of
over-the-network model adaptation for video inference, and present several
techniques to reduce the communication cost of this approach: avoiding excessive
overfitting, updating a small fraction of important model parameters, and
adaptive sampling of training frames at edge devices. On the task of video
semantic segmentation, our experimental results show a 0.4–17.8 percent mean
Intersection-over-Union improvement compared to a pre-trained model across
several video datasets. Our prototype performs video segmentation at 30
frames per second with 40 ms of camera-to-label latency on a Samsung Galaxy
S10+ mobile phone, using less than 300 Kbps of uplink and downlink bandwidth
on the device.
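
As a concrete illustration of the adaptation loop the abstract describes, here is a minimal PyTorch-style sketch of server-side online distillation plus a top-k sparse weight update. The cross-entropy distillation loss, the 5% update fraction, and all function names are assumptions made for illustration, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, frames, optimizer):
    """One server-side adaptation step: the large teacher labels the sampled
    frames and the lightweight student is trained to match its output."""
    with torch.no_grad():
        pseudo_labels = teacher(frames).argmax(dim=1)  # teacher's per-pixel labels
    loss = F.cross_entropy(student(frames), pseudo_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def sparse_update(old_state, new_state, fraction=0.05):
    """Keep only the most-changed parameters (here, the top 5% by absolute
    delta) so that only a small fraction of weights crosses the downlink."""
    deltas = {name: new_state[name] - old_state[name] for name in new_state}
    magnitudes = torch.cat([d.abs().flatten() for d in deltas.values()])
    k = max(1, int(fraction * magnitudes.numel()))
    cutoff = magnitudes.topk(k).values.min()
    return {name: d * (d.abs() >= cutoff) for name, d in deltas.items()}
```

On the device, the received deltas would simply be added to the current weights between inference frames; together with adaptive frame sampling on the uplink, this is the kind of mechanism that keeps bandwidth low.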
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the limits of cloud-centric processing by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework that jointly optimizes the neural network architecture and its edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- EdgeSync: Faster Edge-model Updating via Adaptive Continuous Learning for Video Data Drift [7.165359653719119]
Real-time video analytics systems typically deploy models with fewer weights on edge devices to reduce latency.
The distribution of video content features may change over time, degrading the accuracy of existing models.
Recent work proposes a framework in which a remote server continually trains and adapts the lightweight model at the edge with the help of a complex model.
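
The summary does not say how drift is detected, so the following is only a plausible trigger assumed for illustration: request a model refresh when the edge model's mean prediction confidence sags over a sliding window. The window length and drop threshold are likewise assumptions.

```python
import torch
from collections import deque

def needs_update(model, frame, history, window=100, drop=0.15):
    """Track mean prediction confidence; report drift when the recent half
    of the window is much less confident than the earlier half."""
    with torch.no_grad():
        probs = torch.softmax(model(frame.unsqueeze(0)), dim=1)
    history.append(probs.max(dim=1).values.mean().item())
    if len(history) < window:
        return False
    half = window // 2
    scores = list(history)
    return (sum(scores[:half]) - sum(scores[-half:])) / half > drop

confidence_history = deque(maxlen=100)  # ring buffer sized to `window`
```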
arXiv Detail & Related papers (2024-06-05T07:06:26Z)
- Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics [18.042752812489276]
We introduce an end-to-end edge-assisted video inference acceleration system based on the Vision Transformer (ViT).
Our findings reveal that Arena can boost inference speeds by up to 1.58× and 1.82× on average while consuming only 47% and 31% of the bandwidth, respectively, all with high inference accuracy.
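
From the summary alone, a patch-of-interest pipeline could look like the sketch below: the edge device diffs each frame against the previous one and uploads only the patches that changed. The 16-pixel patch size and the mean-difference threshold are assumptions, not Arena's actual selection rule.

```python
import numpy as np

def changed_patches(frame, prev_frame, patch=16, threshold=12.0):
    """Return (row, col) grid indices of patches whose mean absolute pixel
    difference from the previous frame exceeds the threshold."""
    h, w = frame.shape[:2]
    coords = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            diff = np.abs(frame[r:r+patch, c:c+patch].astype(np.float32)
                          - prev_frame[r:r+patch, c:c+patch].astype(np.float32))
            if diff.mean() > threshold:
                coords.append((r // patch, c // patch))
    return coords  # only these patches are sent to the server-side ViT
```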
arXiv Detail & Related papers (2024-04-14T13:14:13Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting growing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL generally relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate in training asynchronously by actively applying for tasks.
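
The title names two compression steps, sparsification and quantization. Here is a hedged sketch of what a client-side implementation of those two steps could look like; top-k selection and symmetric 8-bit quantization are illustrative choices, not necessarily TEASQ-Fed's scheme.

```python
import numpy as np

def compress_update(update, keep=0.01):
    """Sparsify a weight/gradient update to its top 1% entries by magnitude,
    then quantize the surviving values to signed 8-bit integers."""
    flat = update.astype(np.float32).ravel()
    k = max(1, int(keep * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # positions of top-k entries
    values = flat[idx]
    scale = max(float(np.abs(values).max()) / 127.0, 1e-12)
    quantized = np.round(values / scale).astype(np.int8)
    return idx, quantized, scale                   # what the client uploads

def decompress_update(idx, quantized, scale, shape):
    """Server-side reconstruction of the sparse, dequantized update."""
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = quantized.astype(np.float32) * scale
    return flat.reshape(shape)
```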
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting [27.302681897961588]
Deep convolutional neural networks (CNNs) are widely used in various fields of computer vision.
We propose a novel method for high-quality and efficient video resolution upscaling.
We deploy our models on an off-the-shelf mobile phone, and experimental results show that our method achieves real-time video super-resolution with high video quality.
arXiv Detail & Related papers (2023-03-15T02:40:02Z)
- Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling [125.95527079960725]
Transformer-based models have achieved top performance on major video recognition benchmarks.
Video Mobile-Former is the first Transformer-based video model that constrains the computational budget to within 1G FLOPs.
arXiv Detail & Related papers (2022-08-25T17:59:00Z)
- Long-Short Temporal Contrastive Learning of Video Transformers [62.71874976426988]
Self-supervised pretraining of video transformers on video-only datasets can lead to action recognition results on par with or better than those obtained with supervised pretraining on large-scale image datasets.
Our approach, named Long-Short Temporal Contrastive Learning, enables video transformers to learn an effective clip-level representation by predicting temporal context captured from a longer temporal extent.
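
A common way to realize such clip-level prediction is an InfoNCE-style contrastive loss between short-clip and long-clip embeddings. The sketch below shows that generic form, which may differ in detail from the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def long_short_nce(short_emb, long_emb, temperature=0.1):
    """InfoNCE loss: each short clip should match the embedding of the longer
    clip it was sampled from (same row), against all other rows in the batch."""
    short_emb = F.normalize(short_emb, dim=1)
    long_emb = F.normalize(long_emb, dim=1)
    logits = short_emb @ long_emb.T / temperature  # (batch, batch) similarities
    targets = torch.arange(len(short_emb), device=short_emb.device)
    return F.cross_entropy(logits, targets)
```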
arXiv Detail & Related papers (2021-06-17T02:30:26Z)
- MoViNets: Mobile Video Networks for Efficient Video Recognition [52.49314494202433]
3D convolutional neural networks (CNNs) are accurate at video recognition but require large computation and memory budgets.
We propose a three-step approach to improve computational efficiency while substantially reducing the peak memory usage of 3D CNNs.
arXiv Detail & Related papers (2021-03-21T23:06:38Z)
- RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices [57.877112704841366]
This paper proposes RT3D, a model compression and mobile acceleration framework for 3D CNNs.
For the first time, real-time execution of 3D CNNs is achieved on off-the-shelf mobile devices.
arXiv Detail & Related papers (2020-07-20T02:05:32Z)
- An On-Device Federated Learning Approach for Cooperative Model Update between Edge Devices [2.99321624683618]
A neural-network-based on-device learning approach was recently proposed, in which edge devices train on incoming data at runtime to update their model.
In this paper, we focus on OS-ELM (Online Sequential Extreme Learning Machine) to sequentially train a model on recent samples and combine it with an autoencoder for anomaly detection.
We extend it to on-device federated learning so that edge devices can exchange their trained results and update their model using those collected from other edge devices.
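
OS-ELM itself has a compact closed-form update; below is a minimal sketch of the standard recursive least-squares recursion it uses. The federated exchange step described in the paper is not shown, and the hidden size, tanh activation, and identity initialization of P are illustrative simplifications.

```python
import numpy as np

class OSELM:
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((n_in, n_hidden))  # fixed random input weights
        self.b = rng.standard_normal(n_hidden)          # fixed random biases
        self.beta = np.zeros((n_hidden, n_out))         # trainable output weights
        self.P = np.eye(n_hidden)                       # inverse correlation matrix
                                                        # (initialized to I here)
    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def partial_fit(self, X, T):
        """Recursive least-squares update on one incoming batch (X, T)."""
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(len(X)) + H @ self.P @ H.T)
        self.P -= self.P @ H.T @ K @ H @ self.P
        self.beta += self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

For the anomaly-detection use mentioned above, T would be X itself (autoencoder-style reconstruction), with a large reconstruction error flagging an anomaly.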
arXiv Detail & Related papers (2020-02-27T18:15:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.