Efficient Heterogeneous Video Segmentation at the Edge
- URL: http://arxiv.org/abs/2208.11666v1
- Date: Wed, 24 Aug 2022 17:01:09 GMT
- Title: Efficient Heterogeneous Video Segmentation at the Edge
- Authors: Jamie Menjay Lin, Siargey Pisarchyk, Juhyun Lee, David Tian, Tingbo
Hou, Karthik Raveendran, Raman Sarokin, George Sung, Trent Tolley, Matthias
Grundmann
- Abstract summary: We introduce an efficient video segmentation system for resource-limited edge devices leveraging heterogeneous compute.
Specifically, we design network models by searching across multiple dimensions of specifications for the neural architectures.
We analyze and optimize the heterogeneous data flows in our systems across the CPU, the GPU and the NPU.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce an efficient video segmentation system for resource-limited edge
devices leveraging heterogeneous compute. Specifically, we design network
models by searching across multiple dimensions of specifications for the neural
architectures and operations on top of already light-weight backbones,
targeting commercially available edge inference engines. We further analyze and
optimize the heterogeneous data flows in our systems across the CPU, the GPU
and the NPU. Our approach has empirically factored well into our real-time AR
system, enabling remarkably higher accuracy with quadrupled effective
resolutions, yet at much shorter end-to-end latency, much higher frame rate,
and even lower power consumption on edge platforms.
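The abstract describes optimizing heterogeneous data flows across the CPU, GPU and NPU. One way such a system can overlap work across processors is a staged pipeline, where each processor handles one stage and successive frames flow through concurrently. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's actual system: the stage functions are stand-ins for CPU preprocessing, NPU inference, and GPU postprocessing.

```python
import queue
import threading

# Hypothetical stage functions; a real system would call CPU image ops,
# an NPU inference engine, and GPU compositing respectively.
def cpu_preprocess(frame):
    return frame * 0.5          # stand-in for resize/normalize on CPU

def npu_infer(tensor):
    return tensor + 1           # stand-in for segmentation inference on NPU

def gpu_postprocess(mask):
    return mask * 2             # stand-in for mask upsampling/render on GPU

def stage_worker(fn, inbox, outbox):
    """Run one pipeline stage: pull, process, push, until sentinel."""
    while True:
        item = inbox.get()
        if item is None:        # sentinel: propagate shutdown downstream
            outbox.put(None)
            return
        outbox.put(fn(item))

def run_pipeline(frames):
    # Small bounded queues keep stages busy without unbounded buffering.
    q_in, q_mid, q_out, q_done = (queue.Queue(maxsize=2) for _ in range(4))
    stages = [
        threading.Thread(target=stage_worker, args=(cpu_preprocess, q_in, q_mid)),
        threading.Thread(target=stage_worker, args=(npu_infer, q_mid, q_out)),
        threading.Thread(target=stage_worker, args=(gpu_postprocess, q_out, q_done)),
    ]
    for t in stages:
        t.start()
    for f in frames:
        q_in.put(f)
    q_in.put(None)
    results = []
    while (r := q_done.get()) is not None:
        results.append(r)
    for t in stages:
        t.join()
    return results

print(run_pipeline([1.0, 2.0, 3.0]))  # → [3.0, 4.0, 5.0]
```

Because each stage has its own worker and FIFO queue, frame N can be rendered while frame N+1 is in inference and frame N+2 is being preprocessed, which is the latency/throughput benefit the abstract attributes to heterogeneous data-flow optimization.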
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference [7.624476059109304]
Mondrian is an edge system that enables high-performance object detection on high-resolution video streams.
We devise a novel Compressive Packed Inference to minimize per-pixel processing costs.
arXiv Detail & Related papers (2024-03-12T12:35:12Z)
- Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices [90.30316433184414]
We propose a data-model-hardware tri-design framework for high-throughput, low-cost, and high-accuracy MOT on HD video streams.
Compared to the state-of-the-art MOT baseline, our tri-design approach can achieve 12.5x latency reduction, 20.9x effective frame rate improvement, 5.83x lower power, and 9.78x better energy efficiency, without much accuracy drop.
arXiv Detail & Related papers (2022-10-16T16:21:40Z)
- Streaming Video Analytics On The Edge With Asynchronous Cloud Support [2.7456483236562437]
We propose a novel edge-cloud fusion algorithm that fuses edge and cloud predictions, achieving low latency and high accuracy.
We focus on object detection in videos (applicable in many video analytics scenarios) and show that the fused edge-cloud predictions can outperform the accuracy of edge-only and cloud-only scenarios by as much as 50%.
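The edge-cloud fusion idea above can be sketched with one plausible fusion rule (this is an illustrative assumption, not the paper's exact algorithm): keep the fast edge detections, but raise a box's confidence when a possibly stale cloud detection of the same class overlaps it.

```python
# Detections are (box, label, score) with box = (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def fuse(edge_dets, cloud_dets, iou_thresh=0.5):
    """Boost edge scores that a matching cloud detection confirms."""
    fused = []
    for box, label, score in edge_dets:
        for cbox, clabel, cscore in cloud_dets:
            if clabel == label and iou(box, cbox) >= iou_thresh:
                score = max(score, cscore)   # trust the stronger prediction
                break
        fused.append((box, label, score))
    return fused

edge = [((0, 0, 10, 10), "person", 0.6)]     # fresh, low-confidence
cloud = [((1, 1, 10, 10), "person", 0.9)]    # stale, high-confidence
print(fuse(edge, cloud))  # → [((0, 0, 10, 10), 'person', 0.9)]
```

The edge boxes always survive unchanged in position, so latency stays at edge speed; only the confidence benefits from the asynchronous cloud pass.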
arXiv Detail & Related papers (2022-10-04T06:22:13Z)
- Generative Adversarial Super-Resolution at the Edge with Knowledge Distillation [1.3764085113103222]
Single-Image Super-Resolution can support robotic tasks in environments where a reliable visual stream is required.
We propose an efficient Generative Adversarial Network model for real-time Super-Resolution, called EdgeSRGAN.
arXiv Detail & Related papers (2022-09-07T10:58:41Z)
- Turbo: Opportunistic Enhancement for Edge Video Analytics [15.528497833853146]
We study the problem of opportunistic data enhancement using the non-deterministic and fragmented idle GPU resources.
We propose a task-specific discrimination and enhancement module and a model-aware adversarial training mechanism.
Our system boosts object detection accuracy by 7.3–11.3% without incurring any latency costs.
arXiv Detail & Related papers (2022-06-29T12:13:30Z)
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that unlike MAPLE which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to Multi-Exit Semantic Segmentation (MESS) networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
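The early-exit mechanism described above can be sketched in a few lines (pure Python; the stages, heads, and confidence values here are toy stand-ins, not the MESS architecture): run the backbone stage by stage, and stop as soon as an attached head is confident enough for the current sample.

```python
def early_exit_predict(x, stages, heads, threshold=0.9):
    """Return (prediction, exit_index). The first exit whose confidence
    clears the threshold wins; otherwise fall through to the final head."""
    for i, (stage, head) in enumerate(zip(stages, heads)):
        x = stage(x)             # backbone computation up to this exit
        pred, conf = head(x)     # lightweight head attached at this depth
        if conf >= threshold:    # easy sample: stop early, save compute
            return pred, i
    return pred, len(stages) - 1  # hardest samples use the full network

# Toy stages and heads: confidence grows with depth.
stages = [lambda x: x + 1, lambda x: x + 1, lambda x: x + 1]
heads = [
    lambda x: ("coarse", 0.6),
    lambda x: ("medium", 0.95),
    lambda x: ("fine", 0.99),
]
print(early_exit_predict(0, stages, heads, threshold=0.9))  # → ('medium', 1)
```

The exit policy here is a simple confidence threshold; the co-optimisation the abstract mentions would tune how many such heads to attach, where, and with what architecture, per device and application.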
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Real-time Semantic Segmentation with Fast Attention [94.88466483540692]
We propose a novel architecture for semantic segmentation of high-resolution images and videos in real-time.
The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism.
Results on multiple datasets demonstrate superior performance, with better accuracy and speed than existing approaches.
arXiv Detail & Related papers (2020-07-07T22:37:16Z)
- Deep Space-Time Video Upsampling Networks [47.62807427163614]
Video super-resolution (VSR) and frame interpolation (FI) are traditional computer vision problems.
We propose an end-to-end framework for the space-time video upsampling by efficiently merging VSR and FI into a joint framework.
Our framework achieves better results both quantitatively and qualitatively, while being 7x faster and using 30% fewer parameters than baselines.
arXiv Detail & Related papers (2020-04-06T07:04:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.