Croesus: Multi-Stage Processing and Transactions for Video-Analytics in
Edge-Cloud Systems
- URL: http://arxiv.org/abs/2201.00063v1
- Date: Fri, 31 Dec 2021 21:38:05 GMT
- Title: Croesus: Multi-Stage Processing and Transactions for Video-Analytics in
Edge-Cloud Systems
- Authors: Samaa Gazzaz, Vishal Chakraborty, Faisal Nawab
- Abstract summary: Croesus is a multi-stage approach to edge-cloud systems that provides the ability to find the balance between accuracy and performance.
In this paper, we demonstrate the implications of such an approach on a video analytics use-case and show how multi-stage processing yields a better balance between accuracy and performance.
- Score: 2.9864637081333085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Emerging edge applications require both low response latency and complex
processing. This is infeasible without expensive hardware that can process
complex operations -- such as object detection -- within a short time. Many
approach this problem by addressing the complexity of the models -- via model
compression, pruning and quantization -- or compressing the input. In this
paper, we propose a different perspective when addressing the performance
challenges. Croesus is a multi-stage approach to edge-cloud systems that
provides the ability to find the balance between accuracy and performance.
Croesus consists of two stages (that can be generalized to multiple stages): an
initial and a final stage. The initial stage performs the computation in
real-time using approximate/best-effort computation at the edge. The final
stage performs the full computation at the cloud, and uses the results to
correct any errors made at the initial stage. In this paper, we demonstrate the
implications of such an approach on a video analytics use-case and show how
multi-stage processing yields a better balance between accuracy and
performance. Moreover, we study the safety of multi-stage transactions via two
proposals: multi-stage serializability (MS-SR) and multi-stage invariant
confluence with Apologies (MS-IA).
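The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: `run_two_stage`, `edge_detect`, `cloud_detect`, and the toy string "frames" are all hypothetical names and data chosen for the example. It shows an initial stage that answers immediately with a best-effort result, and a final stage that recomputes the full result and records a correction wherever the two disagree.

```python
from typing import Callable, Dict, List, Set, Tuple

# A detector maps a frame to a set of labels. Frames are plain strings here
# (an assumption for illustration; a real system would use image data).
Detector = Callable[[str], Set[str]]
Correction = Tuple[int, Set[str], Set[str]]  # (frame id, initial, final)


def edge_detect(frame: str) -> Set[str]:
    """Hypothetical cheap edge model: only recognizes a small vocabulary."""
    known = {"car", "person"}
    return {word for word in frame.split() if word in known}


def cloud_detect(frame: str) -> Set[str]:
    """Hypothetical full cloud model: recognizes every object in the frame."""
    return set(frame.split())


def run_two_stage(
    frames: Dict[int, str], initial: Detector, final: Detector
) -> Tuple[Dict[int, Set[str]], List[Correction]]:
    results: Dict[int, Set[str]] = {}
    corrections: List[Correction] = []

    # Initial stage: best-effort results, available right away at the edge.
    for frame_id, frame in frames.items():
        results[frame_id] = initial(frame)

    # Final stage: full computation, used to correct initial-stage errors.
    for frame_id, frame in frames.items():
        full = final(frame)
        if full != results[frame_id]:
            corrections.append((frame_id, results[frame_id], full))
            results[frame_id] = full

    return results, corrections


frames = {1: "car person", 2: "car dog"}
results, corrections = run_two_stage(frames, edge_detect, cloud_detect)
for frame_id, before, after in corrections:
    print(f"frame {frame_id}: corrected {sorted(before)} to {sorted(after)}")
```

In this sketch the two stages run one after the other in a single process; in an edge-cloud deployment the final stage would run remotely and arrive later, which is why the corrections are kept as an explicit list rather than silently overwriting the initial results.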
Related papers
- One Diffusion to Generate Them All [54.82732533013014]
OneDiffusion is a versatile, large-scale diffusion model that supports bidirectional image synthesis and understanding.
It enables conditional generation from inputs such as text, depth, pose, layout, and semantic maps.
OneDiffusion allows for multi-view generation, camera pose estimation, and instant personalization using sequential image inputs.
arXiv Detail & Related papers (2024-11-25T12:11:05Z)
- Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval [12.528821749262931]
Multiview point cloud registration serves as a cornerstone of various computer vision tasks.
We propose an incremental multiview point cloud registration method that progressively registers all scans to a growing meta-shape.
arXiv Detail & Related papers (2024-07-10T10:24:28Z)
- TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era [2.9052912091435923]
High-Energy Physics experiments are facing a multi-fold data increase with every new iteration.
One such step in need of an overhaul is the task of particle track reconstruction, a.k.a., tracking.
A Machine Learning-assisted solution is expected to provide significant improvements.
arXiv Detail & Related papers (2024-07-09T18:47:25Z)
- KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches [52.02764371205856]
Long context capability is a crucial competency for large language models (LLMs).
This work provides a taxonomy of current methods and evaluates 10+ state-of-the-art approaches across seven categories of long context tasks.
arXiv Detail & Related papers (2024-07-01T17:59:47Z)
- Efficient Time Series Processing for Transformers and State-Space Models through Token Merging [44.27818172708914]
Token merging has been shown to considerably improve the throughput of vision transformer architectures.
We introduce local merging, a domain-specific token merging algorithm that selectively combines tokens within a local neighborhood.
On the recently proposed Chronos foundation model, we achieve accelerations up to 5400% with only minor accuracy degradations.
arXiv Detail & Related papers (2024-05-28T08:28:18Z)
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ DETR-based encoder-decoder conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- D2C-SR: A Divergence to Convergence Approach for Image Super-Resolution [25.17545119739454]
We present D2C-SR, a novel framework for the task of image super-resolution (SR).
Inspired by recent works like SRFlow, we tackle this problem in a semi-probabilistic manner.
Our experiments demonstrate that D2C-SR can achieve state-of-the-art performance on PSNR and SSIM at a significantly lower computational cost.
arXiv Detail & Related papers (2021-03-26T10:20:28Z)
- FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale
Context Aggregation and Feature Space Super-resolution [14.226301825772174]
We introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP).
It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information.
We achieve 68.4% mIoU at 84 fps on the Cityscapes test set with a single Nvidia Titan X (Maxwell) GPU card.
arXiv Detail & Related papers (2020-03-09T03:53:57Z)
- Learning multiview 3D point cloud registration [74.39499501822682]
We present a novel, end-to-end learnable, multiview 3D point cloud registration algorithm.
Our approach outperforms the state-of-the-art by a significant margin, while being end-to-end trainable and computationally less costly.
arXiv Detail & Related papers (2020-01-15T03:42:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.