Related papers: A Distributed Framework to Orchestrate Video Analytics Applications

URL: http://arxiv.org/abs/2009.09065v1
Date: Thu, 17 Sep 2020 07:10:05 GMT
Title: A Distributed Framework to Orchestrate Video Analytics Applications
Authors: Tapan Pathak and Vatsal Patel and Sarth Kanani and Shailesh Arya and Pankesh Patel and Muhammad Intizar Ali and John Breslin
Abstract summary: We propose a distributed framework to orchestrate video analytics across Edge and Cloud resources. This paper evaluates the proposed framework as well as the state-of-the-art models and presents a comparative analysis of them on various metrics.
Score: 0.09236074230806578
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyone's attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in the scientific literature and many video-based doorbell solutions are commercially available. However, contemporary offerings have two limitations. First, they are bespoke, offering limited composability and reusability of a smart doorbell framework. Second, they are monolithic and proprietary, which means that the implementation details remain hidden from the users. We believe that a transparent design can greatly aid in the development of a smart doorbell, enabling its use in multiple application domains. To address the above-mentioned challenges, we propose a distributed framework to orchestrate video analytics across Edge and Cloud resources. We investigate trade-offs in the distribution of different software components over a bespoke/full system, where components over Edge and Cloud are treated generically. This paper evaluates the proposed framework as well as the state-of-the-art models and presents a comparative analysis of them on various metrics (such as overall model accuracy, latency, memory, and CPU usage). The evaluation results support our intuition, showing that, compared to the state-of-the-art approaches, the AWS-based approach exhibits reasonably high object-detection accuracy and low memory and CPU usage, but high latency.
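The trade-off described in the abstract (accuracy against latency, memory, and CPU usage for different Edge/Cloud splits) can be illustrated with a minimal sketch. This is not the paper's method: the `PlacementMetrics` type, the `pick_placement` function, the selection rule (most accurate option within a latency budget), and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PlacementMetrics:
    """Measurements for one way of splitting components across Edge and Cloud.

    Hypothetical type; the field set mirrors the metrics named in the abstract.
    """
    name: str
    accuracy: float      # overall model accuracy, in [0, 1]
    latency_ms: float    # end-to-end latency per request
    memory_mb: float     # memory used on the Edge device
    cpu_percent: float   # CPU usage on the Edge device


def pick_placement(
    candidates: Sequence[PlacementMetrics],
    max_latency_ms: float,
) -> Optional[PlacementMetrics]:
    """Return the most accurate candidate within the latency budget.

    Ties on accuracy are broken by lower memory, then lower CPU usage.
    Returns None when no candidate meets the budget.
    """
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (-c.accuracy, c.memory_mb, c.cpu_percent))


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT results from the paper.
    options = [
        PlacementMetrics("edge-only", 0.80, 150.0, 400.0, 90.0),
        PlacementMetrics("cloud-offload", 0.90, 900.0, 120.0, 30.0),
    ]
    print(pick_placement(options, max_latency_ms=1000.0))
    print(pick_placement(options, max_latency_ms=200.0))
```

With a loose latency budget the sketch prefers the more accurate cloud-offloaded option; with a tight budget only the edge-only option remains feasible, which mirrors the accuracy-versus-latency tension the abstract reports.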

Related papers

Crossing the Reward Bridge: Expanding RL with Verifiable Rewards Across Diverse Domains [92.36624674516553]
Reinforcement learning with verifiable rewards (RLVR) has demonstrated significant success in enhancing mathematical reasoning and coding performance of large language models (LLMs). We investigate the effectiveness and scalability of RLVR across diverse real-world domains including medicine, chemistry, psychology, economics, and education. We utilize a generative scoring technique that yields soft, model-based reward signals to overcome limitations posed by binary verifications.
arXiv Detail & Related papers (2025-03-31T08:22:49Z)
Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering [71.62961521518731]
HeurVidQA is a framework that leverages domain-specific entity-actions to refine pre-trained video-language foundation models. Our approach treats these models as implicit knowledge engines, employing domain-specific entity-action prompters to direct the model's focus toward precise cues that enhance reasoning.
arXiv Detail & Related papers (2024-10-12T06:22:23Z)
SAMEdge: An Edge-cloud Video Analytics Architecture for the Segment Anything Model [7.9748022315005]
We propose SAMEdge, a novel edge-cloud computing architecture designed to support SAM computations for edge users. SAMEdge integrates new modules on the edge and the cloud to maximize analytics accuracy under visual prompts and image prompts input with latency constraints.
arXiv Detail & Related papers (2024-09-23T07:59:09Z)
General Object Foundation Model for Images and Videos at Scale [99.2806103051613]
We present GLEE, an object-level foundation model for locating and identifying objects in images and videos. GLEE accomplishes detection, segmentation, tracking, grounding, and identification of arbitrary objects in the open-world scenario. We employ an image encoder, text encoder, and visual prompter to handle multi-modal inputs, enabling it to solve various object-centric downstream tasks simultaneously.
arXiv Detail & Related papers (2023-12-14T17:26:00Z)
Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many downstream tasks such as open-domain question answering (QA). We propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on different information units of the query. We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process. We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
Scalable Video Object Segmentation with Identification Mechanism [125.4229430216776]
This paper explores the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS). We present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST). Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks.
arXiv Detail & Related papers (2022-03-22T03:33:27Z)
Multi-Exit Vision Transformer for Dynamic Inference [88.17413955380262]
We propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones. We show that each one of our proposed architectures could prove useful in the trade-off between accuracy and speed.
arXiv Detail & Related papers (2021-06-29T09:01:13Z)
Internet of Things (IoT) Based Video Analytics: a use case of Smart Doorbell [0.0]
A video-based smart doorbell system is one application domain for video analytics. This paper proposes a distributed framework for video analytics with a use case of a smart doorbell system. The proposed framework uses AWS cloud services as a base platform and, to meet the affordability constraint, the system was implemented on a low-cost Raspberry Pi.
arXiv Detail & Related papers (2021-05-13T18:48:48Z)
ApproxDet: Content and Contention-Aware Approximate Object Detection for Mobiles [19.41234144545467]
We introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements. We evaluate ApproxDet on a large benchmark video dataset and compare quantitatively to AdaScale and YOLOv3. We find that ApproxDet is able to adapt to a wide variety of contention and content characteristics and outshines all baselines.
arXiv Detail & Related papers (2020-10-21T04:11:05Z)
A Demonstration of Smart Doorbell Design Using Federated Deep Learning [0.09786690381850353]
This paper showcases the ability of an intelligent smart doorbell based on Federated Deep Learning. It can deploy and manage video analytics applications such as a smart doorbell across Edge and Cloud resources.
arXiv Detail & Related papers (2020-10-19T17:22:34Z)
Demonstration of a Cloud-based Software Framework for Video Analytics Application using Low-Cost IoT Devices [0.09236074230806578]
We propose a smart doorbell that orchestrates video analytics across Edge and Cloud resources. The proposal uses AWS as a base platform for implementation and leverages affordable COTS devices such as the Raspberry Pi as the Edge device.
arXiv Detail & Related papers (2020-09-29T06:05:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.