Video Monitoring Queries
- URL: http://arxiv.org/abs/2002.10537v1
- Date: Mon, 24 Feb 2020 20:53:35 GMT
- Title: Video Monitoring Queries
- Authors: Nick Koudas, Raymond Li, Ioannis Xarchakos
- Abstract summary: We study the problem of interactive declarative query processing on video streams.
We introduce a set of approximate filters to speed up queries that involve objects of a specific type.
The filters quickly assess whether the query predicates hold for a frame, so that only such frames proceed to further analysis.
- Score: 16.7214343633499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in video processing that utilize deep learning primitives
have achieved breakthroughs in fundamental problems in video analysis, such as frame
classification and object detection, enabling an array of new applications.
In this paper we study the problem of interactive declarative query
processing on video streams. In particular, we introduce a set of approximate
filters to speed up queries that involve objects of a specific type (e.g., cars,
trucks) on video frames, with associated spatial relationships among them
(e.g., car left of truck). The resulting filters quickly assess whether
the query predicates hold for a frame: if so, the frame proceeds to further
analysis; otherwise it is not considered further, avoiding costly object detection
operations.
We propose two classes of filters, $IC$ and $OD$, that adapt principles from
deep image classification and object detection. The filters utilize extensible
deep neural architectures and are easy to deploy and use. In addition, we
propose statistical query processing techniques to process aggregate queries
involving objects with spatial constraints on video streams, and demonstrate
experimentally the increased accuracy of the resulting aggregate
estimates.
Combined, these techniques constitute a robust set of video monitoring query
processing techniques. We demonstrate that applying the proposed techniques
in conjunction with declarative queries on video streams can
dramatically increase the frame processing rate and speed up query processing
by at least two orders of magnitude. We present the results of a thorough
experimental study utilizing benchmark video data sets at scale, demonstrating
the performance benefits and the practical relevance of our proposals.
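The filter-then-detect idea in the abstract can be illustrated with a minimal sketch. This is not the paper's $IC$ or $OD$ filter. Everything named here (`Box`, `left_of`, `run_query`, `toy_filter`, `toy_detector`) is a hypothetical stand-in, and the toy frames are simply lists of ground-truth boxes. The sketch also assumes "left of" means comparing box centers along the x axis, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical detection result: a label plus an axis-aligned box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center_x(self) -> float:
        return (self.x_min + self.x_max) / 2.0


def left_of(a: Box, b: Box) -> bool:
    # Assumption: "a left of b" compares horizontal box centers.
    return a.center_x < b.center_x


def car_left_of_truck(boxes: Sequence[Box]) -> bool:
    """Query predicate: some car is left of some truck."""
    cars = [b for b in boxes if b.label == "car"]
    trucks = [b for b in boxes if b.label == "truck"]
    return any(left_of(c, t) for c in cars for t in trucks)


def run_query(
    frames: Sequence[object],
    cheap_filter: Callable[[object], bool],
    detector: Callable[[object], Sequence[Box]],
    predicate: Callable[[Sequence[Box]], bool],
) -> Tuple[List[int], int]:
    """Run the expensive detector only on frames the cheap filter lets through."""
    matches: List[int] = []
    detector_calls = 0
    for idx, frame in enumerate(frames):
        if not cheap_filter(frame):
            continue  # frame dropped without paying for object detection
        detector_calls += 1
        if predicate(detector(frame)):
            matches.append(idx)
    return matches, detector_calls


# Toy data: each "frame" is just the list of boxes it contains.
frames = [
    [Box("car", 0, 0, 10, 10)],                                # no truck
    [Box("car", 0, 0, 10, 10), Box("truck", 20, 0, 40, 10)],   # car left of truck
    [Box("truck", 0, 0, 20, 10), Box("car", 30, 0, 40, 10)],   # car right of truck
    [],                                                        # empty frame
]


def toy_filter(frame: Sequence[Box]) -> bool:
    # Stand-in for an approximate filter: are both object types present?
    return {"car", "truck"} <= {b.label for b in frame}


def toy_detector(frame: Sequence[Box]) -> Sequence[Box]:
    # Stand-in for a costly object detector.
    return frame


matches, detector_calls = run_query(frames, toy_filter, toy_detector, car_left_of_truck)
print(matches, detector_calls)
```

In this toy run the detector is invoked on two of the four frames. In a real deployment the filter would be a lightweight learned model that can make mistakes, so the pass threshold would trade detector calls against missed frames.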
Related papers
- GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval [56.610806615527885]
This paper introduces a novel data-centric approach, Generalized Query Expansion (GQE), to address the inherent information imbalance between text and video.
By adaptively segmenting videos into short clips and employing zero-shot captioning, GQE enriches the training dataset with comprehensive scene descriptions.
GQE achieves state-of-the-art performance on several benchmarks, including MSR-VTT, MSVD, LSMDC, and VATEX.
arXiv Detail & Related papers (2024-08-14T01:24:09Z) - Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must contend with high across-frame variation in object appearance and with diverse deterioration in some frames.
Most contemporary aggregation methods are tailored for two-stage detectors and suffer from high computational costs.
This study invents a very simple yet potent strategy of feature selection and aggregation, gaining significant accuracy at marginal computational expense.
arXiv Detail & Related papers (2024-07-29T02:12:11Z) - Temporal Saliency Query Network for Efficient Video Recognition [82.52760040577864]
Video recognition is a hot-spot research topic with the explosive growth of multimedia data on the Internet and mobile devices.
Most existing methods select the salient frames without awareness of the class-specific saliency scores.
We propose a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement.
arXiv Detail & Related papers (2022-07-21T09:23:34Z) - FrameHopper: Selective Processing of Video Frames in Detection-driven Real-Time Video Analytics [2.5119455331413376]
Detection-driven real-time video analytics require continuous detection of objects contained in the video frames.
Running these detectors on each and every frame in resource-constrained edge devices is computationally intensive.
We propose an off-line Reinforcement Learning (RL)-based algorithm to determine the skip-lengths, i.e., how many frames to skip between detector runs.
arXiv Detail & Related papers (2022-03-22T07:05:57Z) - Temporal Query Networks for Fine-grained Video Understanding [88.9877174286279]
We cast this into a query-response mechanism, where each query addresses a particular question, and has its own response label set.
We evaluate the method extensively on the FineGym and Diving48 benchmarks for fine-grained action classification and surpass the state-of-the-art using only RGB features.
arXiv Detail & Related papers (2021-04-19T17:58:48Z) - Coherent Loss: A Generic Framework for Stable Video Segmentation [103.78087255807482]
We investigate how a jittering artifact degrades the visual quality of video segmentation results.
We propose a Coherent Loss with a generic framework to enhance the performance of a neural network against jittering artifacts.
arXiv Detail & Related papers (2020-10-25T10:48:28Z) - Robust and efficient post-processing for video object detection [9.669942356088377]
This work introduces a novel post-processing pipeline that overcomes some of the limitations of previous post-processing methods.
Our method improves the results of state-of-the-art video-specific detectors, especially for fast-moving objects.
Applied to efficient still-image detectors such as YOLO, it provides results comparable to those of much more computationally intensive detectors.
arXiv Detail & Related papers (2020-09-23T10:47:24Z) - Temporal Context Aggregation for Video Retrieval with Contrastive Learning [81.12514007044456]
We propose TCA, a video representation learning framework that incorporates long-range temporal information between frame-level features.
The proposed method shows a significant performance advantage (17% mAP on FIVR-200K) over state-of-the-art methods with video-level features.
arXiv Detail & Related papers (2020-08-04T05:24:20Z) - Evaluating Temporal Queries Over Video Feeds [25.04363138106074]
Temporal queries involving objects and their co-occurrences in video feeds are of interest to many applications ranging from law enforcement to security and safety.
We present an architecture consisting of three layers, namely object detection/tracking, intermediate data generation and query evaluation.
We propose two techniques, MFS and SSG, to organize all detected objects in the intermediate data generation layer.
We also introduce an algorithm called State Traversal (ST) that processes incoming frames against the SSG and efficiently prunes objects and frames unrelated to query evaluation.
arXiv Detail & Related papers (2020-03-02T14:55:57Z) - Convolutional Hierarchical Attention Network for Query-Focused Video Summarization [74.48782934264094]
This paper addresses the task of query-focused video summarization, which takes a user's query and a long video as inputs.
We propose a method, named Convolutional Hierarchical Attention Network (CHAN), which consists of two parts: a feature encoding network and a query-relevance computing module.
In the encoding network, we employ a convolutional network with a local self-attention mechanism and a query-aware global attention mechanism to learn the visual information of each shot.
arXiv Detail & Related papers (2020-01-31T04:30:14Z)