Boggart: Accelerating Retrospective Video Analytics via Model-Agnostic
Ingest Processing
- URL: http://arxiv.org/abs/2106.15315v1
- Date: Mon, 21 Jun 2021 19:21:16 GMT
- Title: Boggart: Accelerating Retrospective Video Analytics via Model-Agnostic
Ingest Processing
- Authors: Neil Agarwal, Ravi Netravali
- Abstract summary: Boggart is a retrospective video analytics system that delivers ingest-time speedups in a model-agnostic manner.
Our underlying insight is that traditional computer vision (CV) algorithms are capable of performing computations that can be used to accelerate diverse queries with wide-ranging CNNs.
At query-time, Boggart uses several novel techniques to collect the smallest sample of CNN results required to meet the target accuracy.
- Score: 5.076419064097734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Delivering fast responses to retrospective queries on video datasets is
difficult due to the large number of frames to consider and the high costs of
running convolutional neural networks (CNNs) on each one. A natural solution is
to perform a subset of the necessary computations ahead of time, as video is
ingested. However, existing ingest-time systems require knowledge of the
specific CNN that will be used in future queries -- a challenging requisite
given the ever-growing space of CNN architectures and training
datasets/methodologies.
This paper presents Boggart, a retrospective video analytics system that
delivers ingest-time speedups in a model-agnostic manner. Our underlying
insight is that traditional computer vision (CV) algorithms are capable of
performing computations that can be used to accelerate diverse queries with
wide-ranging CNNs. Building on this, at ingest-time, Boggart carefully employs
a variety of motion tracking algorithms to identify potential objects and their
trajectories across frames. Then, at query-time, Boggart uses several novel
techniques to collect the smallest sample of CNN results required to meet the
target accuracy: (1) a clustering strategy to efficiently unearth the
inevitable discrepancies between CV- and CNN-generated outputs, and (2) a set
of accuracy-preserving propagation techniques to safely extend sampled results
along each trajectory. Across many videos, CNNs, and queries, Boggart
consistently meets accuracy targets while using CNNs sparingly (on 3-54% of
frames).
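The query-time idea of extending sampled CNN results along a trajectory can be illustrated with a minimal sketch. This is not Boggart's actual algorithm: the abstract says Boggart chooses the smallest sample that meets the target accuracy, whereas the sketch below assumes a fixed sampling stride. The names `Trajectory`, `propagate_labels`, `run_cnn`, and `fake_cnn` are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical data model: a trajectory is the ordered list of frame indices
# in which one tracked blob appears (assumed to come from ingest-time tracking).
Trajectory = List[int]


def propagate_labels(
    trajectories: Dict[str, Trajectory],
    run_cnn: Callable[[int, str], str],
    stride: int,
) -> Dict[str, Dict[int, str]]:
    """Run the CNN on every `stride`-th frame of each trajectory and copy each
    sampled label forward to the unsampled frames that follow it."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    results: Dict[str, Dict[int, str]] = {}
    for track_id, frames in trajectories.items():
        labels: Dict[int, str] = {}
        current = ""
        for i, frame in enumerate(frames):
            if i % stride == 0:
                current = run_cnn(frame, track_id)  # the expensive call
            labels[frame] = current  # sampled or propagated label
        results[track_id] = labels
    return results


# Stand-in for a real CNN: records each call and returns a fixed label per track.
calls: List[tuple] = []


def fake_cnn(frame: int, track_id: str) -> str:
    calls.append((frame, track_id))
    return "car" if track_id == "t0" else "person"


tracks = {"t0": [0, 1, 2, 3, 4, 5], "t1": [3, 4, 5, 6]}
out = propagate_labels(tracks, fake_cnn, stride=3)

# Fraction of distinct video frames on which the CNN actually ran.
all_frames = {f for frames in tracks.values() for f in frames}
cnn_frames = {f for f, _ in calls}
fraction = len(cnn_frames) / len(all_frames)
```

In this toy run the CNN is invoked on 3 of the 7 distinct frames, while every frame of every trajectory still receives a label. A fixed stride gives no accuracy guarantee; it only shows where the savings come from.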
Related papers
- SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity [15.872209884833977]
We propose a memory-efficient scheduling method to eliminate memory overhead and an online adjustment mechanism to minimize accuracy degradation.
SparseTem achieves speedups of 1.79x for EfficientDet and 4.72x for CRNN, with minimal accuracy drop and no additional memory overhead.
arXiv Detail & Related papers (2024-10-28T07:13:25Z)
- Transferability of Convolutional Neural Networks in Stationary Learning Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly equivalent performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
arXiv Detail & Related papers (2023-07-21T13:51:45Z)
- A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network [3.5665681694253903]
Detection of the hand region has become a challenging task in the computer vision and pattern recognition communities.
Deep learning algorithms such as convolutional neural networks (CNNs) have become a very popular choice for classification tasks.
In this paper, an ensemble of CNN-based approaches is presented to overcome problems such as high prediction variance, overfitting, and prediction errors.
arXiv Detail & Related papers (2022-02-25T06:46:58Z)
- Event and Activity Recognition in Video Surveillance for Cyber-Physical Systems [0.0]
We show that long-term motion patterns alone play a pivotal role in the task of recognizing an event.
Only the temporal features are exploited using a hybrid Convolutional Neural Network (CNN) + Recurrent Neural Network (RNN) architecture.
arXiv Detail & Related papers (2021-11-03T08:30:38Z)
- Application of 2-D Convolutional Neural Networks for Damage Detection in Steel Frame Structures [0.0]
We present an application of 2-D convolutional neural networks (2-D CNNs) designed to perform both feature extraction and classification stages.
The method uses a network of lightweight CNNs instead of a deep one and takes raw acceleration signals as input.
arXiv Detail & Related papers (2021-10-29T16:29:31Z)
- An Acceleration Method Based on Deep Learning and Multilinear Feature Space [0.0]
This paper presents an alternative approach based on the Multilinear Feature Space (MFS) method resorting to transfer learning from large CNN architectures.
The proposed method uses CNNs to generate feature maps, although it does not work as a complexity reduction approach.
Our method, named AMFC, uses transfer learning from a pre-trained CNN to reduce the classification time of a new sample image, with minimal accuracy loss.
arXiv Detail & Related papers (2021-10-16T23:49:12Z)
- Learning from Images: Proactive Caching with Parallel Convolutional Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce computation time by 71.6% with only 0.8% additional performance cost.
arXiv Detail & Related papers (2021-08-15T21:32:47Z)
- Continual 3D Convolutional Neural Networks for Real-time Processing of Videos [93.73198973454944]
We introduce Continual 3D Convolutional Neural Networks (Co3D CNNs).
Co3D CNNs process videos frame-by-frame rather than clip-by-clip.
We show that Co3D CNNs initialised on the weights from preexisting state-of-the-art video recognition models reduce floating point operations for frame-wise computations by 10.0-12.4x while improving accuracy on Kinetics-400 by 2.3-3.8%.
arXiv Detail & Related papers (2021-05-31T18:30:52Z)
- MoViNets: Mobile Video Networks for Efficient Video Recognition [52.49314494202433]
3D convolutional neural networks (CNNs) are accurate at video recognition but require large computation and memory budgets.
We propose a three-step approach to improve computational efficiency while substantially reducing the peak memory usage of 3D CNNs.
arXiv Detail & Related papers (2021-03-21T23:06:38Z)
- Dense Interaction Learning for Video-based Person Re-identification [75.03200492219003]
We propose a hybrid framework, Dense Interaction Learning (DenseIL), to tackle video-based person re-ID difficulties.
DenseIL contains a CNN encoder and a Dense Interaction (DI) decoder.
In our experiments, DenseIL consistently and significantly outperforms all the state-of-the-art methods on multiple standard video-based re-ID datasets.
arXiv Detail & Related papers (2021-03-16T12:22:08Z)
- Cascaded Deep Video Deblurring Using Temporal Sharpness Prior [88.98348546566675]
The proposed algorithm mainly consists of optical flow estimation from intermediate latent frames and latent frame restoration steps.
It first develops a deep CNN model to estimate optical flow from intermediate latent frames and then restores the latent frames based on the estimated optical flow.
We show that exploring the domain knowledge of video deblurring is able to make the deep CNN model more compact and efficient.
arXiv Detail & Related papers (2020-04-06T09:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.