Context Recovery and Knowledge Retrieval: A Novel Two-Stream Framework
for Video Anomaly Detection
- URL: http://arxiv.org/abs/2209.02899v1
- Date: Wed, 7 Sep 2022 03:12:02 GMT
- Title: Context Recovery and Knowledge Retrieval: A Novel Two-Stream Framework
for Video Anomaly Detection
- Authors: Congqi Cao, Yue Lu and Yanning Zhang
- Abstract summary: We propose a two-stream framework based on context recovery and knowledge retrieval.
For the context recovery stream, we propose a U-Net which can fully utilize the motion information to predict the future frame.
For the knowledge retrieval stream, we propose an improved learnable locality-sensitive hashing.
The knowledge about normality is encoded and stored in hash tables, and the distance between the testing event and the knowledge representation is used to reveal the probability of anomaly.
- Score: 48.05512963355003
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Video anomaly detection aims to find the events in a video that do not
conform to the expected behavior. The prevalent methods mainly detect anomalies
by snippet reconstruction or future frame prediction error. However, the error
is highly dependent on the local context of the current snippet and lacks the
understanding of normality. To address this issue, we propose to detect
anomalous events not only by the local context, but also according to the
consistency between the testing event and the knowledge about normality from
the training data. Concretely, we propose a novel two-stream framework based on
context recovery and knowledge retrieval, where the two streams can complement
each other. For the context recovery stream, we propose a spatiotemporal U-Net
which can fully utilize the motion information to predict the future frame.
Furthermore, we propose a maximum local error mechanism to alleviate the
problem of large recovery errors caused by complex foreground objects. For the
knowledge retrieval stream, we propose an improved learnable locality-sensitive
hashing, which optimizes hash functions via a Siamese network and a mutual
difference loss. The knowledge about normality is encoded and stored in hash
tables, and the distance between the testing event and the knowledge
representation is used to reveal the probability of anomaly. Finally, we fuse
the anomaly scores from the two streams to detect anomalies. Extensive
experiments demonstrate the effectiveness and complementarity of the two
streams, whereby the proposed two-stream framework achieves state-of-the-art
performance on four datasets.
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learnstemporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs)
Our method achieves state-of-theart performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - Towards Video Anomaly Retrieval from Video Anomaly Detection: New
Benchmarks and Model [70.97446870672069]
Video anomaly detection (VAD) has been paid increasing attention due to its potential applications.
Video Anomaly Retrieval ( VAR) aims to pragmatically retrieve relevant anomalous videos by cross-modalities.
We present two benchmarks, UCFCrime-AR and XD-Violence, constructed on top of prevalent anomaly datasets.
arXiv Detail & Related papers (2023-07-24T06:22:37Z) - Making Reconstruction-based Method Great Again for Video Anomaly
Detection [64.19326819088563]
Anomaly detection in videos is a significant yet challenging problem.
Existing reconstruction-based methods rely on old-fashioned convolutional autoencoders.
We propose a new autoencoder model for enhanced consecutive frame reconstruction.
arXiv Detail & Related papers (2023-01-28T01:57:57Z) - Spatio-Temporal-based Context Fusion for Video Anomaly Detection [1.7710335706046505]
Video anomaly aims to discover abnormal events in videos, and the principal objects are target objects such as people and vehicles.
Most existing methods only focus on the temporal context, ignoring the role of the spatial context in anomaly detection.
This paper proposes a video anomaly detection algorithm based on target-temporal context fusion.
arXiv Detail & Related papers (2022-10-18T04:07:10Z) - Multi-Contextual Predictions with Vision Transformer for Video Anomaly
Detection [22.098399083491937]
understanding of thetemporal context of a video plays a vital role in anomaly detection.
We design a transformer model with three different contextual prediction streams: masked, whole and partial.
By learning to predict the missing frames of consecutive normal frames, our model can effectively learn various normality patterns in the video.
arXiv Detail & Related papers (2022-06-17T05:54:31Z) - Discrete neural representations for explainable anomaly detection [25.929134751869032]
We show how to robustly detect anomalies without the use of object or action classifiers.
We also show how to improve the quality of saliency maps using a novel neural architecture for learning discrete representations of video.
arXiv Detail & Related papers (2021-12-10T14:56:58Z) - Convolutional Transformer based Dual Discriminator Generative
Adversarial Networks for Video Anomaly Detection [27.433162897608543]
We propose Conversaal Transformer based Dual Discriminator Generative Adrial Networks (CT-D2GAN) to perform unsupervised video anomaly detection.
It contains three key components, i., a convolutional encoder to capture the spatial information of input clips, a temporal self-attention module to encode the temporal dynamics and predict the future frame.
arXiv Detail & Related papers (2021-07-29T03:07:25Z) - Robust Unsupervised Video Anomaly Detection by Multi-Path Frame
Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z) - A Background-Agnostic Framework with Adversarial Training for Abnormal
Event Detection in Video [120.18562044084678]
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.