Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit
Latent Features
- URL: http://arxiv.org/abs/2010.07524v3
- Date: Thu, 4 Aug 2022 02:31:16 GMT
- Title: Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit
Latent Features
- Authors: MyeongAh Cho, Taeoh Kim, Woo Jin Kim, Suhwan Cho, Sangyoun Lee
- Abstract summary: Most existing methods use an autoencoder to learn to reconstruct normal videos.
We propose an implicit two-path AE (ITAE), a structure in which two encoders implicitly model appearance and motion features.
To capture the complex distribution of normal scenes, we estimate the density of ITAE features with normalizing flow (NF) models.
The NF models strengthen ITAE by learning normality through the implicitly learned features.
- Score: 8.407188666535506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In contemporary society, surveillance anomaly detection, i.e., spotting
anomalous events such as crimes or accidents in surveillance videos, is a
critical task. As anomalies occur rarely, most training data consists of
unlabeled videos without anomalous events, which makes the task challenging.
Most existing methods use an autoencoder (AE) to learn to reconstruct normal
videos; they then detect anomalies based on their failure to reconstruct the
appearance of abnormal scenes. However, because anomalies are distinguished by
appearance as well as motion, many previous approaches have explicitly
separated appearance and motion information, for example by using a pre-trained
optical flow model. This explicit separation restricts the reciprocal
representation capability between the two types of information. In contrast, we
propose an implicit two-path AE (ITAE), a structure in which two encoders
implicitly model appearance and motion features, along with a single decoder
that combines them to learn normal video patterns. To capture the complex
distribution of normal scenes, we additionally estimate the density of ITAE
features with normalizing flow (NF)-based generative models, which learn
tractable likelihoods and identify anomalies via out-of-distribution detection.
The NF models strengthen ITAE by learning normality through the implicitly
learned features. Finally, we demonstrate the effectiveness of ITAE and its
feature distribution modeling on six benchmarks, including databases that
contain various anomalies in real-world scenarios.
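
As a rough illustration of the architecture described above, the following sketch pairs a two-path autoencoder (one encoder fed a frame as appearance, one fed a frame difference as a simple motion proxy, and a single decoder that fuses both latents) with a small RealNVP-style normalizing flow that scores the fused latent by negative log-likelihood. All module names, layer sizes, and the frame-difference motion proxy are illustrative assumptions, not the authors' exact implementation.

```python
import math

import torch
import torch.nn as nn


def conv_encoder(in_ch, latent_ch):
    # Simple strided 2D conv encoder shared by both paths.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, latent_ch, 4, stride=2, padding=1),
    )


class ITAESketch(nn.Module):
    def __init__(self, latent_ch=64):
        super().__init__()
        self.app_enc = conv_encoder(3, latent_ch)   # appearance path: a single frame
        self.mot_enc = conv_encoder(3, latent_ch)   # motion path: frame difference (crude motion cue)
        self.decoder = nn.Sequential(               # single decoder fusing both latents
            nn.ConvTranspose2d(2 * latent_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, clip):                        # clip: (B, T, 3, H, W)
        appearance = clip[:, -1]                    # last frame as appearance input
        motion = clip[:, -1] - clip[:, 0]           # frame difference as a motion proxy
        z = torch.cat([self.app_enc(appearance), self.mot_enc(motion)], dim=1)
        return self.decoder(z), z                   # reconstruction and fused latent


class AffineCoupling(nn.Module):
    # One RealNVP-style affine coupling layer over pooled latent vectors.
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 128), nn.ReLU(),
                                 nn.Linear(128, dim))   # predicts scale and shift

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)                           # bound log-scales for stability
        y2 = x2 * torch.exp(s) + t
        return torch.cat([x1, y2], dim=1), s.sum(dim=1)   # transformed x, log|det J|


def anomaly_score(ae, flow_layers, clip):
    # Higher score = lower likelihood under the density learned for normal latents.
    _, z = ae(clip)
    z = z.mean(dim=(2, 3))                          # pool spatial dims -> (B, 2 * latent_ch)
    log_det = torch.zeros(z.size(0))
    for layer in flow_layers:
        z, ld = layer(z)
        log_det = log_det + ld
    base_logp = (-0.5 * (z ** 2 + math.log(2 * math.pi))).sum(dim=1)  # standard normal base
    return -(base_logp + log_det)                   # negative log-likelihood


if __name__ == "__main__":
    ae = ITAESketch()
    flow = nn.ModuleList([AffineCoupling(128) for _ in range(2)])
    clip = torch.randn(2, 8, 3, 64, 64)             # a batch of two 8-frame clips
    recon, _ = ae(clip)
    print(recon.shape, anomaly_score(ae, flow, clip).shape)
```

In such a setup, the autoencoder would be trained with a reconstruction loss on normal clips, the flow with a negative log-likelihood loss on the pooled latents, and frames whose latents receive a high negative log-likelihood would be flagged as anomalous.
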
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos.
Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models.
We evaluate nine existing Video-LMMs, both open-source and closed-source, on this benchmark and find that most of the models have difficulty identifying the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z) - Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection [16.77262005540559]
A novel framework is proposed to guide the learning of suspected anomalies from event prompts.
It enables a new multi-prompt learning process to constrain the visual-semantic features across all videos.
Our proposed model outperforms most state-of-the-art methods in terms of AP or AUC.
arXiv Detail & Related papers (2024-03-02T10:42:47Z) - Dynamic Erasing Network Based on Multi-Scale Temporal Features for
Weakly Supervised Video Anomaly Detection [103.92970668001277]
We propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection.
We first propose a multi-scale temporal modeling module, capable of extracting features from segments of varying lengths.
Then, we design a dynamic erasing strategy, which dynamically assesses the completeness of the detected anomalies.
arXiv Detail & Related papers (2023-12-04T09:40:11Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance by utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Dual Memory Units with Uncertainty Regulation for Weakly Supervised
Video Anomaly Detection [15.991784541576788]
Existing approaches, whether oriented toward video-level or segment-level labels, mainly focus on extracting representations of anomalous data.
We propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data.
Our method outperforms the state-of-the-art methods by a sizable margin.
arXiv Detail & Related papers (2023-02-10T10:39:40Z) - A Video Anomaly Detection Framework based on Appearance-Motion Semantics
Representation Consistency [18.06814233420315]
We propose a framework that uses normal data's appearance and motion semantic representation consistency to handle anomaly detection.
We design a two-stream encoder to encode the appearance and motion information representations of normal samples.
The lower consistency between the appearance and motion features of anomalous samples leads to predicted frames with larger reconstruction errors.
arXiv Detail & Related papers (2022-04-08T15:59:57Z) - Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.