Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit
Latent Features
- URL: http://arxiv.org/abs/2010.07524v3
- Date: Thu, 4 Aug 2022 02:31:16 GMT
- Title: Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit
Latent Features
- Authors: MyeongAh Cho, Taeoh Kim, Woo Jin Kim, Suhwan Cho, Sangyoun Lee
- Abstract summary: Most existing methods use an autoencoder to learn to reconstruct normal videos.
We propose an implicit two-path AE (ITAE), a structure in which two encoders implicitly model appearance and motion features.
To capture the complex distribution of normal scenes, we estimate the density of ITAE features with normalizing flow (NF) models.
The NF models strengthen ITAE by learning normality through the implicitly learned features.
- Score: 8.407188666535506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In contemporary society, surveillance anomaly detection, i.e., spotting
anomalous events such as crimes or accidents in surveillance videos, is a
critical task. As anomalies occur rarely, most training data consists of
unlabeled videos without anomalous events, which makes the task challenging.
Most existing methods use an autoencoder (AE) to learn to reconstruct normal
videos; they then detect anomalies based on their failure to reconstruct the
appearance of abnormal scenes. However, because anomalies are distinguished by
appearance as well as motion, many previous approaches have explicitly
separated appearance and motion information, for example by using a pre-trained
optical flow model. This explicit separation restricts the reciprocal
representation capability between the two types of information. In contrast, we
propose an implicit two-path AE (ITAE), a structure in which two encoders
implicitly model appearance and motion features, along with a single decoder
that combines them to learn normal video patterns. To capture the complex
distribution of normal scenes, we additionally estimate the density of ITAE
features with normalizing flow (NF)-based generative models, which learn
tractable likelihoods and identify anomalies via out-of-distribution detection.
The NF models strengthen ITAE by learning normality through the implicitly
learned features. Finally, we demonstrate the effectiveness of ITAE and its
feature distribution modeling on six benchmarks, including databases that
contain various anomalies in real-world scenarios.
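
As a rough illustration of the architecture described above, the following sketch pairs a two-path autoencoder (one encoder fed a frame as appearance, one fed a frame difference as a simple motion proxy, and a single decoder that fuses both latents) with a small RealNVP-style normalizing flow that scores the fused latent by negative log-likelihood. All module names, layer sizes, and the frame-difference motion proxy are illustrative assumptions, not the authors' exact implementation.

```python
import math

import torch
import torch.nn as nn


def conv_encoder(in_ch, latent_ch):
    # Simple strided 2D conv encoder shared by both paths.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, latent_ch, 4, stride=2, padding=1),
    )


class ITAESketch(nn.Module):
    def __init__(self, latent_ch=64):
        super().__init__()
        self.app_enc = conv_encoder(3, latent_ch)   # appearance path: a single frame
        self.mot_enc = conv_encoder(3, latent_ch)   # motion path: frame difference (crude motion cue)
        self.decoder = nn.Sequential(               # single decoder fusing both latents
            nn.ConvTranspose2d(2 * latent_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, clip):                        # clip: (B, T, 3, H, W)
        appearance = clip[:, -1]                    # last frame as appearance input
        motion = clip[:, -1] - clip[:, 0]           # frame difference as a motion proxy
        z = torch.cat([self.app_enc(appearance), self.mot_enc(motion)], dim=1)
        return self.decoder(z), z                   # reconstruction and fused latent


class AffineCoupling(nn.Module):
    # One RealNVP-style affine coupling layer over pooled latent vectors.
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 128), nn.ReLU(),
                                 nn.Linear(128, dim))   # predicts scale and shift

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)                           # bound log-scales for stability
        y2 = x2 * torch.exp(s) + t
        return torch.cat([x1, y2], dim=1), s.sum(dim=1)   # transformed x, log|det J|


def anomaly_score(ae, flow_layers, clip):
    # Higher score = lower likelihood under the density learned for normal latents.
    _, z = ae(clip)
    z = z.mean(dim=(2, 3))                          # pool spatial dims -> (B, 2 * latent_ch)
    log_det = torch.zeros(z.size(0))
    for layer in flow_layers:
        z, ld = layer(z)
        log_det = log_det + ld
    base_logp = (-0.5 * (z ** 2 + math.log(2 * math.pi))).sum(dim=1)  # standard normal base
    return -(base_logp + log_det)                   # negative log-likelihood


if __name__ == "__main__":
    ae = ITAESketch()
    flow = nn.ModuleList([AffineCoupling(128) for _ in range(2)])
    clip = torch.randn(2, 8, 3, 64, 64)             # a batch of two 8-frame clips
    recon, _ = ae(clip)
    print(recon.shape, anomaly_score(ae, flow, clip).shape)
```

In such a setup, the autoencoder would be trained with a reconstruction loss on normal clips, the flow with a negative log-likelihood loss on the pooled latents, and frames whose latents receive a high negative log-likelihood would be flagged as anomalous.
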
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos.
Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models.
We evaluate nine existing Video-LMMs, both open-source and closed-source, on this benchmark and find that most of the models have difficulty identifying the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z) - Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection [16.77262005540559]
A novel framework is proposed to guide the learning of suspected anomalies from event prompts.
It enables a new multi-prompt learning process to constrain the visual-semantic features across all videos.
Our proposed model outperforms most state-of-the-art methods in terms of AP or AUC.
arXiv Detail & Related papers (2024-03-02T10:42:47Z) - Dynamic Erasing Network Based on Multi-Scale Temporal Features for
Weakly Supervised Video Anomaly Detection [103.92970668001277]
We propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection.
We first propose a multi-scale temporal modeling module, capable of extracting features from segments of varying lengths.
Then, we design a dynamic erasing strategy, which dynamically assesses the completeness of the detected anomalies.
arXiv Detail & Related papers (2023-12-04T09:40:11Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance by utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Dual Memory Units with Uncertainty Regulation for Weakly Supervised
Video Anomaly Detection [15.991784541576788]
Existing approaches, whether oriented toward video-level or segment-level labels, mainly focus on extracting representations of anomalous data.
We propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data.
Our method outperforms the state-of-the-art methods by a sizable margin.
arXiv Detail & Related papers (2023-02-10T10:39:40Z) - A Video Anomaly Detection Framework based on Appearance-Motion Semantics
Representation Consistency [18.06814233420315]
We propose a framework that uses normal data's appearance and motion semantic representation consistency to handle anomaly detection.
We design a two-stream encoder to encode the appearance and motion information representations of normal samples.
The lower consistency between the appearance and motion features of anomalous samples leads to predicted frames with larger reconstruction errors.
arXiv Detail & Related papers (2022-04-08T15:59:57Z) - Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.