Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection
- URL: http://arxiv.org/abs/2412.17210v1
- Date: Mon, 23 Dec 2024 01:31:39 GMT
- Title: Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection
- Authors: Andi Xu, Hongsong Wang, Pinle Ding, Jie Gui
- Abstract summary: Video Anomaly Detection (VAD) is essential for computer vision research.
Existing VAD methods utilize either reconstruction-based or prediction-based frameworks.
We address pose-based video anomaly detection and introduce a novel framework called Dual Conditioned Motion Diffusion.
- Score: 12.100563798908777
- Abstract: Video Anomaly Detection (VAD) is essential for computer vision research. Existing VAD methods utilize either reconstruction-based or prediction-based frameworks. The former excels at detecting irregular patterns or structures, whereas the latter is capable of spotting abnormal deviations or trends. We address pose-based video anomaly detection and introduce a novel framework called Dual Conditioned Motion Diffusion (DCMD), which enjoys the advantages of both approaches. The DCMD integrates conditioned motion and conditioned embedding to comprehensively utilize the pose characteristics and latent semantics of observed movements, respectively. In the reverse diffusion process, a motion transformer is proposed to capture potential correlations from multi-layered characteristics within the spectrum space of human motion. To enhance the discriminability between normal and abnormal instances, we design a novel United Association Discrepancy (UAD) regularization that primarily relies on a Gaussian kernel-based time association and a self-attention-based global association. Finally, a mask completion strategy is introduced during the inference stage of the reverse diffusion process to enhance the utilization of conditioned motion for the prediction branch of anomaly detection. Extensive experiments on four datasets demonstrate that our method dramatically outperforms state-of-the-art methods and exhibits superior generalization performance.
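The abstract names two ingredients of the UAD regularization: a Gaussian kernel-based time association and a self-attention-based global association. The sketch below is one possible reading of how such a pair could be compared. It is not the authors' implementation. The function names, the use of raw features as both queries and keys (no learned projections), and the symmetric KL divergence as the discrepancy measure are all assumptions made for illustration.

```python
import numpy as np

def gaussian_time_association(num_frames, sigma=1.0):
    """Row-normalized Gaussian kernel over frame-index distance (assumed form)."""
    idx = np.arange(num_frames)
    dist = idx[:, None] - idx[None, :]
    weights = np.exp(-0.5 * (dist / sigma) ** 2)
    return weights / weights.sum(axis=1, keepdims=True)

def self_attention_association(features):
    """Softmax of scaled dot products; queries and keys are the raw features
    (a simplification: a real model would use learned projections)."""
    d = features.shape[1]
    scores = features @ features.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def association_discrepancy(prior, attn, eps=1e-12):
    """Per-frame symmetric KL divergence between the two associations
    (hypothetical choice of discrepancy measure)."""
    log_p = np.log(prior + eps)
    log_a = np.log(attn + eps)
    kl_pa = np.sum(prior * (log_p - log_a), axis=1)
    kl_ap = np.sum(attn * (log_a - log_p), axis=1)
    return kl_pa + kl_ap

# Toy input: 8 frames, each with a 16-dimensional pose feature.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
prior = gaussian_time_association(8, sigma=1.5)
attn = self_attention_association(features)
disc = association_discrepancy(prior, attn)  # one value per frame
```

Under this reading, each row of both matrices is a probability distribution over frames, so the discrepancy yields one non-negative value per frame that a regularizer could push up or down during training.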
Related papers
- Appearance Blur-driven AutoEncoder and Motion-guided Memory Module for Video Anomaly Detection [14.315287192621662]
Video anomaly detection (VAD) often learns the distribution of normal samples and detects the anomaly through measuring significant deviations.
Most VAD methods cannot cope with cross-dataset validation for new target domains.
We propose a novel VAD method with a motion-guided memory module to achieve zero-shot cross-dataset validation.
arXiv Detail & Related papers (2024-09-26T07:48:20Z)
- Cross-Modal Learning for Anomaly Detection in Complex Industrial Process: Methodology and Benchmark [19.376814754500625]
Anomaly detection in complex industrial processes plays a pivotal role in ensuring efficient, stable, and secure operation.
This paper proposes a cross-modal Transformer to facilitate anomaly detection by exploring the correlation between visual features (video) and process variables (current) in the context of the fused magnesium smelting process.
We present a pioneering cross-modal benchmark of the fused magnesium smelting process, featuring synchronously acquired video and current data for over 2.2 million samples.
arXiv Detail & Related papers (2024-06-13T11:40:06Z)
- GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added.
From the global perspective, the difficulty of reconstructing images with different anomalies is uneven.
We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Unsupervised Video Anomaly Detection with Diffusion Models Conditioned on Compact Motion Representations [17.816344808780965]
The unsupervised video anomaly detection (VAD) problem involves classifying each frame in a video as normal or abnormal, without any access to labels.
To accomplish this, the proposed method employs conditional diffusion models, where the input data are features extracted from a pre-trained network.
Our method utilizes a data-driven threshold and considers a high reconstruction error as an indicator of anomalous events.
arXiv Detail & Related papers (2023-07-04T07:36:48Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Object-centric and memory-guided normality reconstruction for video anomaly detection [56.64792194894702]
This paper addresses the anomaly detection problem for video surveillance.
Due to the inherent rarity and heterogeneity of abnormal events, the problem is viewed as a normality modeling strategy.
Our model learns object-centric normal patterns without seeing anomalous samples during training.
arXiv Detail & Related papers (2022-03-07T19:28:39Z)
- Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection [43.004613173363566]
A novel Motion Embedder (ME) is proposed to provide a pose motion representation from the probability perspective.
A novel task-specific Spatial-Temporal Transformer (STT) is deployed for self-supervised pose sequence reconstruction.
MoPRL achieves state-of-the-art performance, with an average improvement of 4.7% AUC on several challenging datasets.
arXiv Detail & Related papers (2021-12-07T11:52:25Z)