Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video
Anomaly Detection
- URL: http://arxiv.org/abs/2307.07205v3
- Date: Mon, 28 Aug 2023 10:41:07 GMT
- Title: Multimodal Motion Conditioned Diffusion Model for Skeleton-based Video
Anomaly Detection
- Authors: Alessandro Flaborea, Luca Collorone, Guido D'Amely, Stefano D'Arrigo,
Bardh Prenkaj, Fabio Galasso
- Abstract summary: We propose a novel generative model for video anomaly detection (VAD).
We consider skeletal representations and leverage state-of-the-art diffusion probabilistic models to generate multimodal future human poses.
We validate our model on 4 established benchmarks.
- Score: 46.8584162860564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomalies are rare and anomaly detection is often therefore framed as
One-Class Classification (OCC), i.e. trained solely on normalcy. Leading OCC
techniques constrain the latent representations of normal motions to limited
volumes and detect as abnormal anything outside, which accounts satisfactorily
for the openset'ness of anomalies. But normalcy shares the same openset'ness
property since humans can perform the same action in several ways, which the
leading techniques neglect. We propose a novel generative model for video
anomaly detection (VAD), which assumes that both normality and abnormality are
multimodal. We consider skeletal representations and leverage state-of-the-art
diffusion probabilistic models to generate multimodal future human poses. We
contribute a novel conditioning on the past motion of people and exploit the
improved mode coverage capabilities of diffusion processes to generate
different-but-plausible future motions. Upon the statistical aggregation of
future modes, an anomaly is detected when the generated set of motions is not
pertinent to the actual future. We validate our model on 4 established
benchmarks: UBnormal, HR-UBnormal, HR-STC, and HR-Avenue, with extensive
experiments surpassing state-of-the-art results.
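The detection rule described in the abstract (generate several plausible futures, aggregate them statistically, and flag an anomaly when none of them fits the actual future) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `sequence_error` and `anomaly_score`, the flattened-coordinate pose format, the mean squared error, and the choice of `min`, `mean`, or `median` as the aggregation statistic are all assumptions made for illustration, and the hard-coded `generated_futures` list merely stands in for samples drawn from a diffusion model.

```python
from statistics import mean, median


def sequence_error(generated, actual):
    """Mean squared error between two flattened pose sequences (assumed format)."""
    if len(generated) != len(actual):
        raise ValueError("sequences must have the same length")
    return mean((g - a) ** 2 for g, a in zip(generated, actual))


def anomaly_score(generated_futures, actual_future, aggregate="min"):
    """Aggregate the per-sample errors of K generated futures into one score.

    A low score means at least one (min) or most (mean/median) generated
    futures resemble the actual future; a high score suggests an anomaly.
    """
    reducers = {"min": min, "mean": mean, "median": median}
    errors = [sequence_error(g, actual_future) for g in generated_futures]
    return reducers[aggregate](errors)


# Hypothetical stand-in for K = 3 futures sampled from a generative model.
generated_futures = [
    [0.0, 0.1, 0.2, 0.3],
    [0.0, 0.2, 0.4, 0.6],
    [0.0, -0.1, -0.2, -0.3],
]
normal_future = [0.0, 0.2, 0.4, 0.6]    # coincides with one generated mode
abnormal_future = [0.0, 1.0, 2.0, 3.0]  # far from every generated mode

normal_score = anomaly_score(generated_futures, normal_future)
abnormal_score = anomaly_score(generated_futures, abnormal_future)
print(normal_score, abnormal_score)
```

With the `min` aggregation, a future that matches any one of the generated modes scores low, which reflects the idea that normal motion can unfold in several different ways. In practice a threshold on the score would have to be chosen on validation data.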
Related papers
- Ensembled Cold-Diffusion Restorations for Unsupervised Anomaly Detection [7.94529540044472]
Unsupervised Anomaly Detection (UAD) methods aim to identify anomalies in test samples by comparing them with a normative distribution learned from a dataset known to be anomaly-free.
Approaches based on generative models offer interpretability by generating anomaly-free versions of test images, but are typically unable to identify subtle anomalies.
We present a novel method that combines the strengths of both strategies: a generative cold-diffusion pipeline that is trained with the objective of turning synthetically-corrupted images back to their normal, original appearance.
arXiv Detail & Related papers (2024-07-09T08:02:46Z)
- GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added.
From the global perspective, the difficulty of reconstructing images with different anomalies is uneven.
We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z)
- AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model [59.08735812631131]
Anomaly inspection plays an important role in industrial manufacture.
Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.
We propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model.
arXiv Detail & Related papers (2023-12-10T05:13:40Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with other existing state-of-the-art PAs generation and reconstruction based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
- Open-Set Multivariate Time-Series Anomaly Detection [7.127829790714167]
Time-series anomaly detection methods assume that only normal samples are available during the training phase.
Supervised methods can be utilized to classify normal and seen anomalies, but they tend to overfit to the seen anomalies during training.
We propose the first algorithm to address the open-set TSAD problem, called Multivariate Open-Set Time-Series Anomaly Detector (MOSAD).
MOSAD is a novel multi-head TSAD framework with a shared representation space and specialized heads, including the Generative head, the Discriminative head, and the Anomaly-Aware Contrastive head.
arXiv Detail & Related papers (2023-10-18T19:55:11Z)
- Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
- Unsupervised Video Anomaly Detection via Normalizing Flows with Implicit Latent Features [8.407188666535506]
Most existing methods use an autoencoder to learn to reconstruct normal videos.
We propose an implicit two-path AE (ITAE), a structure in which two encoders implicitly model appearance and motion features.
For the complex distribution of normal scenes, we suggest normal density estimation of ITAE features.
Normalizing flow (NF) models intensify ITAE performance by learning normality through implicitly learned features.
arXiv Detail & Related papers (2020-10-15T05:02:02Z)