Discrete neural representations for explainable anomaly detection
- URL: http://arxiv.org/abs/2112.05585v1
- Date: Fri, 10 Dec 2021 14:56:58 GMT
- Title: Discrete neural representations for explainable anomaly detection
- Authors: Stanislaw Szymanowicz, James Charles, Roberto Cipolla
- Abstract summary: We show how to robustly detect anomalies without the use of object or action classifiers.
We also show how to improve the quality of saliency maps using a novel neural architecture for learning discrete representations of video.
- Score: 25.929134751869032
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The aim of this work is to detect and automatically generate high-level
explanations of anomalous events in video. Understanding the cause of an
anomalous event is crucial as the required response is dependent on its nature
and severity. Recent works typically use object or action classifiers to detect
and provide labels for anomalous events. However, this constrains detection
systems to a finite set of known classes and prevents generalisation to unknown
objects or behaviours. Here we show how to robustly detect anomalies without
the use of object or action classifiers yet still recover the high-level reason
behind the event. We make the following contributions: (1) a method using
saliency maps to decouple the explanation of anomalous events from object and
action classifiers, (2) an improvement in the quality of saliency maps using
a novel neural architecture that learns discrete representations of video by
predicting future frames, and (3) a 60% improvement over state-of-the-art anomaly
explanation methods on a subset of the public benchmark X-MAN dataset.
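As a rough illustration of what a "discrete representation" can mean here, the sketch below maps feature vectors to the index of their nearest codebook entry and flags the cells where the codes predicted for a future frame disagree with the codes actually observed. This is not the authors' architecture: the abstract does not say how the discrete codes are learned or how the saliency maps are computed, so the nearest-neighbour lookup, the mismatch map, and all names (`quantize`, `mismatch_map`, the toy codebook and features) are assumptions made purely for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def quantize(vectors: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Return, for each vector, the index of its nearest codebook entry.

    Nearest is measured by squared Euclidean distance. Hypothetical helper,
    not taken from the paper.
    """
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def mismatch_map(predicted: Sequence[int], observed: Sequence[int]) -> List[int]:
    """Mark with 1 every cell whose predicted code differs from the observed code.

    A crude stand-in for a saliency map, assumed for illustration only.
    """
    return [int(p != o) for p, o in zip(predicted, observed)]


# Toy codebook with three 2-D entries (made-up values).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Made-up features for three spatial cells: those predicted for the next
# frame and those extracted from the frame that actually arrived.
predicted_features = [(0.0, 0.1), (0.2, 0.1), (0.0, 1.0)]
observed_features = [(0.1, 0.0), (0.9, 1.1), (0.1, 0.9)]

predicted_codes = quantize(predicted_features, codebook)
observed_codes = quantize(observed_features, codebook)
flags = mismatch_map(predicted_codes, observed_codes)

print(predicted_codes, observed_codes, flags)
```

In this toy setup only the middle cell is flagged, because the code predicted for it differs from the code observed; in a real system the flagged regions would be where an explanation is sought.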
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis [15.748043194987075]
This work aims to bridge the gap by leveraging an open-world object detector and an OoD detector via virtual outlier synthesis.
Our approach empowers our overall object detector architecture to learn anomaly-aware feature representations without relying on class labels.
Our method establishes state-of-the-art performance on object-level anomaly detection, achieving an average recall score improvement of over 5.4% for natural images.
arXiv Detail & Related papers (2024-07-22T16:16:38Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Spatio-temporal predictive tasks for abnormal event detection in videos [60.02503434201552]
We propose new constrained pretext tasks to learn object level normality patterns.
Our approach consists in learning a mapping between down-scaled visual queries and their corresponding normal appearance and motion characteristics.
Experiments on several benchmark datasets demonstrate the effectiveness of our approach to localize and track anomalies.
arXiv Detail & Related papers (2022-10-27T19:45:12Z) - Context Recovery and Knowledge Retrieval: A Novel Two-Stream Framework for Video Anomaly Detection [48.05512963355003]
We propose a two-stream framework based on context recovery and knowledge retrieval.
For the context recovery stream, we propose a U-Net which can fully utilize the motion information to predict the future frame.
For the knowledge retrieval stream, we propose an improved learnable locality-sensitive hashing.
The knowledge about normality is encoded and stored in hash tables, and the distance between the testing event and the knowledge representation is used to reveal the probability of anomaly.
arXiv Detail & Related papers (2022-09-07T03:12:02Z) - Object-centric and memory-guided normality reconstruction for video anomaly detection [56.64792194894702]
This paper addresses the anomaly detection problem for video surveillance.
Due to the inherent rarity and heterogeneity of abnormal events, the problem is approached as one of normality modeling.
Our model learns object-centric normal patterns without seeing anomalous samples during training.
arXiv Detail & Related papers (2022-03-07T19:28:39Z) - X-MAN: Explaining multiple sources of anomalies in video [25.929134751869032]
We show how to build interpretable feature representations suitable for detecting anomalies in video.
We also propose an interpretable probabilistic anomaly detector which can describe the reason behind its response.
Our method competes well with the state of the art on public datasets.
arXiv Detail & Related papers (2021-06-16T15:25:50Z) - A Background-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video [120.18562044084678]
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.