Object Class Aware Video Anomaly Detection through Image Translation
- URL: http://arxiv.org/abs/2205.01706v1
- Date: Tue, 3 May 2022 18:04:27 GMT
- Title: Object Class Aware Video Anomaly Detection through Image Translation
- Authors: Mohammad Baradaran, Robert Bergevin
- Abstract summary: This paper proposes a novel two-stream object-aware VAD method that learns the normal appearance and motion patterns through image translation tasks.
The results show that, as a significant improvement over previous methods, detections by our method are completely explainable and anomalies are localized accurately in the frames.
- Score: 1.2944868613449219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised video anomaly detection (VAD) methods formulate the task of
anomaly detection as detection of deviations from the learned normal patterns.
Previous works in the field (reconstruction or prediction-based methods) suffer
from two drawbacks: 1) They focus on low-level features, and they (especially
holistic approaches) do not effectively consider the object classes. 2)
Object-centric approaches neglect some of the context information (such as
location). To tackle these challenges, this paper proposes a novel two-stream
object-aware VAD method that learns the normal appearance and motion patterns
through image translation tasks. The appearance branch translates the input
image to the target semantic segmentation map produced by Mask-RCNN, and the
motion branch associates each frame with its expected optical flow magnitude.
Any deviation from the expected appearance or motion in the inference stage
shows the degree of potential abnormality. We evaluated our proposed method on
the ShanghaiTech, UCSD-Ped1, and UCSD-Ped2 datasets and the results show
competitive performance compared with state-of-the-art works. Most importantly,
the results show that, as a significant improvement over previous methods,
detections by our method are completely explainable and anomalies are localized
accurately in the frames.
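The inference step described in the abstract (comparing each branch's output with its target and reading the deviation as a degree of abnormality) can be illustrated with a minimal sketch. The abstract does not say how deviations are measured or how the two branches are combined, so the per-pixel absolute difference, the weighted sum, and the peak-based frame score below are all assumptions made for illustration. The function names are hypothetical, and the small hand-written maps stand in for the outputs of the translation networks, Mask-RCNN, and an optical flow estimator.

```python
from typing import List, Tuple

Map = List[List[float]]  # a 2-D map (rows x columns) stored as nested lists


def deviation_map(predicted: Map, target: Map) -> Map:
    """Per-pixel absolute difference (assumed error measure, not from the paper)."""
    return [[abs(p - t) for p, t in zip(p_row, t_row)]
            for p_row, t_row in zip(predicted, target)]


def fuse(appearance_dev: Map, motion_dev: Map, motion_weight: float = 1.0) -> Map:
    """Weighted per-pixel sum of the two deviation maps (assumed fusion rule)."""
    return [[a + motion_weight * m for a, m in zip(a_row, m_row)]
            for a_row, m_row in zip(appearance_dev, motion_dev)]


def score_and_locate(anomaly_map: Map) -> Tuple[float, Tuple[int, int]]:
    """Frame-level score = peak value; location = (row, col) of that peak."""
    best_score, best_pos = anomaly_map[0][0], (0, 0)
    for r, row in enumerate(anomaly_map):
        for c, value in enumerate(row):
            if value > best_score:
                best_score, best_pos = value, (r, c)
    return best_score, best_pos


# Toy 3x3 maps with made-up values.
# Appearance: map predicted by the branch vs. the segmentation map of the frame.
predicted_seg = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
target_seg = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Motion: expected flow magnitude vs. the flow magnitude actually observed.
expected_flow = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
observed_flow = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 2.0]]

anomaly_map = fuse(deviation_map(predicted_seg, target_seg),
                   deviation_map(expected_flow, observed_flow))
score, location = score_and_locate(anomaly_map)
```

In this toy frame both branches disagree with their targets only in the bottom-right pixel, so the fused map peaks there. Because the score comes from a per-pixel map, the same map that yields the frame-level score also shows where the anomaly is, which is the sense in which such detections can be localized and explained.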
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns spatio-temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z)
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates performance comparable to current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Future Video Prediction from a Single Frame for Video Anomaly Detection [0.38073142980732994]
Video anomaly detection (VAD) is an important but challenging task in computer vision.
We introduce future video prediction from a single frame as a novel proxy task for video anomaly detection.
This proxy-task alleviates the challenges of previous methods in learning longer motion patterns.
arXiv Detail & Related papers (2023-08-15T14:04:50Z)
- SOOD: Towards Semi-Supervised Oriented Object Detection [57.05141794402972]
This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the mainstream pseudo-labeling framework.
Our experiments show that when trained with the two proposed losses, SOOD surpasses the state-of-the-art SSOD methods under various settings on the DOTA-v1.5 benchmark.
arXiv Detail & Related papers (2023-04-10T11:10:42Z)
- UN-AVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring [2.578242050187029]
UN-AVOIDS is an unsupervised and nonparametric approach for both visualization (a human process) and detection (an algorithmic process) of outliers.
It transforms data into a new space, which is introduced in this paper as the neighborhood cumulative density function (NCDF).
In terms of AUC, UN-AVOIDS was almost an overall winner.
arXiv Detail & Related papers (2021-11-19T02:31:06Z)
- Occlusion-Robust Object Pose Estimation with Holistic Representation [42.27081423489484]
State-of-the-art (SOTA) object pose estimators take a two-stage approach.
We develop a novel occlude-and-blackout batch augmentation technique.
We also develop a multi-precision supervision architecture to encourage holistic pose representation learning.
arXiv Detail & Related papers (2021-10-22T08:00:26Z)
- Local Anomaly Detection in Videos using Object-Centric Adversarial Learning [12.043574473965318]
We propose a two-stage object-centric adversarial framework that only needs object regions for detecting frame-level local anomalies in videos.
The first stage consists in learning the correspondence between the current appearance and past gradient images of objects in scenes deemed normal, allowing us to either generate the past gradient from current appearance or the reverse.
The second stage extracts the partial reconstruction errors between real and generated images (appearance and past gradient) with normal object behaviour, and trains a discriminator in an adversarial fashion.
arXiv Detail & Related papers (2020-11-13T02:02:37Z)
- Interpolation-based semi-supervised learning for object detection [44.37685664440632]
We propose an Interpolation-based Semi-supervised learning method for object detection.
The proposed losses dramatically improve the performance of semi-supervised learning as well as supervised learning.
arXiv Detail & Related papers (2020-06-03T10:53:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.