Anomaly Detection in Video via Self-Supervised and Multi-Task Learning
- URL: http://arxiv.org/abs/2011.07491v3
- Date: Fri, 10 Sep 2021 18:05:19 GMT
- Title: Anomaly Detection in Video via Self-Supervised and Multi-Task Learning
- Authors: Mariana-Iuliana Georgescu, Antonio Barbalau, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah
- Abstract summary: Anomaly detection in video is a challenging computer vision problem.
In this paper, we approach anomalous event detection in video through self-supervised and multi-task learning at the object level.
- Score: 113.81927544121625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection in video is a challenging computer vision problem. Due to
the lack of anomalous events at training time, anomaly detection requires the
design of learning methods without full supervision. In this paper, we approach
anomalous event detection in video through self-supervised and multi-task
learning at the object level. We first utilize a pre-trained detector to detect
objects. Then, we train a 3D convolutional neural network to produce
discriminative anomaly-specific information by jointly learning multiple proxy
tasks: three self-supervised and one based on knowledge distillation. The
self-supervised tasks are: (i) discrimination of forward/backward moving
objects (arrow of time), (ii) discrimination of objects in
consecutive/intermittent frames (motion irregularity) and (iii) reconstruction
of object-specific appearance information. The knowledge distillation task
takes into account both classification and detection information, generating
large prediction discrepancies between teacher and student models when
anomalies occur. To the best of our knowledge, we are the first to approach
anomalous event detection in video as a multi-task learning problem,
integrating multiple self-supervised and knowledge distillation proxy tasks in
a single architecture. Our lightweight architecture outperforms the
state-of-the-art methods on three benchmarks: Avenue, ShanghaiTech and UCSD
Ped2. Additionally, we perform an ablation study demonstrating the importance
of integrating self-supervised learning and normality-specific distillation in
a multi-task learning setting.
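
The first two self-supervised tasks in the abstract amount to building labelled clips from object tracks. The following is a minimal sketch of that sample construction, not the authors' implementation. The function names, the clip length, the `(N, H, W, C)` array layout and the gap range `2..max_gap` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def arrow_of_time_sample(clip, backward):
    """Return (clip, label): label 1 if the frame order was reversed, else 0."""
    return (clip[::-1], 1) if backward else (clip, 0)

def motion_irregularity_sample(track, center, half, intermittent, rng, max_gap=4):
    """Cut 2*half+1 object crops around `center` from `track` (N, H, W, C).

    Consecutive clips (label 0) use adjacent frames. Intermittent clips
    (label 1) step outward from the centre frame with random gaps drawn
    from 2..max_gap (an assumed range), so some frames are skipped.
    """
    if intermittent:
        left = center - np.cumsum(rng.integers(2, max_gap + 1, size=half))
        right = center + np.cumsum(rng.integers(2, max_gap + 1, size=half))
        idx = np.concatenate([left[::-1], [center], right])
    else:
        idx = np.arange(center - half, center + half + 1)
    if idx[0] < 0 or idx[-1] >= len(track):
        raise ValueError("track too short for the requested clip")
    return track[idx], int(intermittent)

# Toy track: 40 frames of 8x8 RGB crops, where frame i is filled with the value i.
rng = np.random.default_rng(0)
track = np.broadcast_to(np.arange(40).reshape(40, 1, 1, 1), (40, 8, 8, 3))
clip, irregular = motion_irregularity_sample(track, 20, 3, True, rng)
clip, reversed_order = arrow_of_time_sample(clip, backward=True)
```

In training, the `backward` and `intermittent` flags would be drawn at random per sample, and the resulting labels would supervise two binary classification heads on a shared 3D convolutional backbone.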
Related papers
- Video Anomaly Detection using GAN [0.0]
This thesis aims to offer a solution for this use case, so that human operators are not required to watch surveillance recordings for unusual activity.
We have developed a novel generative adversarial network (GAN) based anomaly detection model.
arXiv Detail & Related papers (2023-11-23T16:41:30Z)
- Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z)
- Prior Knowledge Guided Network for Video Anomaly Detection [1.389970629097429]
Video Anomaly Detection (VAD) involves detecting anomalous events in videos.
We propose a Prior Knowledge Guided Network (PKG-Net) for the VAD task.
arXiv Detail & Related papers (2023-09-04T15:57:07Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- Multi-Task Learning of Object State Changes from Uncurated Videos [55.60442251060871]
We learn to temporally localize object state changes by observing people interacting with objects in long uncurated web videos.
We show that our multi-task model achieves a relative improvement of 40% over the prior single-task methods.
We also test our method on long egocentric videos of the EPIC-KITCHENS and the Ego4D datasets in a zero-shot setup.
arXiv Detail & Related papers (2022-11-24T09:42:46Z)
- Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles [67.39567701983357]
Video Anomaly Detection (VAD) is an important topic in computer vision.
Motivated by the recent advances in self-supervised learning, this paper addresses VAD by solving an intuitive yet challenging pretext task.
Our method outperforms state-of-the-art counterparts on three public benchmarks.
arXiv Detail & Related papers (2022-07-20T19:49:32Z)
- SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules.
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z)
- Anomaly Recognition from surveillance videos using 3D Convolutional Neural Networks [0.0]
Anomalous activity recognition deals with identifying the patterns and events that vary from the normal stream.
This study provides a simple, yet effective approach for learning features using deep 3-dimensional convolutional networks (3D ConvNets) trained on the University of Central Florida (UCF) Crime video dataset.
arXiv Detail & Related papers (2021-01-04T16:32:48Z)
- 3D ResNet with Ranking Loss Function for Abnormal Activity Detection in Videos [6.692686655277163]
This study is motivated by recent state-of-the-art work on abnormal activity detection.
In the absence of temporal annotations, such a model is prone to raising false alarms when detecting abnormalities.
In this paper, we focus on the task of minimizing the false alarm rate while performing an abnormal activity detection task.
arXiv Detail & Related papers (2020-02-04T05:32:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.