3D ResNet with Ranking Loss Function for Abnormal Activity Detection in
Videos
- URL: http://arxiv.org/abs/2002.01132v1
- Date: Tue, 4 Feb 2020 05:32:21 GMT
- Title: 3D ResNet with Ranking Loss Function for Abnormal Activity Detection in
Videos
- Authors: Shikha Dubey, Abhijeet Boragule, Moongu Jeon
- Abstract summary: This study is motivated by recent state-of-the-art work on abnormal activity detection.
In the absence of temporal annotations, such a model is prone to raising false alarms while detecting abnormalities.
In this paper, we focus on the task of minimizing the false alarm rate while performing an abnormal activity detection task.
- Score: 6.692686655277163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Abnormal activity detection is one of the most challenging tasks in the field
of computer vision. This study is motivated by recent state-of-the-art work on
abnormal activity detection, which utilizes both abnormal and normal videos to
learn abnormalities via multiple instance learning, with the training data
annotated only at the video level. In the absence of temporal annotations, such
a model is prone to raising false alarms while detecting abnormalities. For this
reason, in this paper we focus on minimizing the false alarm rate while
performing abnormal activity detection. The mitigation of these false alarms,
together with recent advances of 3D deep neural networks in video action
recognition, motivates us to exploit a 3D ResNet in our proposed method, which
extracts spatio-temporal features from the videos. Using these features and deep
multiple instance learning along with the proposed ranking loss, our model
learns to predict an abnormality score at the video-segment level. Our proposed
method, 3D deep Multiple Instance Learning with ResNet (MILR), together with the
newly proposed ranking loss function, achieves the best performance on the
UCF-Crime benchmark dataset compared to other state-of-the-art methods.
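The abstract does not spell out the proposed ranking loss, but the deep-MIL setup it builds on treats each video as a bag of segments and ranks the top-scoring segment of an abnormal video above that of a normal one. A minimal sketch of that classic MIL ranking objective (in the style of the UCF-Crime baseline, with the usual smoothness and sparsity terms) might look like the following; the segment scores, margin of 1, and regularization weights are illustrative assumptions, not the paper's exact formulation:

```python
def mil_ranking_loss(abnormal_scores, normal_scores,
                     lambda_smooth=8e-5, lambda_sparse=8e-5):
    """Hinge-based MIL ranking loss over per-segment abnormality scores.

    The top-scoring segment of the abnormal (positive) bag should outrank
    the top-scoring segment of the normal (negative) bag by a margin of 1.
    Smoothness and sparsity terms regularize the abnormal-bag scores.
    """
    # Ranking hinge on the max-scoring segment of each bag.
    hinge = max(0.0, 1.0 - max(abnormal_scores) + max(normal_scores))
    # Temporal smoothness: adjacent segments should score similarly.
    smooth = lambda_smooth * sum(
        (b - a) ** 2 for a, b in zip(abnormal_scores, abnormal_scores[1:]))
    # Sparsity: only a few segments of an abnormal video are truly abnormal.
    sparse = lambda_sparse * sum(abnormal_scores)
    return hinge + smooth + sparse

# Example: per-segment scores for an abnormal and a normal video.
loss = mil_ranking_loss([0.1, 0.9, 0.2], [0.1, 0.05, 0.2])
```

The paper's contribution is a modified ranking loss aimed at reducing false alarms; its exact form is given in the full text.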
Related papers
- Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts [57.01985221057047]
This paper introduces a novel method that learns temporal prompt embeddings for weakly supervised video anomaly detection and localization (WSVADL) based on pre-trained vision-language models (VLMs).
Our method achieves state-of-the-art performance on three public benchmarks for the WSVADL task.
arXiv Detail & Related papers (2024-08-12T03:31:29Z) - Detection of Object Throwing Behavior in Surveillance Videos [8.841708075914353]
This paper proposes a solution for throwing action detection in surveillance videos using deep learning.
To address the use-case of our Smart City project, we first generate the novel public 'Throwing Action' dataset.
We compare the performance of different feature extractors for our anomaly detection method on the UCF-Crime and Throwing-Action datasets.
arXiv Detail & Related papers (2024-03-11T09:53:19Z) - Dynamic Erasing Network Based on Multi-Scale Temporal Features for
Weakly Supervised Video Anomaly Detection [103.92970668001277]
We propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection.
We first propose a multi-scale temporal modeling module, capable of extracting features from segments of varying lengths.
Then, we design a dynamic erasing strategy, which dynamically assesses the completeness of the detected anomalies.
arXiv Detail & Related papers (2023-12-04T09:40:11Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Anomaly Crossing: A New Method for Video Anomaly Detection as
Cross-domain Few-shot Learning [32.0713939637202]
Video anomaly detection aims to identify abnormal events that occurred in videos.
Most previous approaches learn only from normal videos using unsupervised or semi-supervised methods.
We propose a new learning paradigm by making full use of both normal and abnormal videos for video anomaly detection.
arXiv Detail & Related papers (2021-12-12T20:49:38Z) - Anomaly Recognition from surveillance videos using 3D Convolutional
Neural Networks [0.0]
Anomalous activity recognition deals with identifying the patterns and events that vary from the normal stream.
This study provides a simple, yet effective approach for learning features using deep 3-dimensional convolutional networks (3D ConvNets) trained on the University of Central Florida (UCF) Crime video dataset.
arXiv Detail & Related papers (2021-01-04T16:32:48Z) - Anomaly Detection in Video via Self-Supervised and Multi-Task Learning [113.81927544121625]
Anomaly detection in video is a challenging computer vision problem.
In this paper, we approach anomalous event detection in video through self-supervised and multi-task learning at the object level.
arXiv Detail & Related papers (2020-11-15T10:21:28Z) - Self-trained Deep Ordinal Regression for End-to-End Video Anomaly
Detection [114.9714355807607]
We show that applying self-trained deep ordinal regression to video anomaly detection overcomes two key limitations of existing methods.
We devise an end-to-end trainable video anomaly detection approach that enables joint representation learning and anomaly scoring without manually labeled normal/abnormal data.
arXiv Detail & Related papers (2020-03-15T08:44:55Z) - ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.