Analysis of Real-Time Hostile Activitiy Detection from Spatiotemporal
Features Using Time Distributed Deep CNNs, RNNs and Attention-Based
Mechanisms
- URL: http://arxiv.org/abs/2302.11027v1
- Date: Tue, 21 Feb 2023 22:02:39 GMT
- Title: Analysis of Real-Time Hostile Activitiy Detection from Spatiotemporal
Features Using Time Distributed Deep CNNs, RNNs and Attention-Based
Mechanisms
- Authors: Labib Ahmed Siddique, Rabita Junhai, Tanzim Reza, Salman Sayeed Khan,
and Tanvir Rahman
- Abstract summary: Real-time video surveillance, through CCTV camera systems has become essential for ensuring public safety.
Deep learning video classification techniques can help us automate surveillance systems to detect violence as it happens.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time video surveillance, through CCTV camera systems has become
essential for ensuring public safety which is a priority today. Although CCTV
cameras help a lot in increasing security, these systems require constant human
interaction and monitoring. To eradicate this issue, intelligent surveillance
systems can be built using deep learning video classification techniques that
can help us automate surveillance systems to detect violence as it happens. In
this research, we explore deep learning video classification techniques to
detect violence as they are happening. Traditional image classification
techniques fall short when it comes to classifying videos as they attempt to
classify each frame separately for which the predictions start to flicker.
Therefore, many researchers are coming up with video classification techniques
that consider spatiotemporal features while classifying. However, deploying
these deep learning models with methods such as skeleton points obtained
through pose estimation and optical flow obtained through depth sensors, are
not always practical in an IoT environment. Although these techniques ensure a
higher accuracy score, they are computationally heavier. Keeping these
constraints in mind, we experimented with various video classification and
action recognition techniques such as ConvLSTM, LRCN (with both custom CNN
layers and VGG-16 as feature extractor) CNNTransformer and C3D. We achieved a
test accuracy of 80% on ConvLSTM, 83.33% on CNN-BiLSTM, 70% on VGG16-BiLstm
,76.76% on CNN-Transformer and 80% on C3D.
Related papers
- CCTV-Gun: Benchmarking Handgun Detection in CCTV Images [59.24281591714385]
Gun violence is a critical security problem, and it is imperative for the computer vision community to develop effective gun detection algorithms.
detecting guns in real-world CCTV images remains a challenging and under-explored task.
We present a benchmark, called textbfCCTV-Gun, which addresses the challenges of detecting handguns in real-world CCTV images.
arXiv Detail & Related papers (2023-03-19T16:17:35Z) - Detecting train driveshaft damages using accelerometer signals and
Differential Convolutional Neural Networks [67.60224656603823]
This paper proposes the development of a railway axle condition monitoring system based on advanced 2D-Convolutional Neural Network (CNN) architectures.
The resultant system converts the railway axle vibration signals into time-frequency domain representations, i.e., spectrograms, and, thus, trains a two-dimensional CNN to classify them depending on their cracks.
arXiv Detail & Related papers (2022-11-15T15:04:06Z) - Intelligent 3D Network Protocol for Multimedia Data Classification using
Deep Learning [0.0]
We implement Hybrid Deep Learning Architecture that combines STIP and 3D CNN features to enhance the performance of 3D videos effectively.
The results are compared with state-of-the-art frameworks from literature for action recognition on UCF101 with an accuracy of 95%.
arXiv Detail & Related papers (2022-07-23T12:24:52Z) - Real Time Action Recognition from Video Footage [0.5219568203653523]
Video surveillance cameras have added a new dimension to detect crime.
This research focuses on integrating state-of-the-art Deep Learning methods to ensure a robust pipeline for autonomous surveillance for detecting violent activities.
arXiv Detail & Related papers (2021-12-13T07:27:41Z) - Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - Event and Activity Recognition in Video Surveillance for Cyber-Physical
Systems [0.0]
Long-term motion patterns alone play a pivotal role in the task of recognizing an event.
We show that the long-term motion patterns alone play a pivotal role in the task of recognizing an event.
Only the temporal features are exploited using a hybrid Convolutional Neural Network (CNN) + Recurrent Neural Network (RNN) architecture.
arXiv Detail & Related papers (2021-11-03T08:30:38Z) - Spatiotemporal Inconsistency Learning for DeepFake Video Detection [51.747219106855624]
We present a novel temporal modeling paradigm in TIM by exploiting the temporal difference over adjacent frames along with both horizontal and vertical directions.
And the ISM simultaneously utilizes the spatial information from SIM and temporal information from TIM to establish a more comprehensive spatial-temporal representation.
arXiv Detail & Related papers (2021-09-04T13:05:37Z) - Adversarially robust deepfake media detection using fused convolutional
neural network predictions [79.00202519223662]
Current deepfake detection systems struggle against unseen data.
We employ three different deep Convolutional Neural Network (CNN) models to classify fake and real images extracted from videos.
The proposed technique outperforms state-of-the-art models with 96.5% accuracy.
arXiv Detail & Related papers (2021-02-11T11:28:00Z) - Training Strategies and Data Augmentations in CNN-based DeepFake Video
Detection [17.696134665850447]
The accuracy of automated systems for face forgery detection in videos is still quite limited and generally biased toward the dataset used to design and train a specific detection system.
In this paper we analyze how different training strategies and data augmentation techniques affect CNN-based deepfake detectors when training and testing on the same dataset or across different datasets.
arXiv Detail & Related papers (2020-11-16T08:50:56Z) - A Real-time Action Representation with Temporal Encoding and Deep
Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
Our method achieves clear improvements on UCF101 action recognition benchmark against state-of-the-art real-time methods by 5.4% in terms of accuracy and 2 times faster in terms of inference speed with a less than 5MB storage model.
arXiv Detail & Related papers (2020-06-17T06:30:43Z) - An Information-rich Sampling Technique over Spatio-Temporal CNN for
Classification of Human Actions in Videos [5.414308305392762]
We propose a novel scheme for human action recognition in videos, using a 3-dimensional Convolutional Neural Network (3D CNN) based classifier.
In this paper, a 3D CNN architecture is proposed to extract featuresweighted and follows Long Short-Term Memory (LSTM) to recognize human actions.
Experiments are performed with KTH and WEIZMANN human actions datasets, whereby it is shown to produce comparable results with the state-of-the-art techniques.
arXiv Detail & Related papers (2020-02-06T05:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.