Related papers: Two-stream Multi-dimensional Convolutional Network for Real-time Violence Detection

URL: http://arxiv.org/abs/2211.04255v1
Date: Tue, 8 Nov 2022 14:04:47 GMT
Title: Two-stream Multi-dimensional Convolutional Network for Real-time Violence Detection
Authors: Dipon Kumar Ghosh and Amitabha Chakrabarty
Abstract summary: This work presents a novel architecture for violence detection called Two-stream Multi-dimensional Convolutional Network (2s-MDCN). Our proposed method extracts temporal and spatial information independently by 1D, 2D, and 3D convolutions. Our models obtained state-of-the-art accuracy of 89.7% on the largest violence detection benchmark dataset.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The increasing number of surveillance cameras and security concerns have made automatic violent activity detection from surveillance footage an active area for research. Modern deep learning methods have achieved good accuracy in violence detection and proved to be successful because of their applicability in intelligent surveillance systems. However, the models are computationally expensive and large in size because of their inefficient methods for feature extraction. This work presents a novel architecture for violence detection called Two-stream Multi-dimensional Convolutional Network (2s-MDCN), which uses RGB frames and optical flow to detect violence. Our proposed method extracts temporal and spatial information independently by 1D, 2D, and 3D convolutions. Despite combining multi-dimensional convolutional networks, our models are lightweight and efficient due to reduced channel capacity, yet they learn to extract meaningful spatial and temporal information. Additionally, combining RGB frames and optical flow yields 2.2% more accuracy than a single RGB stream. Regardless of having less complexity, our models obtained state-of-the-art accuracy of 89.7% on the largest violence detection benchmark dataset.
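The abstract does not give the layer configuration, so the following is only a rough sketch of why reduced channel capacity can keep a network built from 1D, 2D, and 3D convolutions small. It counts the weights of a dense convolution for hypothetical layer sizes (64 channels, kernel size 3); these numbers and the helper name conv_weight_count are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def conv_weight_count(c_in: int, c_out: int, kernel: Sequence[int]) -> int:
    """Number of weights (bias excluded) in a dense convolution layer."""
    count = c_in * c_out
    for k in kernel:
        count *= k
    return count


# Hypothetical layer sizes, chosen only for illustration (not from the paper).
c_in, c_out = 64, 64

# One 3D convolution over (time, height, width).
full_3d = conv_weight_count(c_in, c_out, (3, 3, 3))

# A 2D spatial convolution followed by a 1D temporal convolution.
spatial_2d = conv_weight_count(c_in, c_out, (3, 3))
temporal_1d = conv_weight_count(c_out, c_out, (3,))
factorized = spatial_2d + temporal_1d

# The same 3D convolution with half as many input and output channels.
half_width_3d = conv_weight_count(c_in // 2, c_out // 2, (3, 3, 3))

print("3D conv weights:           ", full_3d)
print("2D + 1D conv weights:      ", factorized)
print("3D conv, half the channels:", half_width_3d)
```

Because the weight count is proportional to the product of input and output channels, halving both cuts the weights of a layer to a quarter, whatever the kernel dimensionality.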

Related papers

Intelligent Image Sensing for Crime Analysis: A ML Approach towards Enhanced Violence Detection and Investigation [1.8219466405383231]
This paper introduces a comprehensive framework for violence detection and classification, employing Supervised Learning for both binary and multi-class violence classification. Training is conducted on diverse customized datasets with frame-level annotations, incorporating videos from surveillance cameras, human recordings, hockey fight, sohas and wvd dataset across various platforms.
arXiv Detail & Related papers (2025-06-16T18:39:16Z)
2D bidirectional gated recurrent unit convolutional Neural networks for end-to-end violence detection In videos [0.0]
We propose an architecture that combines a Bidirectional Gated Recurrent Unit (BiGRU) and a 2D Convolutional Neural Network (CNN) to detect violence in video sequences. A CNN is used to extract spatial characteristics from each frame, while the BiGRU extracts temporal and local motion characteristics using CNN extracted features from multiple frames.
arXiv Detail & Related papers (2024-09-11T19:36:12Z)
Violence detection in videos using deep recurrent and convolutional neural networks [0.0]
We propose a deep learning architecture for violence detection which combines both recurrent neural networks (RNNs) and 2-dimensional convolutional neural networks (2D CNN). In addition to video frames, we use optical flow computed using the captured sequences. The proposed approaches reach the same level as the state-of-the-art techniques and sometimes surpass them.
arXiv Detail & Related papers (2024-09-11T19:21:51Z)
2D-Malafide: Adversarial Attacks Against Face Deepfake Detection Systems [8.717726409183175]
We introduce 2D-Malafide, a novel and lightweight adversarial attack designed to deceive face deepfake detection systems. Unlike traditional additive noise approaches, 2D-Malafide optimises a small number of filter coefficients to generate robust adversarial perturbations. Experiments, conducted using the FaceForensics++ dataset, demonstrate that 2D-Malafide substantially degrades detection performance in both white-box and black-box settings.
arXiv Detail & Related papers (2024-08-26T09:41:40Z)
Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure. First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module. The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with other 17 state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection. We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
Real Time Action Recognition from Video Footage [0.5219568203653523]
Video surveillance cameras have added a new dimension to detect crime. This research focuses on integrating state-of-the-art Deep Learning methods to ensure a robust pipeline for autonomous surveillance for detecting violent activities.
arXiv Detail & Related papers (2021-12-13T07:27:41Z)
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information. In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection. We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM [0.0]
We propose an efficient two-stream deep learning architecture leveraging Separable Convolutional LSTM (SepConvLSTM) and pre-trained MobileNet. SepConvLSTM is constructed by replacing the convolution operation at each gate of ConvLSTM with a depthwise separable convolution. Our model improves accuracy on the larger and more challenging RWF-2000 dataset by a margin of more than 2%.
arXiv Detail & Related papers (2021-02-21T12:01:48Z)
Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
Temporal Distinct Representation Learning for Action Recognition [139.93983070642412]
Two-Dimensional Convolutional Neural Network (2D CNN) is used to characterize videos. Different frames of a video share the same 2D CNN kernels, which may result in repeated and redundant information utilization. We propose a sequential channel filtering mechanism to excite the discriminative channels of features from different frames step by step, and thus avoid repeated information extraction. Our method is evaluated on benchmark temporal reasoning datasets Something-Something V1 and V2, and it achieves visible improvements over the best competitor by 2.4% and 1.3%, respectively.
arXiv Detail & Related papers (2020-07-15T11:30:40Z)
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection. The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.