Violence detection in videos using deep recurrent and convolutional neural networks
- URL: http://arxiv.org/abs/2409.07581v1
- Date: Wed, 11 Sep 2024 19:21:51 GMT
- Title: Violence detection in videos using deep recurrent and convolutional neural networks
- Authors: Abdarahmane Traoré, Moulay A. Akhloufi
- Abstract summary: We propose a deep learning architecture for violence detection which combines recurrent neural networks (RNNs) and 2-dimensional convolutional neural networks (2D CNNs).
In addition to video frames, we use optical flow computed using the captured sequences.
The proposed approaches match the state-of-the-art techniques and sometimes surpass them.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Research on violence and abnormal behavior detection has seen increased interest in recent years, due mainly to a rise in crime in large cities worldwide. In this work, we propose a deep learning architecture for violence detection which combines recurrent neural networks (RNNs) and 2-dimensional convolutional neural networks (2D CNNs). In addition to video frames, we use optical flow computed from the captured sequences. The CNN extracts spatial characteristics from each frame, while the RNN extracts temporal characteristics. The use of optical flow allows the movements in the scenes to be encoded. The proposed approaches match the state-of-the-art techniques and sometimes surpass them. The method was validated on 3 databases, achieving good results.
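The pipeline the abstract outlines (a 2D CNN for per-frame appearance, an RNN for temporal aggregation, optical flow as an extra motion cue) can be sketched in miniature. Everything below is a hypothetical illustration, not the authors' code: frame differencing stands in for real optical flow, a single linear projection stands in for the CNN, and a plain tanh recurrence stands in for the RNN.

```python
import math

def frame_diff(prev, curr):
    """Crude motion cue: per-pixel absolute difference (stand-in for optical flow)."""
    return [abs(a - b) for a, b in zip(prev, curr)]

def spatial_features(frame, weights):
    """Stand-in for a 2D CNN: a linear projection of the flattened frame."""
    return [sum(w * x for w, x in zip(row, frame)) for row in weights]

def rnn_step(h, x, w_h, w_x):
    """Minimal tanh recurrence aggregating features over time (stand-in for an RNN cell)."""
    return [math.tanh(w_h * hi + w_x * xi) for hi, xi in zip(h, x)]

def classify_sequence(frames, weights, w_h=0.5, w_x=0.5):
    """Fuse appearance and motion features per frame, aggregate over time, pool a score."""
    h = [0.0] * len(weights)
    prev = frames[0]
    for curr in frames[1:]:
        motion = frame_diff(prev, curr)           # motion stream
        feats = spatial_features(curr, weights)   # appearance stream
        fused = [f + m for f, m in zip(feats, spatial_features(motion, weights))]
        h = rnn_step(h, fused, w_h, w_x)
        prev = curr
    return sum(h) / len(h)                        # pooled violence score
```

A static all-zero clip yields a score of exactly 0, while any change between frames pushes the score away from 0, which is the appearance-plus-motion intuition in the abstract.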
Related papers
- 2D bidirectional gated recurrent unit convolutional neural networks for end-to-end violence detection in videos [0.0]
We propose an architecture that combines a Bidirectional Gated Recurrent Unit (BiGRU) and a 2D Convolutional Neural Network (CNN) to detect violence in video sequences.
A CNN is used to extract spatial characteristics from each frame, while the BiGRU extracts temporal and local motion characteristics using CNN extracted features from multiple frames.
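The combination described here (per-frame CNN features fed to a bidirectional recurrence) reduces to a simple pattern: one recurrent pass left-to-right, one right-to-left, and pair the states per time step. A minimal sketch, assuming scalar per-frame features and a plain tanh recurrence in place of a real GRU:

```python
import math

def run_direction(xs, w_h=0.5, w_x=0.5):
    """One recurrent pass (a stand-in for a GRU) over per-frame feature scalars."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w_h * h + w_x * x)
        out.append(h)
    return out

def bidirectional(xs):
    """BiGRU idea: run forward and backward passes, pair the states per time step."""
    fwd = run_direction(xs)
    bwd = list(reversed(run_direction(list(reversed(xs)))))
    return list(zip(fwd, bwd))  # each step sees both past and future context
```

The payoff is that every time step's representation carries context from both directions, which is what lets the BiGRU capture local motion around a frame rather than only its history.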
arXiv Detail & Related papers (2024-09-11T19:36:12Z)
- Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning [51.386945803485084]
We focus on the task where the agent needs to learn multi-dimensional deterministic policies to control.
Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
- Two-stream Multi-dimensional Convolutional Network for Real-time Violence Detection [0.0]
This work presents a novel architecture for violence detection called Two-stream Multi-dimensional Convolutional Network (2s-MDCN).
Our proposed method extracts temporal and spatial information independently by 1D, 2D, and 3D convolutions.
Our models obtained state-of-the-art accuracy of 89.7% on the largest violence detection benchmark dataset.
arXiv Detail & Related papers (2022-11-08T14:04:47Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Detecting Violence in Video Based on Deep Features Fusion Technique [0.30458514384586394]
This work proposed a novel method to detect violence using a fusion technique of two convolutional neural networks (CNNs).
The performance of the proposed method is evaluated using three standard benchmark datasets in terms of detection accuracy.
arXiv Detail & Related papers (2022-04-15T12:51:20Z)
- Visual Attention Network [90.0753726786985]
We propose a novel large kernel attention (LKA) module to enable self-adaptive and long-range correlations in self-attention.
We also introduce a novel neural network based on LKA, namely Visual Attention Network (VAN).
VAN outperforms state-of-the-art vision transformers and convolutional neural networks by a large margin in extensive experiments.
arXiv Detail & Related papers (2022-02-20T06:35:18Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Anomaly Recognition from surveillance videos using 3D Convolutional Neural Networks [0.0]
Anomalous activity recognition deals with identifying the patterns and events that vary from the normal stream.
This study provides a simple, yet effective approach for learning features using deep 3-dimensional convolutional networks (3D ConvNets) trained on the University of Central Florida (UCF) Crime video dataset.
arXiv Detail & Related papers (2021-01-04T16:32:48Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short term-memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
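Weight inflation, as used for the 3D-CNN models above, copies a pre-trained 2D kernel along a new temporal axis and rescales it so that a temporally constant input reproduces the original 2D response. A minimal sketch (hypothetical helper, plain nested lists rather than tensors):

```python
def inflate_kernel(kernel_2d, t):
    """Replicate a 2D kernel t times along a new time axis, each copy divided
    by t, so the summed temporal response on a static input matches the
    original 2D filter's response."""
    return [[[w / t for w in row] for row in kernel_2d] for _ in range(t)]
```

Dividing by t is the detail that makes fine-tuning start from a sensible point: the inflated 3D filter behaves like the pre-trained 2D filter on videos with no motion.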
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Neural Human Video Rendering by Learning Dynamic Textures and Rendering-to-Video Translation [99.64565200170897]
We propose a novel human video synthesis method by explicitly disentangling the learning of time-coherent fine-scale details from the embedding of the human in 2D screen space.
We show several applications of our approach, such as human reenactment and novel view synthesis from monocular video, where we show significant improvement over the state of the art both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-01-14T18:06:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.