Violence Detection in Videos
- URL: http://arxiv.org/abs/2109.08941v1
- Date: Sat, 18 Sep 2021 14:33:40 GMT
- Title: Violence Detection in Videos
- Authors: Praveen Tirupattur, Christian Schulze, Andreas Dengel
- Abstract summary: A novel attempt is made to detect the category of violence present in a video.
A system which can automatically detect violence from both Hollywood movies and videos from the web is extremely useful.
- Score: 7.529847987644438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, there has been a tremendous increase in the amount of
video content uploaded to social networking and video sharing websites like
Facebook and YouTube. As a result, the risk of children being exposed to adult
and violent content on the web has also increased. To address this issue, an
approach to automatically detect violent content in videos is proposed in this
work. Here, a novel attempt is also made to detect the category of violence
present in a video. A system that can automatically detect violence in both
Hollywood movies and videos from the web is extremely useful not only for
parental control but also for applications such as movie ratings, video
surveillance, and genre classification.
Here, both audio and visual features are used to detect violence. MFCC
features are used as audio cues, while Blood, Motion, and SentiBank features
are used as visual cues. Binary SVM classifiers are trained on each of these
features to detect violence. Late fusion using a weighted sum of
classification scores is performed to obtain a final classification score for
each of the violence classes targeted by the system. To determine the optimal
weights for each violence class, an approach based on grid search is employed.
Publicly available datasets, mainly the Violent Scene Detection (VSD) dataset,
are used for classifier training, weight calculation, and testing. The
performance of the system is evaluated on two classification tasks,
multi-class classification and binary classification. The results obtained for
binary classification are better than the baseline results from MediaEval-2014.
Related papers
- JOSENet: A Joint Stream Embedding Network for Violence Detection in Surveillance Videos [4.94659999696881]
We introduce JOSENet, a novel self-supervised framework for violence detection in surveillance videos.
JOSENet receives two spatiotemporal video streams, i.e., RGB frames and optical flows, and involves a new regularized self-supervised learning approach for videos.
It provides improved performance compared to self-supervised state-of-the-art methods, while requiring one-fourth of the number of frames per video segment and a reduced frame rate.
arXiv Detail & Related papers (2024-05-05T15:01:00Z) - Text-to-feature diffusion for audio-visual few-shot learning [59.45164042078649]
Few-shot learning from video data is a challenging and underexplored yet much cheaper setup.
We introduce a unified audio-visual few-shot video classification benchmark on three datasets.
We show that AV-DIFF obtains state-of-the-art performance on our proposed benchmark for audio-visual few-shot learning.
arXiv Detail & Related papers (2023-09-07T17:30:36Z) - Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes in the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Temporal Saliency Query Network for Efficient Video Recognition [82.52760040577864]
Video recognition is a hot-spot research topic with the explosive growth of multimedia data on the Internet and mobile devices.
Most existing methods select the salient frames without awareness of the class-specific saliency scores.
We propose a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement.
arXiv Detail & Related papers (2022-07-21T09:23:34Z) - Detecting Violence in Video Based on Deep Features Fusion Technique [0.30458514384586394]
This work proposes a novel method to detect violence using a fusion technique of two convolutional neural networks (CNNs).
The performance of the proposed method is evaluated using three standard benchmark datasets in terms of detection accuracy.
arXiv Detail & Related papers (2022-04-15T12:51:20Z) - Adversarial Attacks on Deep Learning-based Video Compression and
Classification Systems [23.305818640220554]
We conduct the first systematic study for adversarial attacks on deep learning based video compression and downstream classification systems.
We propose an adaptive adversarial attack that can manipulate the Rate-Distortion relationship of a video compression model to achieve two adversarial goals.
We also devise novel objectives for targeted and untargeted attacks to a downstream video classification service.
arXiv Detail & Related papers (2022-03-18T22:42:20Z) - Multilevel profiling of situation and dialogue-based deep networks for
movie genre classification using movie trailers [7.904790547594697]
We propose a novel multi-modal movie genre classification framework based on situation, dialogue, and metadata.
We develop the English movie trailer dataset (EMTD), which contains 2000 Hollywood movie trailers belonging to five popular genres.
arXiv Detail & Related papers (2021-09-14T07:33:56Z) - Cross-category Video Highlight Detection via Set-based Learning [55.49267044910344]
We propose a Dual-Learner-based Video Highlight Detection (DL-VHD) framework.
It learns the distinction of target-category videos and the characteristics of highlight moments on the source video category.
It outperforms five typical Unsupervised Domain Adaptation (UDA) algorithms on various cross-category highlight detection tasks.
arXiv Detail & Related papers (2021-08-26T13:06:47Z) - VideoMix: Rethinking Data Augmentation for Video Classification [29.923635550986997]
State-of-the-art video action classifiers often suffer from overfitting.
Recent data augmentation strategies have been reported to address the overfitting problems.
VideoMix lets a model learn beyond the object and scene biases and extract more robust cues for action recognition.
arXiv Detail & Related papers (2020-12-07T05:40:33Z) - A Unified Framework for Shot Type Classification Based on Subject
Centric Lens [89.26211834443558]
We propose a learning framework for shot type recognition using a Subject Guidance Network (SGNet).
SGNet separates the subject and background of a shot into two streams, serving as separate guidance maps for scale and movement type classification respectively.
We build a large-scale dataset MovieShots, which contains 46K shots from 7K movie trailers with annotations of their scale and movement types.
arXiv Detail & Related papers (2020-08-08T15:49:40Z) - Generalized Few-Shot Video Classification with Video Retrieval and
Feature Generation [132.82884193921535]
We argue that previous methods underestimate the importance of video feature learning and propose a two-stage approach.
We show that this simple baseline approach outperforms prior few-shot video classification methods by over 20 points on existing benchmarks.
We present two novel approaches that yield further improvement.
arXiv Detail & Related papers (2020-07-09T13:05:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.