Fight Detection from Still Images in the Wild
- URL: http://arxiv.org/abs/2111.08370v2
- Date: Wed, 17 Nov 2021 09:49:06 GMT
- Title: Fight Detection from Still Images in the Wild
- Authors: Şeymanur Aktı, Ferda Ofli, Muhammad Imran, Hazım Kemal Ekenel
- Abstract summary: We propose a new dataset, named Social Media Fight Images (SMFI), comprising real-world images of fight actions.
Tests indicate that, as in other computer vision problems, a dataset bias exists for the fight recognition problem.
- Score: 13.95888515102339
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Detecting fights from still images shared on social media is an important
task required to limit the distribution of violent scenes in order to prevent
their negative effects. For this reason, in this study, we address the problem
of fight detection from still images collected from the web and social media.
We explore how well one can detect fights from just a single still image. We
also propose a new dataset, named Social Media Fight Images (SMFI), comprising
real-world images of fight actions. Results of extensive experiments on the
proposed dataset show that fight actions can be recognized successfully from
still images. That is, even without exploiting the temporal information, it is
possible to detect fights with high accuracy by utilizing appearance only. We
also perform cross-dataset experiments to evaluate the representation capacity
of the collected dataset. These experiments indicate that, as in the other
computer vision problems, there exists a dataset bias for the fight recognition
problem. Although the methods achieve close to 100% accuracy when trained and
tested on the same fight dataset, the cross-dataset accuracies are
significantly lower, i.e., around 70% when more representative datasets are
used for training. SMFI dataset is found to be one of the two most
representative datasets among the utilized five fight datasets.
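Although the exact models and training details are in the full text, the task described above reduces to binary image classification with transfer learning. The sketch below is a minimal illustration of that setup, not the authors' pipeline: the ResNet-50 backbone, hyperparameters, and the smfi/train and smfi/test folder layout are all assumptions. Pointing the test loader at a different fight dataset reproduces the kind of cross-dataset evaluation the abstract describes.

```python
# Minimal sketch: binary fight/no-fight classification from still images by
# fine-tuning an ImageNet-pretrained CNN. Model choice, transforms, and the
# directory layout (<root>/fight, <root>/nofight) are assumptions, not the
# authors' exact setup.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standard ImageNet preprocessing.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: smfi/train/fight/*.jpg, smfi/train/nofight/*.jpg
train_set = datasets.ImageFolder("smfi/train", transform=preprocess)
test_set = datasets.ImageFolder("smfi/test", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32)

# Replace the classifier head with a 2-way output (fight / no-fight).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)
model = model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Evaluation; pointing test_loader at a *different* fight dataset gives the
# cross-dataset accuracy that the abstract reports to be much lower (~70%).
model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"accuracy: {correct / total:.3f}")
```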
Related papers
- Visual Context-Aware Person Fall Detection [52.49277799455569]
We present a segmentation pipeline to semi-automatically separate individuals and objects in images.
Background objects such as beds, chairs, or wheelchairs can challenge fall detection systems, leading to false positive alarms.
We demonstrate that object-specific contextual transformations during training effectively mitigate this challenge.
arXiv Detail & Related papers (2024-04-11T19:06:36Z)
- CLIPC8: Face liveness detection algorithm based on image-text pairs and contrastive learning [3.90443799528247]
We propose a face liveness detection method based on image-text pairs and contrastive learning.
The proposed method is capable of effectively detecting specific liveness attack behaviors in certain scenarios.
It is also effective in detecting traditional liveness attack methods, such as printing photo attacks and screen remake attacks.
arXiv Detail & Related papers (2023-11-29T12:21:42Z)
- Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We present extensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis on the scarcely-paired dataset.
arXiv Detail & Related papers (2023-01-26T15:25:43Z)
- Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z)
- Stereoscopic Universal Perturbations across Different Architectures and Datasets [60.021985610201156]
We study the effect of adversarial perturbations of images on deep stereo matching networks for the disparity estimation task.
We present a method to craft a single set of perturbations that, when added to any stereo image pair in a dataset, can fool a stereo network.
Our perturbations can increase D1-error (akin to fooling rate) of state-of-the-art stereo networks from 1% to as much as 87%.
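For context, D1-error is the standard KITTI stereo metric: the fraction of pixels whose predicted disparity deviates from ground truth by more than 3 pixels and by more than 5% of the true disparity. A minimal sketch follows; the zero-as-invalid convention matches KITTI, while the array layout is an assumption.

```python
# Minimal sketch of the KITTI D1-error: fraction of valid pixels whose
# predicted disparity is off by > 3 px AND > 5% of the ground-truth value.
import numpy as np

def d1_error(pred: np.ndarray, gt: np.ndarray) -> float:
    valid = gt > 0                      # KITTI marks invalid pixels with 0
    abs_err = np.abs(pred[valid] - gt[valid])
    bad = (abs_err > 3.0) & (abs_err > 0.05 * gt[valid])
    return float(bad.mean())            # in [0, 1]; report as a percentage
```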
arXiv Detail & Related papers (2021-12-12T02:11:31Z)
- Free Lunch for Co-Saliency Detection: Context Adjustment [14.688461235328306]
We propose a "cost-free" group-cut-paste (GCP) procedure to leverage images from off-the-shelf saliency detection datasets and synthesize new samples.
We collect a novel dataset called Context Adjustment Training. The two variants of our dataset, i.e., CAT and CAT+, consist of 16,750 and 33,500 images, respectively.
arXiv Detail & Related papers (2021-08-04T14:51:37Z) - Deception Detection in Videos using the Facial Action Coding System [4.641678530055641]
In our approach, we extract facial action units using the facial action coding system which we use as parameters for training a deep learning model.
We specifically use long short-term memory (LSTM), which we trained using the Real-life Trial dataset.
We also tested cross-dataset validation using the Real-life Trial dataset, the Silesian Deception dataset, and the Bag-of-Lies Deception dataset.
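As summarized, the pipeline feeds per-frame facial action unit (AU) activations into an LSTM for a binary truthful/deceptive decision. A minimal sketch of such a classifier follows; the AU dimensionality, hidden size, and use of the final hidden state are assumptions, not the paper's configuration.

```python
# Minimal sketch: binary deception classifier over per-frame facial action
# unit (AU) features. The AU count (17, as in common AU intensity extractors),
# hidden size, and pooling are assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class AUDeceptionLSTM(nn.Module):
    def __init__(self, num_aus: int = 17, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_aus, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, 2)  # truthful vs. deceptive

    def forward(self, au_seq: torch.Tensor) -> torch.Tensor:
        # au_seq: (batch, frames, num_aus) sequence of AU activations
        _, (h_n, _) = self.lstm(au_seq)
        return self.head(h_n[-1])         # classify from the last hidden state

model = AUDeceptionLSTM()
logits = model(torch.randn(4, 300, 17))   # e.g. 4 clips of 300 frames each
print(logits.shape)                       # torch.Size([4, 2])
```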
arXiv Detail & Related papers (2021-05-28T08:10:21Z) - Stereo Matching by Self-supervision of Multiscopic Vision [65.38359887232025]
We propose a new self-supervised framework for stereo matching utilizing multiple images captured at aligned camera positions.
A cross photometric loss, an uncertainty-aware mutual-supervision loss, and a new smoothness loss are introduced to optimize the network.
Our model obtains better disparity maps than previous unsupervised methods on the KITTI dataset.
arXiv Detail & Related papers (2021-04-09T02:58:59Z) - Defense-friendly Images in Adversarial Attacks: Dataset and Metrics for
Perturbation Difficulty [28.79528737626505]
A dataset bias is a problem in adversarial machine learning, especially in the evaluation of defenses.
In this paper, we report for the first time, a class of robust images that are both resilient to attacks and that recover better than random images under adversarial attacks.
We propose three metrics to determine the proportion of robust images in a dataset and provide scoring to determine the dataset bias.
arXiv Detail & Related papers (2020-11-05T06:21:24Z)
- Deep Traffic Sign Detection and Recognition Without Target Domain Real Images [52.079665469286496]
We propose a novel database generation method that requires only (i) arbitrary natural images, i.e., no real images from the target domain, and (ii) templates of the traffic signs.
The method does not aim to outperform training with real data, but to serve as a compatible alternative when real data is not available.
On large data sets, training with a fully synthetic data set almost matches the performance of training with a real one.
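The summarized generation idea, compositing sign templates onto arbitrary natural images to obtain labeled detection samples without target-domain photos, can be sketched as below; the specific augmentations and file names are assumptions, not the paper's pipeline.

```python
# Minimal sketch: synthesize a detection training sample by pasting a traffic
# sign template onto an arbitrary natural image. The blur/scale augmentations
# and file names are assumptions, not the paper's generation pipeline.
import random
from PIL import Image, ImageFilter

def make_sample(background_path: str, template_path: str):
    bg = Image.open(background_path).convert("RGB")
    sign = Image.open(template_path).convert("RGBA")  # alpha for compositing

    # Random scale and position; light blur to soften the pasted edges.
    size = random.randint(24, min(bg.size) // 3)
    sign = sign.resize((size, size)).filter(ImageFilter.GaussianBlur(0.5))
    x = random.randint(0, bg.width - size)
    y = random.randint(0, bg.height - size)
    bg.paste(sign, (x, y), mask=sign)

    # Return the image and the ground-truth box for the detector.
    return bg, (x, y, x + size, y + size)

image, box = make_sample("natural.jpg", "stop_sign_template.png")
```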
arXiv Detail & Related papers (2020-07-30T21:06:47Z)
- The MAMe Dataset: On the relevance of High Resolution and Variable Shape image properties [0.0]
We introduce the MAMe dataset, an image classification dataset with remarkably high-resolution and variable-shape properties.
The MAMe dataset contains thousands of artworks from three different museums.
arXiv Detail & Related papers (2020-07-27T17:13:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.