Detecting Events in Crowds Through Changes in Geometrical Dimensions of
Pedestrians
- URL: http://arxiv.org/abs/2312.06495v1
- Date: Mon, 11 Dec 2023 16:18:56 GMT
- Title: Detecting Events in Crowds Through Changes in Geometrical Dimensions of
Pedestrians
- Authors: Matheus Schreiner Homrich da Silva, Paulo Brossard de Souza Pinto
Neto, Rodolfo Migon Favaretto, Soraia Raupp Musse
- Abstract summary: We examine three different scenarios of crowd behavior, containing both the cases where an event triggers a change in the behavior of the crowd and two video sequences where the crowd and its motion remain mostly unchanged.
With both the videos and the tracking of the individual pedestrians (performed in a pre-processing phase), we use GeoMind to extract significant data about the scene, in particular, the geometrical features, personalities, and emotions of each person.
We then examine the output, seeking a significant change in the way each person acts as a function of time, which could be used as a basis to identify events or to model realistic crowd actions.
- Score: 0.6390468088226495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Security is an important topic in our contemporary world, and the ability to
automate the detection of any events of interest that can take place in a crowd
is of great interest to the population. We hypothesize that the detection of
events in videos is correlated with significant changes in pedestrian
behaviors. In this paper, we examine three different scenarios of crowd
behavior, containing both the cases where an event triggers a change in the
behavior of the crowd and two video sequences where the crowd and its motion
remain mostly unchanged. With both the videos and the tracking of the
individual pedestrians (performed in a pre-processing phase), we use GeoMind,
software we developed to extract significant data about the scene, in
particular, the geometrical features, personalities, and emotions of each
person. We then examine the output, seeking a significant change in the way
each person acts as a function of time, which could be used as a basis to
identify events or to model realistic crowd actions. When applied to the games
area, our method can use the detected events to find some sort of pattern to be
then used in agent simulation. Results indicate that our hypothesis seems valid
in the sense that the visually observed events could be automatically detected
using GeoMind.
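The core idea above — flagging an event when a per-pedestrian feature series changes significantly over time — can be illustrated with a generic sketch. This is a hypothetical example, not the paper's actual criterion: GeoMind's features and the authors' detection rule are not specified at this level, and the sliding-window z-score test, window size, and threshold here are assumptions.

```python
import numpy as np

def detect_change_points(series, window=30, z_thresh=4.0):
    """Flag frames where a per-pedestrian feature (e.g. one geometrical
    dimension) deviates strongly from its recent history.

    Hypothetical illustration: a frame t is flagged when the value at t
    lies more than z_thresh sample standard deviations from the mean of
    the preceding `window` frames.
    """
    series = np.asarray(series, dtype=float)
    flags = []
    for t in range(window, len(series)):
        hist = series[t - window:t]          # recent history only
        mu, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(series[t] - mu) / sigma > z_thresh:
            flags.append(t)
    return flags

# Synthetic feature track: stable behavior, then an abrupt shift at frame 100
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(1.0, 0.05, 100),
                         rng.normal(2.0, 0.05, 100)])
events = detect_change_points(signal)
```

On this synthetic track the first flagged frame falls at the behavioral shift; in practice one would run such a test per pedestrian and per feature, and aggregate across the crowd.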
Related papers
- No-audio speaking status detection in crowded settings via visual
pose-based filtering and wearable acceleration [8.710774926703321]
Video and wearable sensors make it possible to recognize speaking in an unobtrusive, privacy-preserving way.
We show that the selection of local features around pose keypoints has a positive effect on generalization performance.
We additionally make use of acceleration measured through wearable sensors for the same task, and present a multimodal approach combining both methods.
arXiv Detail & Related papers (2022-11-01T15:55:48Z)
- Audio-visual Representation Learning for Anomaly Events Detection in
Crowds [119.72951028190586]
This paper attempts to exploit multi-modal learning for modeling the audio and visual signals simultaneously.
We conduct the experiments on SHADE dataset, a synthetic audio-visual dataset in surveillance scenes.
We find introducing audio signals effectively improves the performance of anomaly events detection and outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T02:42:48Z)
- JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion
Retargeting [53.28477676794658]
Unsupervised motion retargeting in videos has seen substantial advancements through the use of deep neural networks.
We introduce JOKR - a JOint Keypoint Representation that handles both the source and target videos, without requiring any object prior or data collection.
We evaluate our method both qualitatively and quantitatively, and demonstrate that our method handles various cross-domain scenarios, such as different animals, different flowers, and humans.
arXiv Detail & Related papers (2021-06-17T17:32:32Z)
- Affect2MM: Affective Analysis of Multimedia Content Using Emotion
Causality [84.69595956853908]
We present Affect2MM, a learning method for time-series emotion prediction for multimedia content.
Our goal is to automatically capture the varying emotions depicted by characters in real-life human-centric situations and behaviors.
arXiv Detail & Related papers (2021-03-11T09:07:25Z)
- Toward Accurate Person-level Action Recognition in Videos of Crowded
Scenes [131.9067467127761]
We focus on improving the action recognition by fully-utilizing the information of scenes and collecting new data.
Specifically, we adopt a strong human detector to detect the spatial location of each person in every frame.
We then apply action recognition models to learn the temporal information from video frames on both the HIE dataset and new data with diverse scenes from the internet.
arXiv Detail & Related papers (2020-10-16T13:08:50Z)
- A Background-Agnostic Framework with Adversarial Training for Abnormal
Event Detection in Video [120.18562044084678]
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z)
- Tracking in Crowd is Challenging: Analyzing Crowd based on Physical
Characteristics [0.0]
An event detection method is developed to identify abnormal behavior intelligently.
The problem is very challenging due to high crowd density in different areas.
We consider a novel method to deal with these challenges.
arXiv Detail & Related papers (2020-08-08T22:42:25Z)
- Human in Events: A Large-Scale Benchmark for Human-centric Video
Analysis in Complex Events [106.19047816743988]
We present a new large-scale dataset with comprehensive annotations, named Human-in-Events or HiEve.
It contains a record number of poses (>1M), the largest number of action instances (>56k) under complex events, as well as one of the largest numbers of long-duration trajectories.
Based on its diverse annotation, we present two simple baselines for action recognition and pose estimation.
arXiv Detail & Related papers (2020-05-09T18:24:52Z)
- Contextual Sense Making by Fusing Scene Classification, Detections, and
Events in Full Motion Video [0.7348448478819135]
We aim to address the needs of human analysts to consume and exploit data given aerial FMV.
We have divided the problem into three tasks: (1) Context awareness, (2) object cataloging, and (3) event detection.
We have applied our methods on data from different sensors at different resolutions in a variety of geographical areas.
arXiv Detail & Related papers (2020-01-16T18:26:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.