A Study of Human Gaze Behavior During Visual Crowd Counting
- URL: http://arxiv.org/abs/2009.06502v2
- Date: Sun, 27 Sep 2020 19:47:50 GMT
- Title: A Study of Human Gaze Behavior During Visual Crowd Counting
- Authors: Raji Annadi, Yupei Chen, Viresh Ranjan, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
- Abstract summary: Using an eye tracker, we collect gaze behavior of human participants tasked with counting the number of people in crowd images.
We observe some common approaches for visual counting.
In terms of count accuracy, our human participants perform worse at the counting task than the current state-of-the-art computer algorithms.
- Score: 40.59955546629333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we describe our study on how humans allocate their attention
during visual crowd counting. Using an eye tracker, we collect gaze behavior of
human participants who are tasked with counting the number of people in crowd
images. Analyzing the collected gaze behavior of ten human participants on
thirty crowd images, we observe some common approaches for visual counting. For
an image of a small crowd, the approach is to enumerate over all people or
groups of people in the crowd, and this explains the high level of similarity
between the fixation density maps of different human participants. For an image
of a large crowd, our participants tend to focus on one section of the image,
count the number of people in that section, and then extrapolate to the other
sections. In terms of count accuracy, our human participants are not as good at
the counting task, compared to the performance of the current state-of-the-art
computer algorithms. Interestingly, there is a tendency to undercount the
number of people in all crowd images. Gaze behavior data and images can be
downloaded from
https://www3.cs.stonybrook.edu/~minhhoai/projects/crowd_counting_gaze/.
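The large-crowd strategy described in the abstract (count one section, then extrapolate to the rest) can be illustrated with a minimal sketch. This is not the authors' analysis code: the `SectionSample` structure, the function names, and the uniform-density assumption are all hypothetical, and the numbers in the usage note are made up for illustration rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class SectionSample:
    """A hand-counted section of a crowd image (hypothetical structure)."""
    people_counted: int    # people enumerated inside the section
    area_fraction: float   # section area / whole crowd area, in (0, 1]


def extrapolate_count(sample: SectionSample) -> float:
    """Scale a section count up to the whole crowd.

    Assumes the crowd density is uniform across the image, which is
    an illustrative simplification and rarely true of real crowds.
    """
    if not 0.0 < sample.area_fraction <= 1.0:
        raise ValueError("area_fraction must be in (0, 1]")
    return sample.people_counted / sample.area_fraction


def signed_error(estimate: float, true_count: int) -> float:
    """Return estimate minus truth; negative values mean an undercount."""
    return estimate - true_count
```

For example, counting 40 people in a section covering a quarter of the crowd gives `extrapolate_count(SectionSample(40, 0.25)) == 160.0`; against a hypothetical ground truth of 200, `signed_error(160.0, 200)` is `-40.0`, an undercount of the kind the abstract reports.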
Related papers
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- Do humans and Convolutional Neural Networks attend to similar areas during scene classification: Effects of task and image type [0.0]
We investigated how the tasks used to elicit human attention maps interact with image characteristics in modulating the similarity between humans and CNN.
We varied the type of image to be categorized, using either singular, salient objects, indoor scenes consisting of object arrangements, or landscapes without distinct objects defining the category.
The influence of human tasks strongly depended on image type: for objects, human manual selection produced maps that were most similar to CNN, while the specific eye movement task had little impact.
arXiv Detail & Related papers (2023-07-25T09:02:29Z)
- Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting [33.09330894823192]
Background noise and scale variation are common problems that have been long recognized in crowd counting.
We propose a novel feedback network with Region-Aware block called RANet by modeling human's Top-Down visual perception mechanism.
Our method outperforms state-of-the-art crowd counting methods on several public datasets.
arXiv Detail & Related papers (2021-06-23T05:11:58Z)
- Gaze Perception in Humans and CNN-Based Model [66.89451296340809]
We compare how a CNN (convolutional neural network) based model of gaze and humans infer the locus of attention in images of real-world scenes.
We show that compared to the model, humans' estimates of the locus of attention are more influenced by the context of the scene.
arXiv Detail & Related papers (2021-04-17T04:52:46Z)
- Counting People by Estimating People Flows [135.85747920798897]
We advocate estimating people flows across image locations between consecutive images instead of directly regressing them.
It significantly boosts performance without requiring a more complex architecture.
We also show that leveraging people conservation constraints in both a spatial and temporal manner makes it possible to train a deep crowd counting model.
arXiv Detail & Related papers (2020-12-01T12:59:24Z)
- Fine-Grained Crowd Counting [59.63412475367119]
Current crowd counting algorithms are only concerned with the number of people in an image.
We propose fine-grained crowd counting, which differentiates a crowd into categories based on the low-level behavior attributes of the individuals.
arXiv Detail & Related papers (2020-07-13T01:31:12Z)
- Shallow Feature Based Dense Attention Network for Crowd Counting [103.67446852449551]
We propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images.
Our method outperforms other existing methods by a large margin, as is evident from a remarkable 11.9% Mean Absolute Error (MAE) drop of our SDANet.
arXiv Detail & Related papers (2020-06-17T13:34:42Z)