ZoomCount: A Zooming Mechanism for Crowd Counting in Static Images
- URL: http://arxiv.org/abs/2002.12256v1
- Date: Thu, 27 Feb 2020 16:57:04 GMT
- Title: ZoomCount: A Zooming Mechanism for Crowd Counting in Static Images
- Authors: Usman Sajid, Hasan Sajid, Hongcheng Wang, Guanghui Wang
- Abstract summary: Current approaches cannot handle huge crowd diversity well and perform poorly in extreme cases.
The proposed solution is based on the observation that detecting and handling such extreme cases leads to better crowd estimation.
- Score: 22.387393675233124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel approach for crowd counting in low to high
density scenarios in static images. Current approaches cannot handle huge crowd
diversity well and thus perform poorly in extreme cases, where the crowd
density in different regions of an image is either too low or too high, leading
to crowd underestimation or overestimation. The proposed solution is based on
the observation that detecting and handling such extreme cases in a specialized
way leads to better crowd estimation. Additionally, existing methods find it
hard to differentiate between the actual crowd and the cluttered background
regions, resulting in further count overestimation. To address these issues, we
propose a simple yet effective modular approach, where an input image is first
subdivided into fixed-size patches and then fed to a four-way classification
module labeling each image patch as low, medium, high-dense or no-crowd. This
module also provides a count for each label, which is then analyzed via a
specifically devised novel decision module to decide whether the image belongs
to any of the two extreme cases (very low or very high density) or a normal
case. Images, specified as high- or low-density extreme or a normal case, pass
through dedicated zooming or normal patch-making blocks respectively before
routing to the regressor in the form of fixed-size patches for crowd estimate.
Extensive experimental evaluations demonstrate that the proposed approach
outperforms the state-of-the-art methods on four benchmarks under most of the
evaluation criteria.
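The modular flow described in the abstract (fixed-size patches, four-way patch labels, per-label counts, a decision step, then routing) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the actual decision rule, so the majority threshold `extreme_fraction`, the function names, and the label strings are all assumptions, and the classifier and regressor are left out entirely (patch labels are passed in directly).

```python
from collections import Counter

# Four-way patch labels named in the abstract (string spellings are assumed).
LABELS = ("no_crowd", "low", "medium", "high")


def split_into_patches(image, patch):
    """Split a 2-D grid (list of rows) into non-overlapping patch x patch tiles.

    Rows/columns that do not fill a whole tile are dropped in this sketch.
    """
    rows, cols = len(image), len(image[0])
    tiles = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            tiles.append([row[c:c + patch] for row in image[r:r + patch]])
    return tiles


def decide_case(patch_labels, extreme_fraction=0.5):
    """Hypothetical decision rule over the per-label patch counts.

    An image is treated as an extreme case when more than `extreme_fraction`
    of its crowd patches (no_crowd patches are ignored) share the low or the
    high label. The threshold is an assumption, not taken from the paper.
    """
    counts = Counter(patch_labels)
    crowd = counts["low"] + counts["medium"] + counts["high"]
    if crowd == 0:
        return "normal"
    if counts["high"] / crowd > extreme_fraction:
        return "high_extreme"
    if counts["low"] / crowd > extreme_fraction:
        return "low_extreme"
    return "normal"


def route(case):
    """Extreme cases go to a zooming block, normal cases to plain patch-making."""
    return "zooming" if case in ("high_extreme", "low_extreme") else "normal_patch_making"
```

For example, an image whose crowd patches are mostly labeled `high` would be routed to the zooming block before its patches reach the regressor, while an image with a mixed label distribution would go through normal patch-making.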
Related papers
- Learning to Rank Patches for Unbiased Image Redundancy Reduction [80.93989115541966]
Images suffer from heavy spatial redundancy because pixels in neighboring regions are spatially correlated.
Existing approaches strive to overcome this limitation by reducing less meaningful image regions.
We propose a self-supervised framework for image redundancy reduction called Learning to Rank Patches.
arXiv Detail & Related papers (2024-03-31T13:12:41Z)
- Single Domain Generalization for Crowd Counting [11.212941297348268]
MPCount is a novel approach that remains effective even for a narrow source distribution.
It stores diverse density values for density map regression and reconstructs domain-invariant features by means of only one memory bank.
It is shown to significantly improve counting accuracy compared to the state of the art under diverse scenarios.
arXiv Detail & Related papers (2024-03-14T06:16:21Z)
- Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM [55.93697196726016]
We propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM).
We show that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas.
Our proposed method achieves the best unsupervised performance in crowd counting, while also being comparable to some supervised methods.
arXiv Detail & Related papers (2024-02-27T13:55:17Z)
- Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization [73.04187954213471]
We introduce a unified learning approach to simultaneously model coarse- and fine-grained retrieval.
The proposed method has achieved +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline.
arXiv Detail & Related papers (2022-11-14T14:25:40Z)
- Boosting Few-shot Fine-grained Recognition with Background Suppression and Foreground Alignment [53.401889855278704]
Few-shot fine-grained recognition (FS-FGR) aims to recognize novel fine-grained categories with the help of limited available samples.
We propose a two-stage background suppression and foreground alignment framework, which is composed of a background activation suppression (BAS) module, a foreground object alignment (FOA) module, and a local to local (L2L) similarity metric.
Experiments conducted on multiple popular fine-grained benchmarks demonstrate that our method outperforms the existing state-of-the-art by a large margin.
arXiv Detail & Related papers (2022-10-04T07:54:40Z)
- Region-level Active Learning for Cluttered Scenes [60.93811392293329]
We introduce a new strategy that subsumes previous Image-level and Object-level approaches into a generalized, Region-level approach.
We show that this approach significantly decreases labeling effort and improves rare object search on realistic data with inherent class-imbalance and cluttered scenes.
arXiv Detail & Related papers (2021-08-20T14:02:38Z)
- A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
- Multi-frame Super-resolution from Noisy Data [6.414055487487486]
We show the usefulness of two adaptive regularisers based on anisotropic diffusion ideas.
We also introduce a novel non-local one with one-sided differences and superior performance.
Surprisingly, the evaluation in a practically relevant noisy scenario produces a different ranking than the one in the noise-free setting.
arXiv Detail & Related papers (2021-03-25T12:07:08Z)
- Plug-and-Play Rescaling Based Crowd Counting in Static Images [24.150701096083242]
We propose a new image patch rescaling module (PRM) and three independent crowd counting methods that employ it.
The proposed frameworks use the PRM module to rescale the image regions (patches) that require special treatment, whereas the classification process helps in recognizing and discarding any cluttered crowd-like background regions which may result in overestimation.
arXiv Detail & Related papers (2020-01-06T21:43:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.