Adaptive Mixture Regression Network with Local Counting Map for Crowd
Counting
- URL: http://arxiv.org/abs/2005.05776v2
- Date: Wed, 13 May 2020 06:53:41 GMT
- Title: Adaptive Mixture Regression Network with Local Counting Map for Crowd
Counting
- Authors: Xiyang Liu, Jie Yang, Wenrui Ding
- Abstract summary: We introduce a new target, named local counting map (LCM), to obtain more accurate results than density map based approaches.
We also propose an adaptive mixture regression framework with three modules in a coarse-to-fine manner to further improve the precision of the crowd estimation.
- Score: 16.816382549827214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The crowd counting task aims at estimating the number of people located in an
image or a frame from videos. Existing methods widely adopt density maps as the
training targets to optimize the point-to-point loss. In the testing phase,
however, we only focus on the difference between the crowd number and the global
summation of the density map, which indicates an inconsistency between the
training targets and the evaluation criteria. To solve this problem, we
introduce a new target, named local counting map (LCM), to obtain more accurate
results than density-map-based approaches. We also propose an
adaptive mixture regression framework with three modules in a coarse-to-fine
manner to further improve the precision of the crowd estimation: scale-aware
module (SAM), mixture regression module (MRM) and adaptive soft interval module
(ASIM). Specifically, SAM fully utilizes the context and multi-scale
information from different convolutional features; MRM and ASIM perform more
precise counting regression on local patches of images. Compared with current
methods, the proposed method reports better performance on typical
datasets. The source code is available at
https://github.com/xiyang1012/Local-Crowd-Counting.
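The difference between the two training targets can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that each entry of a local counting map holds the number of people in one non-overlapping image patch, and that it can be derived by summing a Gaussian density map over that patch. The function names, the patch size of 16, and the Gaussian width `sigma` are hypothetical choices made for illustration and are not taken from the authors' code.

```python
import numpy as np


def density_map(points, height, width, sigma=4.0):
    """Place one normalized Gaussian per annotated head position (x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    dmap = np.zeros((height, width), dtype=np.float64)
    for px, py in points:
        g = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
        dmap += g / g.sum()  # each person contributes exactly 1 to the total
    return dmap


def local_counting_map(dmap, patch=16):
    """Sum the density map over non-overlapping patch x patch cells (assumed LCM)."""
    h, w = dmap.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    return dmap.reshape(h // patch, patch, w // patch, patch).sum(axis=(1, 3))


# Hypothetical annotations for a 64 x 64 image.
points = [(10, 12), (40, 20), (41, 22), (60, 50)]
dmap = density_map(points, 64, 64)
lcm = local_counting_map(dmap, patch=16)

# Evaluation only compares global counts, not per-pixel values.
count_error = abs(lcm.sum() - len(points))
```

Under these assumptions the 4 x 4 map `lcm` and the 64 x 64 map `dmap` sum to the same global count, but each `lcm` entry is itself a local count, so a loss on it is closer to the count-based evaluation criterion than a per-pixel density loss.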
Related papers
- DMESA: Densely Matching Everything by Segmenting Anything [16.16319526547664]
We propose MESA and DMESA as novel feature matching methods.
MESA establishes implicit-semantic area matching prior to point matching, based on advanced image understanding of SAM.
With less repetitive computation, DMESA showcases a speed improvement of nearly five times compared to MESA.
arXiv Detail & Related papers (2024-08-01T04:39:36Z)
- Rethinking Spatial Invariance of Convolutional Networks for Object Counting [119.83017534355842]
We try to use locally connected Gaussian kernels to replace the original convolution filter to estimate the spatial position in the density map.
Inspired by previous work, we propose a low-rank approximation accompanied with translation invariance to favorably implement the approximation of massive Gaussian convolution.
Our methods significantly outperform other state-of-the-art methods and achieve promising learning of the spatial position of objects.
arXiv Detail & Related papers (2022-06-10T17:51:25Z)
- Adaptive Affinity for Associations in Multi-Target Multi-Camera Tracking [53.668757725179056]
We propose a simple yet effective approach to adapt affinity estimations to corresponding matching scopes in MTMCT.
Instead of trying to deal with all appearance changes, we tailor the affinity metric to specialize in ones that might emerge during data associations.
Minimizing the mismatch, the adaptive affinity module brings significant improvements over global re-ID distance.
arXiv Detail & Related papers (2021-12-14T18:59:11Z)
- Reciprocal Distance Transform Maps for Crowd Counting and People Localization in Dense Crowd [16.224760698133462]
We propose a novel Reciprocal Distance Transform (R-DT) map for crowd counting.
Compared with the density maps, the R-DT maps accurately describe the people location, without overlap between nearby heads in dense regions.
We simultaneously implement crowd counting and people localization with a simple network by replacing density maps with R-DT maps.
arXiv Detail & Related papers (2021-02-16T02:25:55Z)
- Learning Independent Instance Maps for Crowd Localization [44.6430092887941]
We propose an end-to-end and straightforward framework for crowd localization, named Independent Instance Map segmentation (IIM).
IIM segments crowds into independent connected components, from which the positions and the crowd counts are obtained.
To improve the segmentation quality for different density regions, we present a differentiable Binarization Module (BM).
BM brings two advantages into localization models: 1) adaptively learn a threshold map for different images to detect each instance more accurately; 2) directly train the model using loss on binary predictions and labels.
arXiv Detail & Related papers (2020-12-08T02:17:19Z)
- PSCNet: Pyramidal Scale and Global Context Guided Network for Crowd Counting [44.306790250158954]
This paper proposes a novel crowd counting approach based on a pyramidal scale module (PSM) and a global context module (GCM).
PSM is used to adaptively capture multi-scale information, which can identify a fine boundary of crowds with different image scales.
GCM is designed to be low-complexity and lightweight, making the interaction of information across the channels of the feature maps more efficient.
arXiv Detail & Related papers (2020-12-07T11:35:56Z)
- A Strong Baseline for Crowd Counting and Unsupervised People Localization [2.690502103971799]
We explore a strong baseline for crowd counting and an unsupervised people localization algorithm based on estimated density maps.
We collect different backbones and training tricks and evaluate the impact of changing them.
We propose a clustering algorithm named isolated KMeans to locate the heads in density maps.
arXiv Detail & Related papers (2020-11-07T08:29:03Z)
- Distribution Matching for Crowd Counting [51.90971145453012]
We show that imposing Gaussians on annotations hurts generalization performance.
We propose to use Distribution Matching for crowd COUNTing (DM-Count).
In terms of Mean Absolute Error, DM-Count outperforms the previous state-of-the-art methods.
arXiv Detail & Related papers (2020-09-28T04:57:23Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method [92.15895515035795]
We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations.
We propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation.
arXiv Detail & Related papers (2020-04-07T14:59:35Z)