Learning Independent Instance Maps for Crowd Localization
- URL: http://arxiv.org/abs/2012.04164v2
- Date: Mon, 22 Mar 2021 03:20:09 GMT
- Title: Learning Independent Instance Maps for Crowd Localization
- Authors: Junyu Gao, Tao Han, Yuan Yuan, Qi Wang
- Abstract summary: We propose an end-to-end and straightforward framework for crowd localization, named Independent Instance Map segmentation (IIM).
IIM segments crowds into independent connected components, from which the positions and the crowd counts are obtained.
To improve the segmentation quality for different density regions, we present a differentiable Binarization Module (BM).
BM brings two advantages to localization models: 1) it adaptively learns a threshold map for different images to detect each instance more accurately; 2) it lets the model be trained directly with a loss on binary predictions and labels.
- Score: 44.6430092887941
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately locating each head's position in crowd scenes is a crucial
task in the field of crowd analysis. However, traditional density-based methods
only produce coarse predictions, and segmentation/detection-based methods cannot
handle extremely dense scenes or crowds with large scale variations. To this
end, we propose an end-to-end and straightforward framework for crowd
localization, named Independent Instance Map segmentation (IIM). Unlike
density maps and box regression, the instances in IIM do not overlap. By
segmenting crowds into independent connected components, the positions and the
crowd counts (the centers and the number of components, respectively) are
obtained. Furthermore, to improve the segmentation quality for different
density regions, we present a differentiable Binarization Module (BM) to output
structured instance maps. BM brings two advantages to localization models: 1)
it adaptively learns a threshold map for different images to detect each instance
more accurately; 2) it lets the model be trained directly with a loss on binary
predictions and labels. Extensive experiments verify that the proposed method is
effective and outperforms state-of-the-art methods on five popular crowd datasets.
Notably, IIM improves the F1-measure by 10.4% on the NWPU-Crowd Localization
task. The source code and pre-trained models will be released at
https://github.com/taohan10200/IIM.
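The post-processing described in the abstract (binarize with a threshold map, split the result into connected components, read off centers and count) can be illustrated with a small sketch. This is not the authors' implementation: the function name `localize`, the toy confidence and threshold values, the hard `>` comparison, and the choice of 4-connectivity are all assumptions made for illustration, and the differentiable training path of BM is not modeled.

```python
from collections import deque


def localize(confidence, threshold):
    """Binarize a confidence map with a per-pixel threshold map, then
    return the center (row, col) of each 4-connected foreground component."""
    rows, cols = len(confidence), len(confidence[0])
    # Hard thresholding (assumption): a pixel is foreground when its
    # confidence exceeds the threshold given for that same pixel.
    binary = [[confidence[r][c] > threshold[r][c] for c in range(cols)]
              for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The component centroid stands in for the predicted head position.
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            centers.append((cy, cx))
    return centers


# Toy 4x6 confidence map with two separated blobs (made-up values).
confidence = [
    [0.9, 0.8, 0.1, 0.1, 0.1, 0.1],
    [0.9, 0.8, 0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
]
# Hypothetical threshold map: stricter on the left half than on the right.
threshold = [[0.7, 0.7, 0.7, 0.5, 0.5, 0.5] for _ in range(4)]

centers = localize(confidence, threshold)  # one (row, col) per instance
count = len(centers)                       # crowd count = number of components
print(count, centers)
```

In this toy input, the right-hand blob has a confidence of 0.6 and would be lost under a single global threshold of 0.7. It survives only because the threshold map is lower in that region, which is the kind of per-region adaptation the threshold map is meant to provide.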
Related papers
- Bidirectional Domain Mixup for Domain Adaptive Semantic Segmentation [73.3083304858763]
This paper systematically studies the impact of mixup under the domain adaptive semantic segmentation task.
Specifically, we achieve domain mixup in two steps: cut and paste.
We provide extensive ablation experiments to empirically verify our main components of the framework.
arXiv Detail & Related papers (2023-03-17T05:22:44Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Personalizing Federated Medical Image Segmentation via Local Calibration [9.171482226385551]
Using a single model to adapt to various data distributions from different sites is extremely challenging.
We propose a personalized federated framework with Local Calibration (LC-Fed) to leverage the inter-site inconsistencies.
Our method consistently shows superior performance to the state-of-the-art personalized FL methods.
arXiv Detail & Related papers (2022-07-11T06:30:31Z)
- Decoupling Predictions in Distributed Learning for Multi-Center Left Atrial MRI Segmentation [20.20518948616193]
We propose a new framework of distributed learning that bridges the gap between two groups, and improves the performance for both generic and local data.
Results on multi-center left atrial (LA) MRI segmentation showed that our method demonstrated superior performance over existing methods on both generic and local data.
arXiv Detail & Related papers (2022-06-10T08:35:42Z)
- PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting [63.84828478688975]
We propose a novel perspective-aware approach called PANet to address the perspective problem.
Based on the observation that the size of the objects varies greatly in one image due to the perspective effect, we propose the dynamic receptive fields (DRF) framework.
The framework is able to adjust the receptive field by the dilated convolution parameters according to the input image, which helps the model to extract more discriminative features for each local region.
arXiv Detail & Related papers (2021-10-31T04:43:05Z)
- LDC-Net: A Unified Framework for Localization, Detection and Counting in Dense Crowds [103.8635206945196]
The rapid development in visual crowd analysis shows a trend to count people by positioning or even detecting, rather than simply summing a density map.
Recent work on crowd localization and detection has two limitations: 1) typical detection methods cannot handle dense crowds and large variations in scale; 2) density map methods suffer from poor position and box prediction, especially in high-density or large-size crowds.
arXiv Detail & Related papers (2021-10-10T07:55:44Z)
- PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation [87.50205728818601]
We propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
- A Strong Baseline for Crowd Counting and Unsupervised People Localization [2.690502103971799]
We explore a strong baseline for crowd counting and an unsupervised people localization algorithm based on estimated density maps.
We collect different backbones and training tricks and evaluate the impact of changing them.
We propose a clustering algorithm named isolated KMeans to locate the heads in density maps.
arXiv Detail & Related papers (2020-11-07T08:29:03Z)
- MetaBox+: A new Region Based Active Learning Method for Semantic Segmentation using Priority Maps [4.396860522241306]
We present a novel active learning method for semantic image segmentation, called MetaBox+.
For acquisition, we train a meta regression model to estimate the segment-wise Intersection over Union (IoU) of each predicted segment of unlabeled images.
We compare our method to entropy based methods, where we consider the entropy as uncertainty of the prediction.
arXiv Detail & Related papers (2020-10-05T09:36:47Z)
- Adaptive Mixture Regression Network with Local Counting Map for Crowd Counting [16.816382549827214]
We introduce a new target, named local counting map (LCM), to obtain more accurate results than density map based approaches.
We also propose an adaptive mixture regression framework with three modules in a coarse-to-fine manner to further improve the precision of the crowd estimation.
arXiv Detail & Related papers (2020-05-12T13:54:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.