Redesigning Multi-Scale Neural Network for Crowd Counting
- URL: http://arxiv.org/abs/2208.02894v2
- Date: Tue, 4 Jul 2023 01:55:13 GMT
- Title: Redesigning Multi-Scale Neural Network for Crowd Counting
- Authors: Zhipeng Du, Miaojing Shi, Jiankang Deng, Stefanos Zafeiriou
- Abstract summary: We introduce a hierarchical mixture of density experts, which hierarchically merges multi-scale density maps for crowd counting.
Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales.
Experiments show that our method achieves state-of-the-art performance on five public datasets.
- Score: 68.674652984003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perspective distortions and crowd variations make crowd counting a
challenging task in computer vision. To tackle it, many previous works have
used multi-scale architectures in deep neural networks (DNNs). Multi-scale
branches can be either directly merged (e.g. by concatenation) or merged
through the guidance of proxies (e.g. attentions) in the DNNs. Despite their
prevalence, these combination methods are not sophisticated enough to deal with
the per-pixel performance discrepancy over multi-scale density maps. In this
work, we redesign the multi-scale neural network by introducing a hierarchical
mixture of density experts, which hierarchically merges multi-scale density
maps for crowd counting. Within the hierarchical structure, an expert
competition and collaboration scheme is presented to encourage contributions
from all scales; pixel-wise soft gating nets are introduced to provide
pixel-wise soft weights for scale combinations in different hierarchies. The
network is optimized using both the crowd density map and the local counting
map, where the latter is obtained by local integration on the former.
Optimizing both can be problematic because of their potential conflicts. We
introduce a new relative local counting loss based on relative count
differences among hard-predicted local regions in an image, which proves to be
complementary to the conventional absolute error loss on the density map.
Experiments show that our method achieves state-of-the-art performance on
five public datasets, i.e. ShanghaiTech, UCF_CC_50, JHU-CROWD++, NWPU-Crowd and
Trancos.
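As a rough illustration of two ideas mentioned in the abstract, the sketch below merges multi-scale density maps with pixel-wise soft weights and then derives a local counting map by summing the density over local regions. It is a minimal, single-level sketch, not the authors' implementation: the function names (`soft_gate_merge`, `local_count_map`), the use of a softmax over given logits in place of learned gating nets, and the non-overlapping square regions are all assumptions made for illustration. The hierarchical structure and the relative local counting loss are not reproduced here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate_merge(density_maps, gate_logits):
    """Merge K same-sized density maps with per-pixel softmax weights.

    density_maps and gate_logits are lists of K 2-D lists (H x W).
    In a real network the logits would come from a learned gating net;
    here they are simply passed in (assumption).
    """
    num_experts = len(density_maps)
    height, width = len(density_maps[0]), len(density_maps[0][0])
    merged = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([gate_logits[k][y][x] for k in range(num_experts)])
            merged[y][x] = sum(
                weights[k] * density_maps[k][y][x] for k in range(num_experts)
            )
    return merged


def local_count_map(density, block):
    """Sum the density over non-overlapping block x block regions."""
    height, width = len(density), len(density[0])
    return [
        [
            sum(
                density[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            )
            for left in range(0, width, block)
        ]
        for top in range(0, height, block)
    ]


# Toy example: two hypothetical "experts" predicting 2x2 density maps.
expert_a = [[1.0, 0.0], [0.0, 1.0]]
expert_b = [[0.0, 2.0], [2.0, 0.0]]
# Equal logits give each expert weight 0.5 at every pixel.
logits = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]

merged = soft_gate_merge([expert_a, expert_b], logits)
counts = local_count_map(merged, block=2)
```

With equal logits the merged map is the per-pixel average of the two experts, and integrating it over the single 2x2 region gives the total count for the toy image. Changing the logits at a pixel shifts that pixel's weight toward one expert without affecting its neighbours, which is the sense in which the gating is pixel-wise.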
Related papers
- Diffusion-based Data Augmentation for Object Counting Problems [62.63346162144445]
We develop a pipeline that utilizes a diffusion model to generate extensive training data.
We are the first to generate images conditioned on a location dot map with a diffusion model.
Our proposed counting loss for the diffusion model effectively minimizes the discrepancies between the location dot map and the crowd images generated.
arXiv Detail & Related papers (2024-01-25T07:28:22Z)
- HDNet: A Hierarchically Decoupled Network for Crowd Counting [11.530565995318696]
We propose a Hierarchically Decoupled Network (HDNet) to solve the above two problems within a unified framework.
HDNet achieves state-of-the-art performance on several popular counting benchmarks.
arXiv Detail & Related papers (2022-12-12T06:01:26Z)
- Cascaded Residual Density Network for Crowd Counting [63.714719914701014]
We propose a novel Cascaded Residual Density Network (CRDNet) that generates high-quality density maps for crowd counting in a coarse-to-fine manner.
A novel additional local count loss is presented to refine the accuracy of crowd counting.
arXiv Detail & Related papers (2021-07-29T03:07:11Z)
- BaMBNet: A Blur-aware Multi-branch Network for Defocus Deblurring [74.34263243089688]
Convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress.
This study designs a novel blur-aware multi-branch network (BaMBNet) in which different regions (with different blur amounts) should be treated differentially.
Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-05-31T07:55:30Z)
- Bayesian Multi Scale Neural Network for Crowd Counting [0.0]
We propose a new network that uses a ResNet-based feature extractor, a downsampling block built on dilated convolutions, and an upsampling block built on transposed convolutions.
We present a novel aggregation module which makes our network robust to the perspective view problem.
arXiv Detail & Related papers (2020-07-11T21:43:20Z)
- JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method [92.15895515035795]
We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations.
We propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation.
arXiv Detail & Related papers (2020-04-07T14:59:35Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.