JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method
- URL: http://arxiv.org/abs/2004.03597v2
- Date: Mon, 2 Nov 2020 17:52:27 GMT
- Title: JHU-CROWD++: Large-Scale Crowd Counting Dataset and A Benchmark Method
- Authors: Vishwanath A. Sindagi, Rajeev Yasarla, Vishal M. Patel
- Abstract summary: We introduce a new large scale unconstrained crowd counting dataset (JHU-CROWD++) that contains "4,372" images with "1.51 million" annotations.
We propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation.
- Score: 92.15895515035795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Owing to its variety of real-world applications, the task of
single-image crowd counting has received considerable interest in recent years.
Recently, several approaches have been proposed to address various problems
encountered in crowd counting. These approaches are essentially based on
convolutional neural networks that require large amounts of data to train the
network parameters. Considering this, we introduce a new large scale
unconstrained crowd counting dataset (JHU-CROWD++) that contains "4,372" images
with "1.51 million" annotations. In comparison to existing datasets, the
proposed dataset is collected under a variety of diverse scenarios and
environmental conditions. Specifically, the dataset includes several images
with weather-based degradations and illumination variations, making it a very
challenging dataset. Additionally, the dataset consists of a rich set of
annotations at both image-level and head-level. Several recent methods are
evaluated and compared on this dataset. The dataset can be downloaded from
http://www.crowd-counting.com .
Furthermore, we propose a novel crowd counting network that progressively
generates crowd density maps via residual error estimation. The proposed method
uses VGG16 as the backbone network and treats the density map generated by the
final layer as a coarse prediction, which it progressively refines into finer
density maps using residual learning. Additionally, the residual
learning is guided by an uncertainty-based confidence weighting mechanism that
permits the flow of only high-confidence residuals in the refinement path. The
proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is
evaluated on recent complex datasets, where it achieves significant reductions
in counting error.
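
The confidence-guided refinement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`upsample2x`, `refine`, `count`), the count-preserving nearest-neighbour upsampling, and the plain multiplicative weighting `upsampled + confidence * residual` are all assumptions made for illustration. In the actual network the residuals and confidence maps would be predicted by convolutional branches on VGG16 features; here they are hand-written toy values.

```python
from typing import List

Grid = List[List[float]]  # a 2-D density map stored as rows of floats


def upsample2x(density: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling (an assumption, chosen for simplicity).

    Each coarse cell becomes four fine cells, so its value is divided by 4
    to keep the sum of the map (the estimated count) unchanged.
    """
    out: Grid = []
    for row in density:
        fine_row = [value / 4.0 for value in row for _ in range(2)]
        out.append(fine_row)
        out.append(list(fine_row))
    return out


def refine(coarse: Grid, residual: Grid, confidence: Grid) -> Grid:
    """One hypothetical refinement step: upsample the coarse map, then add
    the residual scaled by its per-pixel confidence in [0, 1], so that
    low-confidence residuals contribute little or nothing."""
    upsampled = upsample2x(coarse)
    return [
        [u + c * r for u, r, c in zip(u_row, r_row, c_row)]
        for u_row, r_row, c_row in zip(upsampled, residual, confidence)
    ]


def count(density: Grid) -> float:
    """The crowd count is the sum over the density map."""
    return sum(sum(row) for row in density)


# Toy values: a 1x1 coarse map refined to 2x2.
coarse = [[4.0]]
residual = [[0.5, -0.5], [2.0, 0.0]]
confidence = [[1.0, 1.0], [0.0, 0.5]]
refined = refine(coarse, residual, confidence)
```

In this toy case the large residual of 2.0 has zero confidence and is suppressed entirely, while the two fully trusted residuals shift density between neighbouring cells. Applying `refine` repeatedly, once per finer feature level, would give the progressive coarse-to-fine behaviour the abstract describes.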
Related papers
- An evaluation of Deep Learning based stereo dense matching dataset shift
from aerial images and a large scale stereo dataset [2.048226951354646]
We present a method for generating ground-truth disparity maps directly from Light Detection and Ranging (LiDAR) and images.
We evaluate 11 dense matching methods across datasets with diverse scene types, image resolutions, and geometric configurations.
arXiv Detail & Related papers (2024-02-19T20:33:46Z)
- Diffusion-based Data Augmentation for Object Counting Problems [62.63346162144445]
We develop a pipeline that utilizes a diffusion model to generate extensive training data.
We are the first to generate images conditioned on a location dot map with a diffusion model.
Our proposed counting loss for the diffusion model effectively minimizes the discrepancies between the location dot map and the crowd images generated.
arXiv Detail & Related papers (2024-01-25T07:28:22Z)
- Cascaded Residual Density Network for Crowd Counting [63.714719914701014]
We propose a novel Cascaded Residual Density Network (CRDNet) that follows a coarse-to-fine approach to generate high-quality density maps for more accurate crowd counting.
A novel additional local count loss is presented to refine the accuracy of crowd counting.
arXiv Detail & Related papers (2021-07-29T03:07:11Z)
- PSCNet: Pyramidal Scale and Global Context Guided Network for Crowd
Counting [44.306790250158954]
This paper proposes a novel crowd counting approach based on a pyramidal scale module (PSM) and a global context module (GCM).
PSM is used to adaptively capture multi-scale information, which can identify a fine boundary of crowds with different image scales.
GCM is designed to be low-complexity and lightweight, making the exchange of information across the channels of the feature maps more efficient.
arXiv Detail & Related papers (2020-12-07T11:35:56Z)
- Counting from Sky: A Large-scale Dataset for Remote Sensing Object
Counting and A Benchmark Method [52.182698295053264]
We are interested in counting dense objects from remote sensing images. Compared with object counting in a natural scene, this task is challenging because of the following factors: large scale variation, complex cluttered background, and orientation arbitrariness.
To address these issues, we first construct a large-scale object counting dataset with remote sensing images, which contains four important geographic objects.
We then benchmark the dataset by designing a novel neural network that can generate a density map of an input image.
arXiv Detail & Related papers (2020-08-28T03:47:49Z)
- Bayesian Multi Scale Neural Network for Crowd Counting [0.0]
We propose a new network that uses a ResNet-based feature extractor, a downsampling block with dilated convolutions, and an upsampling block with transposed convolutions.
We present a novel aggregation module which makes our network robust to the perspective view problem.
arXiv Detail & Related papers (2020-07-11T21:43:20Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization [101.13851473792334]
We construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images, in a total of 2,133,375 annotated heads with points and boxes.
Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0 to 20,033).
We describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data.
arXiv Detail & Related papers (2020-01-10T09:26:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.