Crowd Counting via Hierarchical Scale Recalibration Network
- URL: http://arxiv.org/abs/2003.03545v1
- Date: Sat, 7 Mar 2020 10:06:47 GMT
- Title: Crowd Counting via Hierarchical Scale Recalibration Network
- Authors: Zhikang Zou, Yifan Liu, Shuangjie Xu, Wei Wei, Shiping Wen, and Pan Zhou
- Abstract summary: We propose a novel Hierarchical Scale Recalibration Network (HSRNet) to tackle the task of crowd counting.
HSRNet models rich contextual dependencies and recalibrates multiple scale-associated information.
Our approach can ignore various noises selectively and focus on appropriate crowd scales automatically.
- Score: 61.09833400167511
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of crowd counting is extremely challenging due to complicated
difficulties, especially the huge variation in vision scale. Previous works
tend to adopt a naive concatenation of multi-scale information to tackle it,
while the scale shifts between the feature maps are ignored. In this paper, we
propose a novel Hierarchical Scale Recalibration Network (HSRNet), which
addresses the above issues by modeling rich contextual dependencies and
recalibrating multiple scale-associated information. Specifically, a Scale
Focus Module (SFM) first integrates global context into local features by
modeling the semantic inter-dependencies along channel and spatial dimensions
sequentially. In order to reallocate channel-wise feature responses, a Scale
Recalibration Module (SRM) adopts a step-by-step fusion to generate final
density maps. Furthermore, we propose a novel Scale Consistency loss that
constrains the scale-associated outputs to be coherent with the ground truth at
different scales. With the proposed modules, our approach can ignore various
noises selectively and focus on appropriate crowd scales automatically.
Extensive experiments on crowd counting datasets (ShanghaiTech, MALL,
WorldEXPO'10, and UCSD) show that our HSRNet can deliver superior results over
all state-of-the-art approaches. More remarkably, we extend the experiments to
an extra vehicle dataset, and the results indicate that the proposed model
generalizes to other applications.
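The abstract does not give the formula for the Scale Consistency loss, so the following is only a minimal sketch of the general idea: each scale-associated output is compared against the ground-truth density map resampled to the same resolution. The function names (`sum_pool`, `scale_consistency_loss`), the use of sum pooling, and the per-scale mean squared error are all assumptions made for illustration, not the authors' implementation.

```python
def sum_pool(density, factor):
    """Downsample a 2D density map by summing each factor x factor block.

    Summing (rather than averaging) keeps the total count unchanged.
    """
    rows, cols = len(density), len(density[0])
    assert rows % factor == 0 and cols % factor == 0
    return [
        [
            sum(density[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def mse(pred, target):
    """Mean squared error between two 2D maps of the same shape."""
    n = len(pred) * len(pred[0])
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / n


def scale_consistency_loss(preds_by_factor, ground_truth):
    """Sum of per-scale errors (assumed form, not taken from the paper).

    preds_by_factor maps a downsampling factor to the density map
    predicted at that scale.
    """
    return sum(
        mse(pred, sum_pool(ground_truth, factor))
        for factor, pred in preds_by_factor.items()
    )


def count(density):
    """The estimated crowd count is the sum over the density map."""
    return sum(sum(row) for row in density)


# Toy 4x4 ground-truth density map containing three people in total.
gt = [
    [0.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.25, 0.25],
    [0.0, 0.0, 0.25, 0.25],
]
half = sum_pool(gt, 2)

perfect = scale_consistency_loss({1: gt, 2: half}, gt)
off_by_one = scale_consistency_loss({2: [[2.0, 0.0], [0.0, 2.0]]}, gt)
```

In this toy case the count is preserved across scales, a perfect set of predictions gives zero loss, and a coarse prediction that overcounts one cell by one person is penalized.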
Related papers
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching).
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z)
- STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning [74.2343877907438]
Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.
We propose a novel method termed STEERER that addresses the issue of scale variations in object counting.
STEERER selects the most suitable scale for patch objects to boost feature extraction and only inherits discriminative features from lower to higher resolution progressively.
arXiv Detail & Related papers (2023-08-21T05:09:07Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel style named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Scale-Aware Dynamic Network for Continuous-Scale Super-Resolution [16.67263192454279]
We propose a scale-aware dynamic network (SADN) for continuous-scale SR.
First, we propose a scale-aware dynamic convolutional (SAD-Conv) layer for the feature learning of multiple SR tasks with various scales.
Second, we devise a continuous-scale upsampling module (CSUM) with the multi-bilinear local implicit function (MBLIF) for any-scale upsampling.
arXiv Detail & Related papers (2021-10-29T09:57:48Z)
- Multi-View Stereo Network with attention thin volume [0.0]
We propose an efficient multi-view stereo (MVS) network for inferring depth values from multiple RGB images.
We introduce the self-attention mechanism to fully aggregate the dominant information from input images.
We also introduce the group-wise correlation to feature aggregation, which greatly reduces the memory and calculation burden.
arXiv Detail & Related papers (2021-10-16T11:51:23Z)
- Hybrid attention network based on progressive embedding scale-context for crowd counting [25.866856497266884]
We propose a Hybrid Attention Network (HAN) by employing Progressive Embedding Scale-context (PES) information.
We build the hybrid attention mechanism by placing spatial attention and channel attention modules in parallel.
PES information enables the network to simultaneously suppress noise and adapt to head scale variation.
arXiv Detail & Related papers (2021-06-04T08:10:21Z)
- PSCNet: Pyramidal Scale and Global Context Guided Network for Crowd Counting [44.306790250158954]
This paper proposes a novel crowd counting approach based on a pyramidal scale module (PSM) and a global context module (GCM).
PSM is used to adaptively capture multi-scale information, which can identify a fine boundary of crowds with different image scales.
GCM is designed in a low-complexity, lightweight manner to make the exchange of information across the channels of the feature maps more efficient.
arXiv Detail & Related papers (2020-12-07T11:35:56Z)
- Joint Self-Attention and Scale-Aggregation for Self-Calibrated Deraining Network [13.628218953897946]
In this paper, we propose an effective algorithm, called JDNet, to solve the single image deraining problem.
By designing the Scale-Aggregation and Self-Attention modules with Self-Calibrated convolution skillfully, the proposed model has better deraining results.
arXiv Detail & Related papers (2020-08-06T17:04:34Z)
- Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experiment results show SHSR achieves superior quantitative performance and visual quality to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.