Scene-Adaptive Attention Network for Crowd Counting
- URL: http://arxiv.org/abs/2112.15509v1
- Date: Fri, 31 Dec 2021 15:03:17 GMT
- Title: Scene-Adaptive Attention Network for Crowd Counting
- Authors: Xing Wei, Yuanrui Kang, Jihao Yang, Yunfeng Qiu, Dahu Shi, Wenming
Tan, Yihong Gong
- Abstract summary: This paper proposes a scene-adaptive attention network, termed SAANet.
We design a deformable attention in-built Transformer backbone, which learns adaptive feature representations with deformable sampling locations and dynamic attention weights.
We conduct extensive experiments on four challenging crowd counting benchmarks, demonstrating that our method achieves state-of-the-art performance.
- Score: 31.29858034122248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, significant progress has been made in crowd
counting research. However, because of the challenging scale variations and
complex scenes found in crowds, neither traditional convolutional networks nor
recent Transformer architectures with fixed-size attention can handle the task well.
To address this problem, this paper proposes a scene-adaptive attention
network, termed SAANet. First of all, we design a deformable attention in-built
Transformer backbone, which learns adaptive feature representations with
deformable sampling locations and dynamic attention weights. We then propose
multi-level feature fusion and count-attentive feature enhancement modules
to further strengthen the feature representation under the global image context.
The learned representations could attend to the foreground and are adaptive to
different scales of crowds. We conduct extensive experiments on four
challenging crowd counting benchmarks, demonstrating that our method achieves
state-of-the-art performance. Notably, our method currently ranks No. 1 on
the public leaderboard of the NWPU-Crowd benchmark. We hope our method can serve as
a strong baseline to support future research in crowd counting. The source code
will be released to the community.
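The core idea of the deformable-attention backbone described above, that each query predicts its own sampling locations and attention weights instead of attending over a fixed grid, can be sketched as follows. This is a minimal single-head NumPy illustration of generic deformable attention, not the authors' released implementation; all function and variable names here are hypothetical.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample feat of shape (H, W, C) at continuous coords (y, x)."""
    H, W, _ = feat.shape
    y, x = np.clip(y, 0, H - 1), np.clip(x, 0, W - 1)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feat[y0, x0]
            + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0]
            + wy * wx * feat[y1, x1])

def deformable_attention(feat, ref_points, offsets, weights):
    """feat: (H, W, C) feature map; ref_points: (Q, 2) reference (y, x)
    per query; offsets: (Q, K, 2) learned sampling offsets; weights:
    (Q, K) attention weights summing to 1 per query. Returns (Q, C):
    the attention-weighted sum of features sampled at the K deformed
    locations around each query's reference point."""
    Q, K, _ = offsets.shape
    out = np.zeros((Q, feat.shape[2]))
    for q in range(Q):
        for k in range(K):
            y, x = ref_points[q] + offsets[q, k]
            out[q] += weights[q, k] * bilinear_sample(feat, y, x)
    return out

# Toy usage: one query sampling K=4 deformed points on an 8x8 map.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 16))
ref = np.array([[4.0, 4.0]])
offs = rng.normal(scale=1.5, size=(1, 4, 2))  # stands in for learned offsets
w = np.full((1, 4), 0.25)                     # uniform weights for illustration
out = deformable_attention(feat, ref, offs, w)
print(out.shape)  # (1, 16)
```

In a trained network the offsets and weights would be predicted from the query features by small linear layers, which is what lets the attention pattern adapt per scene rather than stay fixed.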
Related papers
- Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM [55.93697196726016]
We propose a simple yet effective crowd counting method by utilizing the Segment-Everything-Everywhere Model (SEEM).
We show that SEEM's performance in dense crowd scenes is limited, primarily due to the omission of many persons in high-density areas.
Our proposed method achieves the best unsupervised performance in crowd counting, while also being comparable to some supervised methods.
arXiv Detail & Related papers (2024-02-27T13:55:17Z) - Gramformer: Learning Crowd Counting via Graph-Modulated Transformer [68.26599222077466]
Gramformer is a graph-modulated transformer to enhance the network by adjusting the attention and input node features respectively.
A feature-based encoding is proposed to discover the centrality positions or importance of nodes.
Experiments on four challenging crowd counting datasets have validated the competitiveness of the proposed method.
arXiv Detail & Related papers (2024-01-08T13:01:54Z) - CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model [60.30099369475092]
Supervised crowd counting relies heavily on costly manual labeling.
We propose a novel unsupervised framework for crowd counting, named CrowdCLIP.
CrowdCLIP achieves superior performance compared to previous unsupervised state-of-the-art counting methods.
arXiv Detail & Related papers (2023-04-09T12:56:54Z) - Crowd counting with segmentation attention convolutional neural network [20.315829094519128]
We propose a novel convolutional neural network architecture called SegCrowdNet.
SegCrowdNet adaptively highlights the human head region and suppresses the non-head region by segmentation.
SegCrowdNet achieves excellent performance compared with the state-of-the-art methods.
arXiv Detail & Related papers (2022-04-15T08:40:38Z) - CrowdFormer: Weakly-supervised Crowd counting with Improved
Generalizability [2.8174125805742416]
We propose a weakly-supervised method for crowd counting using a pyramid vision transformer.
Our method is comparable to the state-of-the-art on the benchmark crowd datasets.
arXiv Detail & Related papers (2022-03-07T23:10:40Z) - Boosting Crowd Counting via Multifaceted Attention [109.89185492364386]
Large-scale variations often exist within crowd images.
Neither the fixed-size convolution kernels of CNNs nor the fixed-size attention of recent vision transformers can handle this kind of variation.
We propose a Multifaceted Attention Network (MAN) to improve transformer models in local spatial relation encoding.
arXiv Detail & Related papers (2022-03-05T01:36:43Z) - Fine-grained Domain Adaptive Crowd Counting via Point-derived
Segmentation [40.17242574440061]
We propose to untangle domain-invariant crowd and domain-specific background from crowd images.
Specifically, to disentangle crowd from background, we propose to learn crowd segmentation from point-level crowd counting annotations.
Based on the derived segmentation, we design a crowd-aware domain adaptation mechanism consisting of two crowd-aware adaptation modules.
arXiv Detail & Related papers (2021-08-06T07:16:48Z) - Congested Crowd Instance Localization with Dilated Convolutional Swin
Transformer [119.72951028190586]
Crowd localization is a new computer vision task that evolved from crowd counting.
In this paper, we focus on how to achieve precise instance localization in high-density crowd scenes.
We propose a Dilated Convolutional Swin Transformer (DCST) for congested crowd scenes.
arXiv Detail & Related papers (2021-08-02T01:27:53Z) - Crowd Counting via Perspective-Guided Fractional-Dilation Convolution [75.36662947203192]
This paper proposes a novel convolutional neural network-based crowd counting method, termed the Perspective-guided Fractional-Dilation Network (PFDNet).
By modeling the continuous scale variations, the proposed PFDNet is able to select the proper fractional dilation kernels for adapting to different spatial locations.
It significantly improves on the flexibility of state-of-the-art methods that only consider discrete representative scales.
arXiv Detail & Related papers (2021-07-08T07:57:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.