Multi-Level Attentive Convoluntional Neural Network for Crowd Counting
- URL: http://arxiv.org/abs/2105.11422v1
- Date: Mon, 24 May 2021 17:29:00 GMT
- Title: Multi-Level Attentive Convoluntional Neural Network for Crowd Counting
- Authors: Mengxiao Tian, Hao Guo, Chengjiang Long
- Abstract summary: We propose a multi-level attentive Convolutional Neural Network (MLAttnCNN) for crowd counting.
We extract high-level contextual information with multiple different scales applied in pooling.
We use multi-level attention modules to enrich the characteristics at different layers to achieve more efficient multi-scale feature fusion.
- Score: 12.61997540961144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Crowd counting has recently received growing attention. Counting in
high-density scenes in particular has become an important research topic, and
existing methods remain suboptimal for extremely dense crowds. In this paper,
we propose a multi-level attentive Convolutional Neural Network (MLAttnCNN)
for crowd counting. We extract high-level contextual information by pooling at
multiple scales, and use multi-level attention modules to enrich the features
at different layers for more efficient multi-scale feature fusion. The fused
features are then used to generate a more accurate density map with dilated
convolutions and a $1\times 1$ convolution. Extensive experiments on three
public datasets show that the proposed network outperforms state-of-the-art
approaches.
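The pipeline named in the abstract (multi-scale pooling, attention-weighted fusion, dilated convolution, density map) can be illustrated with a toy single-channel sketch in plain Python. This is not the authors' implementation: the helper names (`avg_pool`, `upsample`, `fuse`, `dilated_conv3x3`), the pooling scales 1, 2 and 4, the hand-picked attention scores, and the identity kernel are all assumptions chosen so the arithmetic is easy to follow. The one standard fact it relies on is that the estimated count is the sum of the density map.

```python
import math

def avg_pool(grid, k):
    """Non-overlapping k x k average pooling (sizes assumed divisible by k)."""
    h, w = len(grid), len(grid[0])
    return [[sum(grid[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k)) / (k * k)
             for j in range(w // k)]
            for i in range(h // k)]

def upsample(grid, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    h, w = len(grid), len(grid[0])
    return [[grid[i // k][j // k] for j in range(w * k)] for i in range(h * k)]

def fuse(branches, scores):
    """Weighted sum of same-sized maps; softmax(scores) stands in for attention."""
    exps = [math.exp(s) for s in scores]
    weights = [e / sum(exps) for e in exps]
    h, w = len(branches[0]), len(branches[0][0])
    return [[sum(wt * b[i][j] for wt, b in zip(weights, branches))
             for j in range(w)]
            for i in range(h)]

def dilated_conv3x3(grid, kernel, dilation):
    """3x3 convolution with zero padding; taps are spaced `dilation` cells apart."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for a in range(3):
                for b in range(3):
                    y = i + (a - 1) * dilation
                    x = j + (b - 1) * dilation
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += kernel[a][b] * grid[y][x]
    return out

def count(density):
    """The estimated crowd count is the sum over the density map."""
    return sum(sum(row) for row in density)

# Hypothetical 4x4 feature map whose values sum to 3 "people".
features = [[0, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 0]]

# Pool at several scales, then bring every branch back to 4x4.
branches = [upsample(avg_pool(features, k), k) for k in (1, 2, 4)]

# Hand-picked scores; a real attention module would learn these.
fused = fuse(branches, scores=[1.0, 0.5, 0.0])

# Identity kernel keeps the example checkable; a trained kernel would differ.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
density = dilated_conv3x3(fused, identity, dilation=2)
estimated_count = count(density)
```

Average pooling followed by nearest-neighbour upsampling preserves the total mass of a map, and the softmax weights sum to one, so in this toy setting the fused map still sums to the original count. A trained network has no such guarantee; it learns kernels and attention weights so that the density map sums to the annotated count.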
Related papers
- Scalable Multi-view Clustering via Explicit Kernel Features Maps [20.610589722626074]
A growing awareness of multi-view learning is a consequence of the increasing prevalence of multiple views in real-world applications.
An efficient optimization strategy is proposed, leveraging kernel feature maps to reduce the computational burden while maintaining good clustering performance.
We conduct extensive experiments on real-world benchmark networks of various sizes in order to evaluate the performance of our algorithm against state-of-the-art multi-view subspace clustering methods and attributed-network multi-view approaches.
arXiv Detail & Related papers (2024-02-07T12:35:31Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- Redesigning Multi-Scale Neural Network for Crowd Counting [68.674652984003]
We introduce a hierarchical mixture of density experts, which hierarchically merges multi-scale density maps for crowd counting.
Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales.
Experiments show that our method achieves the state-of-the-art performance on five public datasets.
arXiv Detail & Related papers (2022-08-04T21:49:29Z)
- Crowd counting with segmentation attention convolutional neural network [20.315829094519128]
We propose a novel convolutional neural network architecture called SegCrowdNet.
SegCrowdNet adaptively highlights the human head region and suppresses the non-head region by segmentation.
SegCrowdNet achieves excellent performance compared with the state-of-the-art methods.
arXiv Detail & Related papers (2022-04-15T08:40:38Z)
- Routing with Self-Attention for Multimodal Capsule Networks [108.85007719132618]
We present a new multimodal capsule network that allows us to leverage the strength of capsules in the context of a multimodal learning framework.
To adapt the capsules to large-scale input data, we propose a novel routing by self-attention mechanism that selects relevant capsules.
This allows not only for robust training with noisy video data, but also to scale up the size of the capsule network compared to traditional routing methods.
arXiv Detail & Related papers (2021-12-01T19:01:26Z)
- Multi-scale Matching Networks for Semantic Correspondence [38.904735120815346]
The proposed method achieves state-of-the-art performance on three popular benchmarks with high computational efficiency.
Our multi-scale matching network can be trained end-to-end easily with few additional learnable parameters.
arXiv Detail & Related papers (2021-07-31T10:57:24Z)
- Multi-Scale Context Aggregation Network with Attention-Guided for Crowd Counting [23.336181341124746]
Crowd counting aims to predict the number of people and generate the density map in the image.
There are many challenges, including varying head scales, the diversity of crowd distribution across images and cluttered backgrounds.
We propose a multi-scale context aggregation network (MSCANet) based on single-column encoder-decoder architecture for crowd counting.
arXiv Detail & Related papers (2021-04-06T02:24:06Z)
- Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting [109.32927895352685]
We introduce a large-scale RGBT Crowd Counting (RGBT-CC) benchmark, which contains 2,030 pairs of RGB-thermal images with 138,389 annotated people.
To facilitate the multimodal crowd counting, we propose a cross-modal collaborative representation learning framework.
Experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.
arXiv Detail & Related papers (2020-12-08T16:18:29Z)
- Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture which consists of parallel sub networks with totally different network architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z)
- Shallow Feature Based Dense Attention Network for Crowd Counting [103.67446852449551]
We propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images.
Our method outperforms other existing methods by a large margin, as is evident from a remarkable 11.9% Mean Absolute Error (MAE) drop of our SDANet.
arXiv Detail & Related papers (2020-06-17T13:34:42Z)
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)