Person Re-identification via Attention Pyramid
- URL: http://arxiv.org/abs/2108.05340v1
- Date: Wed, 11 Aug 2021 17:33:36 GMT
- Title: Person Re-identification via Attention Pyramid
- Authors: Guangyi Chen, Tianpei Gu, Jiwen Lu, Jin-An Bao, and Jie Zhou
- Abstract summary: We propose an attention pyramid method for person re-identification.
Our attention pyramid exploits the attention regions in a multi-scale manner because human attention varies with different scales.
We evaluate our method on four large-scale person re-identification benchmarks including Market-1501, DukeMTMC, CUHK03, and MSMT17.
- Score: 74.80544921378998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an attention pyramid method for person
re-identification. Unlike conventional attention-based methods which only learn
a global attention map, our attention pyramid exploits the attention regions in
a multi-scale manner because human attention varies with different scales. Our
attention pyramid imitates the process of human visual perception which tends
to notice the foreground person over the cluttered background, and further
focus on the specific color of the shirt with close observation. Specifically,
we describe our attention pyramid by a "split-attend-merge-stack" principle. We
first split the features into multiple local parts and learn the corresponding
attentions. Then, we merge local attentions and stack these merged attentions
with the residual connection as an attention pyramid. The proposed attention
pyramid is a lightweight plug-and-play module that can be applied to
off-the-shelf models. We implement our attention pyramid method in two
different attention mechanisms including channel-wise attention and spatial
attention. We evaluate our method on four large-scale person re-identification
benchmarks including Market-1501, DukeMTMC, CUHK03, and MSMT17. Experimental
results demonstrate the superiority of our method, which outperforms the
state-of-the-art methods by a large margin with limited computational cost.
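Since the abstract walks through the mechanism step by step, a minimal sketch may help. The PyTorch code below is a hedged illustration of the "split-attend-merge-stack" principle with channel-wise attention; the SE-style attention block, the (4, 2, 1) part counts per level, and names such as ChannelAttention and AttentionPyramid are my own illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a "split-attend-merge-stack" attention pyramid.
# Assumptions (not from the paper): SE-style channel attention per part,
# horizontal splits along the height axis, and (4, 2, 1) parts per level.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style channel attention for one local part (hypothetical design)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                          # squeeze: pool the part to 1x1
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),                                     # excitation: weights in (0, 1)
        )

    def forward(self, x):
        return self.fc(x)                                     # shape (B, C, 1, 1)


class AttentionPyramid(nn.Module):
    """Split -> attend -> merge -> stack, applied level by level."""

    def __init__(self, channels: int, parts_per_level=(4, 2, 1)):
        super().__init__()
        self.parts_per_level = parts_per_level
        self.levels = nn.ModuleList(
            nn.ModuleList(ChannelAttention(channels) for _ in range(p))
            for p in parts_per_level
        )

    def forward(self, x):
        for parts, attns in zip(self.parts_per_level, self.levels):
            # Split: cut the feature map into horizontal stripes.
            stripes = torch.chunk(x, parts, dim=2)
            # Attend: learn a channel attention map for each stripe.
            local = [a(s).expand_as(s) for s, a in zip(stripes, attns)]
            # Merge: concatenate the local attentions back to full size.
            merged = torch.cat(local, dim=2)
            # Stack: apply the merged attention with a residual connection,
            # feeding the refined features into the next (coarser) level.
            x = x + x * merged
        return x


if __name__ == "__main__":
    feats = torch.randn(2, 256, 24, 8)          # a typical ReID backbone feature map
    print(AttentionPyramid(256)(feats).shape)   # torch.Size([2, 256, 24, 8])
```

The residual stacking (`x + x * merged`) keeps the module plug-and-play: with attention weights near zero the input passes through unchanged, so it can be dropped into an off-the-shelf backbone without retraining from scratch.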
Related papers
- Elliptical Attention [1.7597562616011944]
Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision.
We propose using a Mahalanobis distance metric for computing the attention weights to stretch the underlying feature space in directions of high contextual relevance (a generic sketch of this substitution appears after the list below).
arXiv Detail & Related papers (2024-06-19T18:38:11Z) - Your "Attention" Deserves Attention: A Self-Diversified Multi-Channel
Attention for Facial Action Analysis [12.544285462327839]
We propose a compact model to enhance the representational and focusing power of neural attention maps.
The proposed method is evaluated on two benchmark databases (BP4D and DISFA) for AU detection and four databases (CK+, MMI, BU-3DFE, and BP4D+) for facial expression recognition.
It achieves superior performance compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-03-23T17:29:51Z) - Boosting Crowd Counting via Multifaceted Attention [109.89185492364386]
Large-scale variations often exist within crowd images.
Neither the fixed-size convolution kernels of CNNs nor the fixed-size attention of recent vision transformers can handle this kind of variation.
We propose a Multifaceted Attention Network (MAN) to improve transformer models in local spatial relation encoding.
arXiv Detail & Related papers (2022-03-05T01:36:43Z) - Deep Rank-Consistent Pyramid Model for Enhanced Crowd Counting [48.15210212256114]
We propose a Deep Rank-consistEnt pyrAmid Model (DREAM), which makes full use of rank consistency across coarse-to-fine pyramid features in latent spaces for enhanced crowd counting with massive unlabeled images.
In addition, we have collected a new unlabeled crowd counting dataset, FUDAN-UCC, comprising 4,000 images for training purposes.
arXiv Detail & Related papers (2022-01-13T07:25:06Z) - Multi-Level Attention for Unsupervised Person Re-Identification [9.529435737056179]
In unsupervised person re-identification, attention modules built on multi-headed self-attention suffer from attention spreading because no ground-truth labels are available.
We design a pixel-level attention module to provide constraints for the multi-headed self-attention.
Because the identification targets in person re-identification data are all pedestrians, we also design a domain-level attention module.
arXiv Detail & Related papers (2022-01-10T02:47:06Z) - Learning to ignore: rethinking attention in CNNs [87.01305532842878]
We propose to reformulate the attention mechanism in CNNs to learn to ignore instead of learning to attend.
Specifically, we propose to explicitly learn irrelevant information in the scene and suppress it in the produced representation.
arXiv Detail & Related papers (2021-11-10T13:47:37Z) - Detection of Deepfake Videos Using Long Distance Attention [73.6659488380372]
Most existing detection methods treat the problem as a vanilla binary classification problem.
In this paper, the problem is treated as a special fine-grained classification problem since the differences between fake and real faces are very subtle.
A spatial-temporal model is proposed with two components that capture spatial and temporal forgery traces from a global perspective.
arXiv Detail & Related papers (2021-06-24T08:33:32Z) - Rethinking of the Image Salient Object Detection: Object-level Semantic
Saliency Re-ranking First, Pixel-wise Saliency Refinement Latter [62.26677215668959]
We propose a lightweight, weakly supervised deep network to coarsely locate semantically salient regions.
We then fuse multiple off-the-shelf deep models on these semantically salient regions as the pixel-wise saliency refinement.
Our method is simple yet effective, and is the first attempt to treat salient object detection mainly as an object-level semantic re-ranking problem.
arXiv Detail & Related papers (2020-08-10T07:12:43Z) - Telling BERT's full story: from Local Attention to Global Aggregation [14.92157586545743]
We take a deep look into the behavior of self-attention heads in the transformer architecture.
We show that attention distributions can nevertheless provide insights into the local behavior of attention heads.
arXiv Detail & Related papers (2020-04-10T01:36:41Z) - Deep Attention Aware Feature Learning for Person Re-Identification [22.107332426681072]
We propose to incorporate attention learning as additional objectives in a person ReID network without changing the original structure.
We have tested its performance on two typical networks (TriNet and Bag of Tricks) and observed significant performance improvement on five widely used datasets.
arXiv Detail & Related papers (2020-03-01T16:27:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences.