Hybrid-Attention Guided Network with Multiple Resolution Features for
Person Re-Identification
- URL: http://arxiv.org/abs/2009.07536v2
- Date: Sun, 6 Jun 2021 03:05:03 GMT
- Title: Hybrid-Attention Guided Network with Multiple Resolution Features for
Person Re-Identification
- Authors: Guoqing Zhang, Junchuan Yang, Yuhui Zheng, Yi Wu, Shengyong Chen
- Abstract summary: We present a novel person re-ID model that fuses high- and low-level embeddings to reduce the information loss caused in learning high-level features.
We also introduce spatial and channel attention mechanisms in our model, which aim to mine more discriminative features related to the target.
- Score: 30.285126447140254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extracting effective and discriminative features is very important for
addressing the challenging person re-identification (re-ID) task. Prevailing
deep convolutional neural networks (CNNs) usually use high-level features for
identifying pedestrians. However, some essential spatial information residing in
low-level features, such as shape, texture and color, will be lost when learning
the high-level features, due to extensive padding and pooling operations in the
training stage. In addition, most existing person re-ID methods are mainly
based on hand-crafted bounding boxes in which images are precisely aligned. This is
unrealistic in practical applications, since the object detection
algorithms used often produce inaccurate bounding boxes. This will inevitably
degrade the performance of existing algorithms. To address these problems, we
put forward a novel person re-ID model that fuses high- and low-level
embeddings to reduce the information loss caused in learning high-level
features. Then we divide the fused embedding into several parts and reconnect
them to obtain the global feature and more significant local features, so as to
alleviate the effect of inaccurate bounding boxes. In addition, we
also introduce spatial and channel attention mechanisms in our model, which
aim to mine more discriminative features related to the target. Finally, we
reconstruct the feature extractor to ensure that our model can obtain
richer and more robust features. Extensive experiments demonstrate the superiority of
our approach compared with existing approaches. Our code is available at
https://github.com/libraflower/MutipleFeature-for-PRID.
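The pipeline outlined in the abstract (fuse low- and high-level features, apply channel and spatial attention, then derive a global feature and several local part features) can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the repository linked above. Every function name here is hypothetical, and several simplifications are assumptions made for illustration: the attention gates are parameter-free sigmoids rather than learned layers, resolutions are matched by 2x2 average pooling, fusion is plain channel concatenation, and parts are taken as equal horizontal stripes.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a sigmoid gate of its global average.

    fmap is a list of C channels, each an H x W list of lists.
    Assumption: a learned gate would normally replace this fixed one.
    """
    out = []
    for ch in fmap:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out


def spatial_attention(fmap):
    """Scale each location by a sigmoid gate of its mean across channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * gate[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def downsample2x(fmap):
    """2x2 average pooling; assumes H and W are even."""
    return [[[(ch[i][j] + ch[i][j + 1] + ch[i + 1][j] + ch[i + 1][j + 1]) / 4.0
              for j in range(0, len(ch[0]), 2)]
             for i in range(0, len(ch), 2)] for ch in fmap]


def fuse(low, high):
    """Match the low-level map to the high-level resolution, then
    concatenate along the channel axis (an assumed fusion rule)."""
    low_small = downsample2x(low)
    assert len(low_small[0]) == len(high[0])
    assert len(low_small[0][0]) == len(high[0][0])
    return low_small + high


def pool_rows(fmap, r0, r1):
    """Average every channel over rows r0..r1-1, giving a length-C vector."""
    return [sum(sum(row) for row in ch[r0:r1]) / ((r1 - r0) * len(ch[0]))
            for ch in fmap]


def global_and_part_features(fmap, num_parts):
    """Return one global vector plus one vector per horizontal stripe.

    Rows left over when H is not divisible by num_parts are ignored
    by the part features (but not by the global feature).
    """
    h = len(fmap[0])
    step = h // num_parts
    parts = [pool_rows(fmap, p * step, (p + 1) * step)
             for p in range(num_parts)]
    return pool_rows(fmap, 0, h), parts


# Toy input: a 2-channel 4x4 low-level map and a 1-channel 2x2 high-level map.
low = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
high = [[[0.0, 0.0], [2.0, 2.0]]]

fused = spatial_attention(channel_attention(fuse(low, high)))
global_feat, part_feats = global_and_part_features(fused, num_parts=2)
print(len(global_feat), len(part_feats))
```

Splitting into stripes means that a shifted or loosely cropped bounding box disturbs only some of the part features, which is one plausible reading of how part-based features reduce sensitivity to inaccurate detections.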
Related papers
- Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization [23.78498670529746]
We introduce a regularization technique to ensure that the magnitudes of the extracted features are evenly distributed.
Despite its apparent simplicity, our approach has demonstrated significant performance improvements across various fine-grained visual recognition datasets.
arXiv Detail & Related papers (2024-09-03T07:32:46Z)
- High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-Identification [37.954344873390106]
Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same persons captured by visible (VIS) and infrared (IR) cameras.
Existing VI-ReID methods ignore high-order structure information of features and have relative difficulty learning a reasonable common feature space.
We propose a novel high-order structure based middle-feature learning network (HOS-Net) for effective VI-ReID.
arXiv Detail & Related papers (2023-12-13T02:48:03Z)
- Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization [51.206398602941405]
We propose to disentangle an original high-dimensional feature into multiple sub-features.
On top of the disentangled sub-features, we learn an auxiliary feature to enhance the sub-features.
Our method achieves 90.1% TOP-100 mAP on the large-scale SVD dataset and also sets the new state-of-the-art on the VCSL benchmark dataset.
arXiv Detail & Related papers (2023-09-13T10:53:12Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z)
- Multi-attentional Deepfake Detection [79.80308897734491]
Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns.
We propose a new multi-attentional deepfake detection network. Specifically, it consists of three key components: 1) multiple spatial attention heads that make the network attend to different local parts; 2) a textural feature enhancement block that zooms in on the subtle artifacts in shallow features; 3) aggregation of the low-level textural features and high-level semantic features, guided by the attention maps.
arXiv Detail & Related papers (2021-03-03T13:56:14Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- DFNet: Discriminative feature extraction and integration network for salient object detection [6.959742268104327]
We focus on two aspects of challenges in saliency detection using Convolutional Neural Networks.
Firstly, since salient objects appear in various sizes, using single-scale convolution would not capture the right size.
Secondly, using multi-level features helps the model use both local and global context.
arXiv Detail & Related papers (2020-04-03T13:56:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.