High-Order Structure Based Middle-Feature Learning for Visible-Infrared
Person Re-Identification
- URL: http://arxiv.org/abs/2312.07853v2
- Date: Thu, 14 Dec 2023 02:05:03 GMT
- Title: High-Order Structure Based Middle-Feature Learning for Visible-Infrared
Person Re-Identification
- Authors: Liuxiang Qiu, Si Chen, Yan Yan, Jing-Hao Xue, Da-Han Wang, Shunzhi Zhu
- Abstract summary: Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same persons captured by visible (VIS) and infrared (IR) cameras.
Existing VI-ReID methods ignore the high-order structure information of features and have difficulty learning a reasonable common feature space.
We propose a novel high-order structure based middle-feature learning network (HOS-Net) for effective VI-ReID.
- Score: 37.954344873390106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visible-infrared person re-identification (VI-ReID) aims to retrieve images
of the same persons captured by visible (VIS) and infrared (IR) cameras.
Existing VI-ReID methods ignore the high-order structure information of features
and have difficulty learning a reasonable common feature space due
to the large modality discrepancy between VIS and IR images. To address the
above problems, we propose a novel high-order structure based middle-feature
learning network (HOS-Net) for effective VI-ReID. Specifically, we first
leverage a short- and long-range feature extraction (SLE) module to effectively
exploit both short-range and long-range features. Then, we propose a high-order
structure learning (HSL) module to successfully model the high-order
relationship across different local features of each person image based on a
whitened hypergraph network. This greatly alleviates model collapse and enhances
feature representations. Finally, we develop a common feature space learning
(CFL) module to learn a discriminative and reasonable common feature space
based on middle features generated by aligning features from different
modalities and ranges. In particular, a modality-range identity-center
contrastive (MRIC) loss is proposed to reduce the distances between the VIS,
IR, and middle features, smoothing the training process. Extensive experiments
on the SYSU-MM01, RegDB, and LLCM datasets show that our HOS-Net achieves
state-of-the-art performance. Our code is available at
https://github.com/Jaulaucoeng/HOS-Net.
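The abstract says the MRIC loss reduces the distances between the VIS, IR, and middle features around identity centers. The following is a minimal sketch of that "pull together" idea only, not the paper's actual loss: the function names `identity_centers` and `center_pull_loss`, the use of squared Euclidean distance, and the assumption that all three feature groups share one label list are hypothetical choices made for illustration. Any contrastive (push-apart) term and the way middle features are generated are left out.

```python
from collections import defaultdict
from math import dist


def identity_centers(features, labels):
    """Return {identity: mean feature vector} for one group of features."""
    grouped = defaultdict(list)
    for vec, pid in zip(features, labels):
        grouped[pid].append(vec)
    return {
        pid: [sum(col) / len(vecs) for col in zip(*vecs)]
        for pid, vecs in grouped.items()
    }


def center_pull_loss(vis, ir, mid, labels):
    """Mean squared distance between per-identity centers of three groups.

    Hypothetical simplification: the i-th sample of every group is assumed
    to carry labels[i], and only the distance-reducing part is modeled.
    """
    centers = [identity_centers(group, labels) for group in (vis, ir, mid)]
    pairs = [(0, 1), (0, 2), (1, 2)]  # VIS-IR, VIS-middle, IR-middle
    total, count = 0.0, 0
    for pid in centers[0]:
        for a, b in pairs:
            total += dist(centers[a][pid], centers[b][pid]) ** 2
            count += 1
    return total / count if count else 0.0
```

With this sketch the value is zero when the three centers of every identity coincide and grows as the VIS, IR, and middle centers drift apart, which is the behavior the abstract attributes to the distance-reducing part of the loss.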
Related papers
- Multi-Scale Direction-Aware Network for Infrared Small Target Detection [2.661766509317245]
Infrared small target detection faces the problem that it is difficult to effectively separate the background and the target.
We propose a multi-scale direction-aware network (MSDA-Net) to integrate the high-frequency directional features of infrared small targets.
MSDA-Net achieves state-of-the-art (SOTA) results on the public NUDT-SIRST, SIRST and IRSTD-1k datasets.
arXiv Detail & Related papers (2024-06-04T07:23:09Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for
Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- Neural Feature Search for RGB-Infrared Person Re-Identification [3.499870393443268]
We study a general paradigm, termed Neural Feature Search (NFS), to automate the process of feature selection.
NFS combines a dual-level feature search space and a differentiable search strategy to jointly select identity-related cues in coarse-grained channels and fine-grained spatial pixels.
Our method outperforms state-of-the-art methods on mainstream benchmarks.
arXiv Detail & Related papers (2021-04-06T08:40:44Z)
- High-resolution Depth Maps Imaging via Attention-based Hierarchical
Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
- SFANet: A Spectrum-aware Feature Augmentation Network for
Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
Learning with grayscale-spectrum images, our model can apparently reduce modality discrepancy and detect inner structure relations.
At the feature level, we improve the conventional two-stream network by balancing the number of specific and sharable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
- Hybrid-Attention Guided Network with Multiple Resolution Features for
Person Re-Identification [30.285126447140254]
We present a novel person re-ID model that fuses high- and low-level embeddings to reduce the information loss caused in learning high-level features.
We also introduce spatial and channel attention mechanisms in our model, which aim to mine more discriminative features related to the target.
arXiv Detail & Related papers (2020-09-16T08:12:42Z)
- Sequential Hierarchical Learning with Distribution Transformation for
Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experimental results show that SHSR achieves superior quantitative performance and visual quality to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
- Weakly Supervised Attention Pyramid Convolutional Neural Network for
Fine-Grained Visual Classification [71.96618723152487]
We introduce the Attention Pyramid Convolutional Neural Network (AP-CNN).
AP-CNN learns both high-level semantic and low-level detailed feature representation.
It can be trained end-to-end, without the need for additional bounding box/part annotations.
arXiv Detail & Related papers (2020-02-09T12:33:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.