Visible-Infrared Person Re-Identification Using Privileged Intermediate
Information
- URL: http://arxiv.org/abs/2209.09348v1
- Date: Mon, 19 Sep 2022 21:08:14 GMT
- Title: Visible-Infrared Person Re-Identification Using Privileged Intermediate
Information
- Authors: Mahdi Alehdaghi, Arthur Josi, Rafael M. O. Cruz and Eric Granger
- Abstract summary: Cross-modal person re-identification (ReID) is challenging due to the large domain shift in data distributions between RGB and IR modalities.
This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains.
We devised a new method to generate images between visible and infrared domains that provide additional information to train a deep ReID model.
- Score: 10.816003787786766
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visible-infrared person re-identification (ReID) aims to recognize the same
person of interest across a network of RGB and IR cameras. Some deep learning
(DL) models have directly incorporated both modalities to discriminate persons
in a joint representation space. However, this cross-modal ReID problem remains
challenging due to the large domain shift in data distributions between RGB and
IR modalities. This paper introduces a novel approach for creating an
intermediate virtual domain that acts as a bridge between the two main domains
(i.e., RGB and IR modalities) during training. This intermediate domain is
considered as privileged information (PI) that is unavailable at test time, and
allows formulating this cross-modal matching task as a problem in learning
under privileged information (LUPI). We devised a new method to generate images
between visible and infrared domains that provide additional information to
train a deep ReID model through an intermediate domain adaptation. In
particular, by employing color-free and multi-step triplet loss objectives
during training, our method provides common feature representation spaces that
are robust to large visible-infrared domain shifts. Experimental results on
challenging visible-infrared ReID datasets indicate that our proposed approach
consistently improves matching accuracy, without any computational overhead at
test time. The code is available at:
https://github.com/alehdaghi/Cross-Modal-Re-ID-via-LUPI
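The abstract mentions color-free intermediate images and a multi-step triplet loss but does not spell out either. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `color_free_image` uses a random convex combination of the RGB channels as a stand-in for the paper's intermediate-image generator, and `multi_step_triplet` is a hypothetical way to chain a standard triplet margin loss from visible to intermediate to infrared features. All function names and the margin value are assumptions.

```python
import numpy as np


def color_free_image(rgb, rng):
    """Collapse an H x W x 3 RGB image to a 3-channel color-free image.

    Assumption: a random convex combination of the R, G, B channels stands in
    for the paper's intermediate-image generator, which is not described here.
    """
    weights = rng.dirichlet(np.ones(3))            # non-negative, sums to 1
    gray = rgb.astype(np.float64) @ weights        # H x W
    return np.repeat(gray[..., None], 3, axis=2)   # replicate to 3 channels


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet margin loss on batches of feature vectors (N x D)."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def multi_step_triplet(f_vis, f_mid, f_ir, f_neg, margin=0.3):
    """Hypothetical two-step objective: visible -> intermediate -> infrared."""
    return (triplet_loss(f_vis, f_mid, f_neg, margin)
            + triplet_loss(f_mid, f_ir, f_neg, margin))
```

Because the intermediate images are needed only to compute training losses, a sketch like this adds nothing to the inference path, which is consistent with the abstract's claim of no computational overhead at test time.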
Related papers
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
- Frequency Domain Nuances Mining for Visible-Infrared Person Re-identification [75.87443138635432]
Existing methods mainly exploit the spatial information while ignoring the discriminative frequency information.
We propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information.
Our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset.
arXiv Detail & Related papers (2024-01-04T09:19:54Z)
- Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification [11.93952924941977]
This paper introduces the Adaptive Generation of Privileged Intermediate Information training approach.
AGPI2 is introduced to adapt and generate a virtual domain that bridges discriminant information between the V and I modalities.
Experimental results conducted on challenging V-I ReID indicate that AGPI2 increases matching accuracy without extra computational resources.
arXiv Detail & Related papers (2023-07-06T18:08:36Z)
- CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection [144.66411561224507]
We present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement.
Our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-10-06T11:59:19Z)
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection [67.33924278729903]
In this work, we propose a Dual Swin-Transformer based Mutual Interactive Network (DTMINet).
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification [38.96033760300123]
We propose a cross-modality transformer-based method (CMTR) for the visible-infrared person re-identification task.
We design the novel modality embeddings, which are fused with token embeddings to encode modalities' information.
Our proposed CMTR model's performance significantly surpasses existing outstanding CNN-based methods.
arXiv Detail & Related papers (2021-10-18T03:12:59Z)
- Self-Supervised Representation Learning for RGB-D Salient Object Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets to perform pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as a cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Cross-Spectrum Dual-Subspace Pairing for RGB-infrared Cross-Modality Person Re-Identification [15.475897856494583]
Conventional person re-identification can only handle RGB color images, and therefore fails in dark conditions.
RGB-infrared ReID (also known as Infrared-Visible ReID or Visible-Thermal ReID) is proposed.
In this paper, a novel multi-spectrum image generation method is proposed and the generated samples are utilized to help the network to find discriminative information.
arXiv Detail & Related papers (2020-02-29T09:01:39Z)
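Several entries above report results as Rank-1 accuracy and mAP. As a rough illustration of how these two retrieval metrics are typically computed from a query-to-gallery distance matrix, here is a minimal sketch. The function name and inputs are assumptions, and it omits benchmark-specific details such as camera-based gallery filtering that real visible-infrared ReID protocols may apply.

```python
import numpy as np


def rank1_and_map(dist, query_ids, gallery_ids):
    """Rank-1 accuracy and mean average precision for a retrieval task.

    dist: Q x G array of distances (smaller means more similar).
    query_ids, gallery_ids: integer identity labels.
    """
    dist = np.asarray(dist, dtype=np.float64)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                    # closest gallery item first
        matches = gallery_ids[order] == query_ids[q]   # relevance at each rank
        if not matches.any():
            continue                                   # no true match in gallery
        rank1_hits.append(float(matches[0]))
        hit_ranks = np.flatnonzero(matches) + 1        # 1-based ranks of correct items
        precision_at_hits = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precision_at_hits.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))
```

In practice `dist` would come from the learned feature embeddings, for example Euclidean or cosine distances between infrared query features and visible gallery features.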