Frequency Domain Nuances Mining for Visible-Infrared Person
Re-identification
- URL: http://arxiv.org/abs/2401.02162v2
- Date: Wed, 10 Jan 2024 01:59:28 GMT
- Title: Frequency Domain Nuances Mining for Visible-Infrared Person
Re-identification
- Authors: Yukang Zhang, Yang Lu, Yan Yan, Hanzi Wang, Xuelong Li
- Abstract summary: Existing methods mainly exploit the spatial information while ignoring the discriminative frequency information.
We propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information.
Our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset.
- Score: 75.87443138635432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The key to visible-infrared person re-identification (VIReID) lies
in minimizing the modality discrepancy between visible and infrared images. Existing
methods mainly exploit the spatial information while ignoring the
discriminative frequency information. To address this issue, this paper aims to
reduce the modality discrepancy from the frequency domain perspective.
Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method
to explore the cross-modality frequency domain information, which mainly
includes an amplitude guided phase (AGP) module and an amplitude nuances mining
(ANM) module. These two modules are mutually beneficial to jointly explore
frequency domain visible-infrared nuances, thereby effectively reducing the
modality discrepancy in the frequency domain. Besides, we propose a
center-guided nuances mining loss to encourage the ANM module to preserve
discriminative identity information while discovering diverse cross-modality
nuances. Extensive experiments show that the proposed FDNM has significant
advantages in improving the performance of VIReID. Specifically, our method
outperforms the second-best method by 5.2\% in Rank-1 accuracy and 5.8\% in mAP
on the SYSU-MM01 dataset under the indoor search mode. In addition, we
validate the effectiveness and generalization of our method on the
challenging visible-infrared face recognition task. The code will be
available.
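The abstract frames modality discrepancy in terms of frequency-domain information. As a minimal illustration of the amplitude/phase decomposition such methods build on (a from-scratch sketch, not the paper's AGP or ANM modules; all function names are ours), the block below splits a signal's discrete Fourier spectrum into amplitude and phase and reconstructs the signal from the two:

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform (O(N^2), fine for illustration).
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    # Inverse DFT; input is real-valued, so we keep only the real part.
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

signal = [1.0, 2.0, 3.0, 4.0]
spectrum = dft(signal)

# Decompose each frequency bin into amplitude and phase. Recombining
# amplitude from one modality with phase from another is the kind of
# operation frequency-domain VIReID methods are built around.
amplitude = [abs(c) for c in spectrum]
phase = [cmath.phase(c) for c in spectrum]

recombined = [a * cmath.exp(1j * p) for a, p in zip(amplitude, phase)]
reconstructed = idft(recombined)
assert all(abs(r - s) < 1e-9 for r, s in zip(reconstructed, signal))
```

In practice such methods operate on 2-D image spectra (e.g. via `numpy.fft.fft2`), but the amplitude/phase split is the same per-bin operation shown here.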
Related papers
- Frequency Domain Modality-invariant Feature Learning for
Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phase-Preserving Normalization (PPNorm).
arXiv Detail & Related papers (2024-01-03T17:11:27Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and
Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
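The UFAFormer summary above leans on the discrete wavelet transform to obtain frequency sub-bands. As a hedged sketch of that idea (a single-level 1-D Haar DWT written from scratch, not the paper's code), the block below splits a signal into a low-frequency approximation band and a high-frequency detail band, then reconstructs it exactly:

```python
def haar_dwt(x):
    # One level of the Haar discrete wavelet transform: pairwise
    # averages (low-pass sub-band) and differences (high-pass sub-band).
    lo = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    hi = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return lo, hi

def haar_idwt(lo, hi):
    # Exact inverse: each (average, difference) pair recovers two samples.
    out = []
    for a, d in zip(lo, hi):
        out += [a + d, a - d]
    return out

x = [4.0, 2.0, 5.0, 7.0]
lo, hi = haar_dwt(x)   # lo = [3.0, 6.0], hi = [1.0, -1.0]
assert haar_idwt(lo, hi) == x
```

For images, the same idea is applied separably along rows and columns (e.g. `pywt.dwt2` in PyWavelets), yielding the LL/LH/HL/HH sub-bands that forgery detectors mine for artifacts.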
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle
Re-identification [29.48387524901101]
In harsh environments, the discriminative cues in the RGB and NIR modalities are often lost due to strong flares from vehicle lamps or sunlight.
We propose a Flare-Aware Cross-modal Enhancement Network that adaptively restores flare-corrupted RGB and NIR features with guidance from the flare-immunized thermal infrared spectrum.
arXiv Detail & Related papers (2023-05-23T04:04:24Z)
- Visible-Infrared Person Re-Identification Using Privileged Intermediate
Information [10.816003787786766]
Cross-modal person re-identification (ReID) is challenging due to the large domain shift in data distributions between RGB and IR modalities.
This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains.
We devised a new method to generate images between visible and infrared domains that provide additional information to train a deep ReID model.
arXiv Detail & Related papers (2022-09-19T21:08:14Z)
- CycleTrans: Learning Neutral yet Discriminative Features for
Visible-Infrared Person Re-Identification [79.84912525821255]
Visible-infrared person re-identification (VI-ReID) is a task of matching the same individuals across the visible and infrared modalities.
Existing VI-ReID methods mainly focus on learning general features across modalities, often at the expense of feature discriminability.
We present a novel cycle-construction-based network for neutral yet discriminative feature learning, termed CycleTrans.
arXiv Detail & Related papers (2022-08-21T08:41:40Z)
- Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in
VIS and NIR Scenario [87.72258480670627]
Existing face forgery detection methods based on frequency domain find that the GAN forged images have obvious grid-like visual artifacts in the frequency spectrum compared to the real images.
This paper proposes a Discrete Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
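FCAN-DCT builds its forgery clues on the discrete cosine transform. As a minimal from-scratch sketch of that transform (a 1-D DCT-II and its matching inverse; not the network itself, and the function names are ours), the block below verifies that the transform is exactly invertible:

```python
import math

def dct2(x):
    # DCT-II: the transform underlying DCT-based frequency features.
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def idct2(X):
    # Inverse of the above (a DCT-III with the matching normalization).
    N = len(X)
    return [X[0] / N + (2 / N) * sum(X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                                     for k in range(1, N))
            for n in range(N)]

x = [1.0, 3.0, 2.0, 5.0]
roundtrip = idct2(dct2(x))
assert all(abs(a - b) < 1e-9 for a, b in zip(roundtrip, x))
```

Forgery detectors typically apply the 2-D separable version of this transform blockwise (as in JPEG) and inspect the resulting coefficient grid for GAN-induced artifacts.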
arXiv Detail & Related papers (2022-07-05T09:27:53Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to Adaptively learn Frequency information in a two-branch Detection framework, dubbed AFD.
We liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers.
arXiv Detail & Related papers (2022-03-27T14:25:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.