G$^2$DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person
Re-Identification
- URL: http://arxiv.org/abs/2106.07853v1
- Date: Tue, 15 Jun 2021 03:14:31 GMT
- Title: G$^2$DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person
Re-Identification
- Authors: Lin Wan, Zongyuan Sun, Qianyan Jing, Yehansen Chen, Lijing Lu, and
Zhihang Li
- Abstract summary: RGB-IR person re-identification aims to retrieve a person of interest across heterogeneous modalities.
This paper presents a Geometry-Guided Dual-Alignment learning framework (G$^2$DA) to tackle sample-level modality difference.
- Score: 3.909938091041451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RGB-Infrared (IR) person re-identification aims to retrieve a
person of interest across heterogeneous modalities, and suffers from large
modality discrepancy caused by different sensory devices. Existing methods
mainly focus on global-level modality alignment while neglecting sample-level
modality divergence to some extent, which leads to performance degradation. This
paper approaches RGB-IR ReID by tackling sample-level
modality difference, and presents a Geometry-Guided Dual-Alignment learning
framework (G$^2$DA), which jointly enhances modality-invariance and reinforces
discriminability with human topological structure in features to boost the
overall matching performance. Specifically, G$^2$DA extracts accurate body part
features with a pose estimator, serving as a semantic bridge complementing the
missing local details in the global descriptor. Based on extracted local and global
features, a novel distribution constraint derived from optimal transport is
introduced to mitigate the modality gap in a fine-grained sample-level manner.
Beyond pair-wise relations across two modalities, it additionally measures the
structural similarity of different parts, thus both multi-level features and
their relations are kept consistent in the common feature space. Considering
the inherent human-topology information, we further propose a geometry-guided
graph learning module to refine the features of each part, where relevant regions can
be emphasized while meaningless ones are suppressed, effectively facilitating
robust feature learning. Extensive experiments on two standard benchmark
datasets validate the superiority of our proposed method, yielding competitive
performance over the state-of-the-art approaches.
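The abstract mentions a distribution constraint derived from optimal transport but gives no formula. As a rough illustration only, the sketch below computes a generic entropy-regularized optimal transport cost (Sinkhorn iterations) between a batch of RGB features and a batch of IR features. It is not the paper's constraint: the function name `sinkhorn_distance`, the cosine cost, the uniform sample weights, and the values of `eps` and `n_iters` are all assumptions made for this example.

```python
import numpy as np

def sinkhorn_distance(rgb_feats, ir_feats, eps=0.1, n_iters=200):
    """Entropic OT cost between two feature sets (hypothetical helper).

    Assumes uniform mass on every sample and a cosine cost; neither
    choice is taken from the paper.
    """
    # L2-normalize rows so the cosine cost lies in [0, 2].
    x = rgb_feats / np.linalg.norm(rgb_feats, axis=1, keepdims=True)
    y = ir_feats / np.linalg.norm(ir_feats, axis=1, keepdims=True)
    cost = 1.0 - x @ y.T                  # (n, m) pairwise cosine cost
    n, m = cost.shape
    a = np.full(n, 1.0 / n)               # uniform mass on RGB samples
    b = np.full(m, 1.0 / m)               # uniform mass on IR samples
    K = np.exp(-cost / eps)               # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):              # alternating marginal scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]    # sample-level transport plan
    return float((plan * cost).sum()), plan

# Usage with random stand-in features (8 RGB and 6 IR samples, 16-dim).
rng = np.random.default_rng(0)
rgb = rng.normal(size=(8, 16))
ir = rng.normal(size=(6, 16))
distance, plan = sinkhorn_distance(rgb, ir)
```

Each entry of `plan` indicates how much mass one RGB sample sends to one IR sample, which is what makes this kind of cost sample-level rather than a comparison of global statistics.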
Related papers
- Learning Generalizable Agents via Saliency-Guided Features Decorrelation [25.19044461705711]
We propose Saliency-Guided Features Decorrelation (SGFD) to eliminate correlations between features and decisions.
RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features.
Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features.
arXiv Detail & Related papers (2023-10-08T09:24:43Z)
- Disentangled Federated Learning for Tackling Attributes Skew via
Invariant Aggregation and Diversity Transferring [104.19414150171472]
Attribute skew prevents current federated learning (FL) frameworks from maintaining consistent optimization directions among the clients.
We propose disentangled federated learning (DFL) to disentangle the domain-specific and cross-invariant attributes into two complementary branches.
Experiments verify that DFL facilitates FL with higher performance, better interpretability, and faster convergence rate, compared with SOTA FL methods.
arXiv Detail & Related papers (2022-06-14T13:12:12Z)
- Relation Matters: Foreground-aware Graph-based Relational Reasoning for
Domain Adaptive Object Detection [81.07378219410182]
We propose a new and general framework for domain adaptive object detection, named Foreground-aware Graph-based Relational Reasoning (FGRR).
FGRR incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations.
Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art on four domain adaptive object detection benchmarks.
arXiv Detail & Related papers (2022-06-06T05:12:48Z)
- On Exploring Pose Estimation as an Auxiliary Learning Task for
Visible-Infrared Person Re-identification [66.58450185833479]
In this paper, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework.
By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features.
Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2022-01-11T09:44:00Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared
Person Re-Identification [35.97494894205023]
The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize images of the same identity across the visible and infrared modalities.
Existing methods mainly use a two-stream architecture to eliminate the discrepancy between the two modalities in the final common feature space.
We present a novel multi-feature space joint optimization (MSO) network, which can learn modality-sharable features in both the single-modality space and the common space.
arXiv Detail & Related papers (2021-10-21T16:45:23Z)
- DF^2AM: Dual-level Feature Fusion and Affinity Modeling for RGB-Infrared
Cross-modality Person Re-identification [18.152310122348393]
RGB-infrared person re-identification is a challenging task due to the intra-class variations and cross-modality discrepancy.
We propose a Dual-level (i.e., local and global) Feature Fusion (DF2) module that learns attention for discriminative features in a local-to-global manner.
To further mine the relationships between global features from person images, we propose an Affinities Modeling (AM) module.
arXiv Detail & Related papers (2021-04-01T03:12:56Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation
Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person
Re-Identification [208.1227090864602]
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem.
Existing VI-ReID methods tend to learn global representations, which have limited discriminability and weak robustness to noisy images.
We propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
arXiv Detail & Related papers (2020-07-18T03:08:13Z)