Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification
- URL: http://arxiv.org/abs/2307.03240v2
- Date: Mon, 10 Feb 2025 05:25:37 GMT
- Title: Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification
- Authors: Mahdi Alehdaghi, Arthur Josi, Pourya Shamsolmoali, Rafael M. O. Cruz, Eric Granger
- Abstract summary: This paper introduces the Adaptive Generation of Privileged Intermediate Information (AGPI^2) training approach.
AGPI^2 adapts and generates a virtual domain that bridges discriminant information between the visible (V) and infrared (I) modalities.
Experimental results on challenging V-I ReID datasets indicate that AGPI^2 increases matching accuracy without extra computational resources at inference.
- Abstract: Visible-infrared person re-identification (V-I ReID) seeks to retrieve images of the same individual captured over a distributed network of RGB and infrared (IR) sensors. Several V-I ReID approaches directly integrate both V and I modalities to discriminate persons within a shared representation space. However, given the significant gap in data distributions between the V and I modalities, cross-modal V-I ReID remains challenging. Some recent approaches improve generalization by leveraging intermediate spaces that can bridge the V and I modalities, yet effective methods are required to select or generate data for such informative domains. In this paper, the Adaptive Generation of Privileged Intermediate Information (AGPI^2) training approach is introduced to adapt and generate a virtual domain that bridges discriminant information between the V and I modalities. The key motivation behind AGPI^2 is to enhance the training of a deep V-I ReID backbone by generating privileged images that provide additional information. These privileged images capture shared discriminative features that are not easily accessible within the original V or I modalities alone. Towards this goal, a non-linear generative module is trained with an adversarial objective, translating V images into intermediate spaces with a smaller domain shift with respect to the I domain. Meanwhile, the embedding module within AGPI^2 aims to produce similar features for both V and generated images, encouraging the extraction of features that are common to all modalities. In addition to these contributions, AGPI^2 employs adversarial objectives for adapting the intermediate images, which play a crucial role in creating a non-modality-specific space to address the large domain shifts between the V and I domains. Experimental results conducted on challenging V-I ReID datasets indicate that AGPI^2 increases matching accuracy without extra computational resources during inference.
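The abstract describes two coupled objectives: an adversarial loss pushing generated intermediate images toward the I domain, and an embedding loss pulling features of V and generated images together. The following is a minimal numpy sketch of that joint objective; the linear generator, discriminator, and embedding are illustrative stand-ins for the paper's networks, not its implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the three modules: a generator G that
# maps V features toward an intermediate domain, a discriminator D that
# scores how I-like a sample looks, and a shared embedding F.
d = 16
G = rng.normal(scale=0.1, size=(d, d))
D = rng.normal(scale=0.1, size=(d,))
F = rng.normal(scale=0.1, size=(8, d))

v = rng.normal(size=(4, d))  # batch of "visible" features
i = rng.normal(size=(4, d))  # batch of "infrared" features (for D's real side)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = v @ G.T  # generated intermediate samples

# Adversarial objective: the generator tries to make D score z as I-like.
adv_loss = -np.mean(np.log(sigmoid(z @ D) + 1e-8))

# Embedding objective: F should produce similar features for V inputs and
# their generated counterparts (mean-squared distance as a stand-in for the
# paper's similarity losses).
feat_loss = np.mean((v @ F.T - z @ F.T) ** 2)

total = adv_loss + feat_loss
```

In a real training loop these two terms would be minimized jointly over the generator and embedding, with the discriminator updated in opposition.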
Related papers
- Exploiting Aggregation and Segregation of Representations for Domain Adaptive Human Pose Estimation [50.31351006532924]
Human pose estimation (HPE) has received increasing attention recently due to its wide application in motion analysis, virtual reality, healthcare, etc.
It suffers from the lack of labeled diverse real-world datasets due to the time- and labor-intensive annotation.
We introduce a novel framework that capitalizes on both representation aggregation and segregation for domain adaptive human pose estimation.
arXiv Detail & Related papers (2024-12-29T17:59:45Z) - Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification [60.20318058777603]
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without the need for fine-tuning or retraining.
Previous works have mainly focused on extracting domain-invariant features by aligning data distributions between source domains.
We propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method to solve this unique problem.
arXiv Detail & Related papers (2024-07-10T04:06:39Z) - Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification [12.14946364107671]
A key challenge in visible-infrared person re-identification (V-I ReID) is training a backbone model capable of effectively addressing the significant discrepancies across modalities.
This paper introduces Bidirectional Multi-step Domain Generalization, a novel approach for unifying feature representations across diverse modalities.
Experiments conducted on V-I ReID datasets indicate that our BMDG approach can outperform state-of-the-art part-based and intermediate generation methods.
arXiv Detail & Related papers (2024-03-16T03:03:27Z) - VDNA-PR: Using General Dataset Representations for Robust Sequential Visual Place Recognition [17.393105901701098]
This paper adapts a general dataset representation technique to produce robust Visual Place Recognition (VPR) descriptors.
Our experiments show that our representation can allow for better robustness than current solutions to serious domain shifts away from the training data distribution.
arXiv Detail & Related papers (2024-03-14T01:30:28Z) - PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain
Adaptative Semantic Segmentation [100.6343963798169]
Unsupervised Domain Adaptation (UDA) aims to enhance the generalization of the learned model to other domains.
We propose a unified pixel- and patch-wise self-supervised learning framework, called PiPa, for domain adaptive semantic segmentation.
arXiv Detail & Related papers (2022-11-14T18:31:24Z) - Progressive Transformation Learning for Leveraging Virtual Images in
Training [21.590496842692744]
We introduce Progressive Transformation Learning (PTL) to augment a training dataset by adding transformed virtual images with enhanced realism.
PTL takes a novel approach that progressively iterates the following three steps: 1) select a subset from a pool of virtual images according to the domain gap, 2) transform the selected virtual images to enhance realism, and 3) add the transformed virtual images to the training set while removing them from the pool.
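The three-step PTL iteration above can be sketched in a few lines. The selection criterion, transformation, and data structures below are illustrative placeholders, not the paper's implementation.

```python
import random

def domain_gap(img):
    # Hypothetical scalar gap to the real-image domain (lower = more real).
    return img["gap"]

def transform(img):
    # Stand-in for the realism-enhancing transformation.
    return {**img, "gap": img["gap"] * 0.5, "transformed": True}

def ptl(pool, train_set, subset_size=2, iterations=3):
    for _ in range(iterations):
        if not pool:
            break
        # 1) select the subset of virtual images with the smallest domain gap
        pool.sort(key=domain_gap)
        selected, pool = pool[:subset_size], pool[subset_size:]
        # 2) transform the selected images, 3) move them into the training set
        train_set.extend(transform(img) for img in selected)
    return pool, train_set

pool = [{"id": i, "gap": random.random()} for i in range(10)]
remaining, train = ptl(pool, train_set=[])
print(len(remaining), len(train))  # 4 6
```

Each iteration shrinks the pool and grows the training set with progressively more realistic virtual images.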
Experiments show that PTL results in a substantial performance increase over the baseline, especially in the small data and the cross-domain regime.
arXiv Detail & Related papers (2022-11-03T13:04:15Z) - Visible-Infrared Person Re-Identification Using Privileged Intermediate
Information [10.816003787786766]
Cross-modal person re-identification (ReID) is challenging due to the large domain shift in data distributions between RGB and IR modalities.
This paper introduces a novel approach for creating an intermediate virtual domain that acts as a bridge between the two main domains.
We devise a new method to generate images between the visible and infrared domains that provide additional information to train a deep ReID model.
arXiv Detail & Related papers (2022-09-19T21:08:14Z) - Unsupervised domain adaptation semantic segmentation of high-resolution
remote sensing imagery with invariant domain-level context memory [10.210120085157161]
This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of high-resolution remote sensing (HRS) imagery.
MemoryAdaptNet constructs an output space adversarial learning scheme to bridge the domain distribution discrepancy between source domain and target domain.
Experiments under three cross-domain tasks indicate that our proposed MemoryAdaptNet is remarkably superior to the state-of-the-art methods.
arXiv Detail & Related papers (2022-08-16T12:35:57Z) - Stagewise Unsupervised Domain Adaptation with Adversarial Self-Training
for Road Segmentation of Remote Sensing Images [93.50240389540252]
Road segmentation from remote sensing images is a challenging task with wide ranges of application potentials.
We propose a novel stagewise domain adaptation model called RoadDA to address the domain shift (DS) issue in this field.
Experiment results on two benchmarks demonstrate that RoadDA can efficiently reduce the domain gap and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-08-28T09:29:14Z) - Learning Cross-modal Contrastive Features for Video Domain Adaptation [138.75196499580804]
We propose a unified framework for video domain adaptation, which simultaneously regularizes cross-modal and cross-domain feature representations.
Specifically, we treat each modality in a domain as a view and leverage the contrastive learning technique with properly designed sampling strategies.
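Treating each modality as a view of the same clip suggests an InfoNCE-style objective: same-clip pairs across modalities are positives, other clips in the batch are negatives. The sketch below is a minimal numpy version of that idea; the modality names and sampling are illustrative, not the paper's exact strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce(a, b, temperature=0.1):
    # Cross-modal contrastive loss: row k of `a` and row k of `b` are
    # features of the same clip from two modalities (positives); all other
    # rows of `b` serve as negatives.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # positives sit on the diagonal (same clip, other modality)
    return float(-np.mean(np.log(np.diag(probs) + 1e-12)))

rgb = rng.normal(size=(8, 32))
flow = rgb + 0.05 * rng.normal(size=(8, 32))       # well-aligned modalities
loss_aligned = info_nce(rgb, flow)
loss_random = info_nce(rgb, rng.normal(size=(8, 32)))
```

Minimizing this loss pulls the two modalities of a clip together while pushing apart features of different clips, which is the regularization effect the abstract describes.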
arXiv Detail & Related papers (2021-08-26T18:14:18Z) - AFAN: Augmented Feature Alignment Network for Cross-Domain Object
Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.