Multimodal Data Augmentation for Visual-Infrared Person ReID with Corrupted Data
- URL: http://arxiv.org/abs/2211.11925v1
- Date: Tue, 22 Nov 2022 00:29:55 GMT
- Title: Multimodal Data Augmentation for Visual-Infrared Person ReID with Corrupted Data
- Authors: Arthur Josi, Mahdi Alehdaghi, Rafael M. O. Cruz, Eric Granger
- Abstract summary: We propose a specialized DA strategy for V-I person ReID models.
Our strategy reduces the impact of corruption on the accuracy of deep person ReID models.
Results indicate that using our strategy, V-I ReID models can exploit both shared and individual modality knowledge.
- Score: 10.816003787786766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The re-identification (ReID) of individuals over a complex network of cameras
is a challenging task, especially under real-world surveillance conditions.
Several deep learning models have been proposed for visible-infrared (V-I)
person ReID to recognize individuals from images captured using RGB and IR
cameras. However, performance may decline considerably if RGB and IR images
captured at test time are corrupted (e.g., noise, blur, and weather
conditions). Although various data augmentation (DA) methods have been explored
to improve the generalization capacity, these are not adapted for V-I person
ReID. In this paper, a specialized DA strategy is proposed to address this
multimodal setting. Given both the V and I modalities, this strategy diminishes
the impact of corruption on the accuracy of deep person ReID models.
Corruption may be modality-specific, and an additional modality often provides
complementary information. Our multimodal DA strategy is designed specifically
to encourage modality collaboration and reinforce generalization capability.
For instance, punctual masking of modalities forces the model to select the
informative modality. Local DA is also explored for advanced selection of
features within and among modalities. The impact of training baseline fusion
models for V-I person ReID using the proposed multimodal DA strategy is
assessed on corrupted versions of the SYSU-MM01, RegDB, and ThermalWORLD
datasets in terms of complexity and efficiency. Results indicate that our
strategy provides V-I ReID models with the ability to exploit both shared and
individual modality knowledge, allowing them to outperform models trained with
no DA or with unimodal DA. GitHub code: https://github.com/art2611/ML-MDA.
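The core of the strategy described above, occasionally blanking one modality per sample, is simple to sketch. Below is a minimal, hypothetical PyTorch version: the function name, masking probability, and tensor layout are assumptions, and the authors' actual implementation lives in the linked ML-MDA repository.

```python
import torch

def punctual_modality_masking(rgb, ir, p_mask=0.2):
    """With probability p_mask per sample, blank out exactly one of the two
    modality streams (never both), so at least one informative stream
    always survives. rgb and ir are paired (B, C, H, W) batches."""
    u = torch.rand(rgb.size(0), device=rgb.device)
    drop_rgb = u < p_mask / 2                     # first half of the band: drop RGB
    drop_ir = (u >= p_mask / 2) & (u < p_mask)    # second half: drop IR
    rgb, ir = rgb.clone(), ir.clone()             # keep the originals intact
    rgb[drop_rgb] = 0.0
    ir[drop_ir] = 0.0
    return rgb, ir
```

In a training loop, such masking would be applied on the fly to each mini-batch, on top of whatever per-modality corruptions or standard augmentations are already in use.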
Related papers
- All in One Framework for Multimodal Re-identification in the Wild [58.380708329455466]
A multimodal learning paradigm for ReID is introduced, referred to as All-in-One (AIO).
AIO harnesses a frozen large pre-trained model as an encoder, enabling effective multimodal retrieval without additional fine-tuning.
Experiments on cross-modal and multimodal ReID reveal that AIO not only adeptly handles various modal data but also excels in challenging contexts.
arXiv Detail & Related papers (2024-05-08T01:04:36Z)
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID (see the sketch after this entry).
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
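For context on what a universal perturbation attack on a feature extractor can look like, here is a generic sketch. It assumes images normalized to [0, 1] and a `model` that maps image batches to embedding vectors; it illustrates the general recipe, not the paper's cross-modality synergy attack.

```python
import torch
import torch.nn.functional as F

def universal_perturbation(model, loader, eps=8 / 255, alpha=1 / 255, epochs=1):
    """Build one additive perturbation that degrades the embedding of any
    image it is added to: a generic universal-attack recipe for feature
    extractors, sketched under the assumptions stated above."""
    device = next(model.parameters()).device
    for p in model.parameters():
        p.requires_grad_(False)                    # only delta receives gradients
    delta = None
    model.eval()
    for _ in range(epochs):
        for images, _ in loader:
            images = images.to(device)
            if delta is None:                      # lazily match the input shape
                delta = torch.zeros_like(images[:1], requires_grad=True)
            clean = model(images).detach()         # embeddings without the attack
            adv = model((images + delta).clamp(0, 1))
            loss = F.mse_loss(adv, clean)          # distance we want to grow
            loss.backward()
            with torch.no_grad():
                delta += alpha * delta.grad.sign() # gradient ascent on the distance
                delta.clamp_(-eps, eps)            # stay inside the budget
            delta.grad.zero_()
    return delta.detach()
```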
- Dynamic Enhancement Network for Partial Multi-modality Person Re-identification [52.70235136651996]
We design a novel dynamic enhancement network (DENet), which accommodates arbitrary missing modalities while maintaining the representation ability of multiple modalities.
Since the missing state can change, a dynamic enhancement module adaptively strengthens the available modality features according to the current missing state (see the sketch after this entry).
arXiv Detail & Related papers (2023-05-25T06:22:01Z)
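The missing-state idea above can be illustrated with a toy module that keeps a modality's features when present and reconstructs them from the other stream when absent. The class name, linear transfer layers, and fusion by concatenation are assumptions for illustration; DENet's actual enhancement mechanism is richer.

```python
import torch
import torch.nn as nn

class DynamicEnhancement(nn.Module):
    """Missing-state-aware feature repair: when a modality is absent for a
    sample, its feature slot is filled from the surviving modality through
    a learned transfer layer."""

    def __init__(self, dim):
        super().__init__()
        self.v_from_i = nn.Linear(dim, dim)  # infer visible features from infrared
        self.i_from_v = nn.Linear(dim, dim)  # infer infrared features from visible

    def forward(self, f_v, f_i, v_present, i_present):
        # f_v, f_i: (B, dim) modality features; *_present: (B,) bool flags.
        v_ok = v_present.float().unsqueeze(1)
        i_ok = i_present.float().unsqueeze(1)
        out_v = v_ok * f_v + (1 - v_ok) * self.v_from_i(f_i)
        out_i = i_ok * f_i + (1 - i_ok) * self.i_from_v(f_v)
        return torch.cat([out_v, out_i], dim=1)  # fused (B, 2 * dim) descriptor
```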
- Fusion for Visual-Infrared Person ReID in Real-World Surveillance Using Corrupted Multimodal Data [10.816003787786766]
Visible-infrared person re-identification (V-I ReID) seeks to match images of individuals captured over a distributed network of RGB and IR cameras.
State-of-the-art V-I ReID models cannot leverage corrupted modality information to sustain a high level of accuracy.
We propose an efficient model for multimodal V-I ReID that preserves modality-specific knowledge for improved robustness to corrupted multimodal images.
arXiv Detail & Related papers (2023-04-29T18:18:59Z)
- Learning Progressive Modality-shared Transformers for Effective Visible-Infrared Person Re-identification [27.75907274034702]
We propose a novel deep learning framework named Progressive Modality-shared Transformer (PMT) for effective VI-ReID.
To reduce the negative effect of modality gaps, we first take the gray-scale images as an auxiliary modality and propose a progressive learning strategy.
To cope with the problem of large intra-class differences and small inter-class differences, we propose a Discriminative Center Loss (see the sketch after this entry).
arXiv Detail & Related papers (2022-12-01T02:20:16Z)
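As a rough illustration of what a "Discriminative Center Loss" can look like, here is a center-loss-style sketch that pulls features toward their class center and pushes different centers apart by a margin. The margin value, distance choice, and weighting are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class DiscriminativeCenterLoss(nn.Module):
    """Pull features toward their class center (shrinks intra-class
    spread) and push distinct centers at least `margin` apart (grows
    inter-class separation)."""

    def __init__(self, num_classes, dim, margin=1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, dim))
        self.margin = margin

    def forward(self, feats, labels):
        # Intra-class term: squared distance to each sample's own center.
        pull = ((feats - self.centers[labels]) ** 2).sum(1).mean()
        # Inter-class term: penalize center pairs closer than the margin.
        d = torch.cdist(self.centers, self.centers)  # (C, C) pairwise distances
        off_diag = ~torch.eye(len(self.centers), dtype=torch.bool, device=d.device)
        push = torch.relu(self.margin - d[off_diag]).mean()
        return pull + push
```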
- Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification [35.55895776505113]
Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from part to global.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z)
- Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models use feature fusion strategies but are limited by low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities (see the sketch after this entry).
arXiv Detail & Related papers (2020-10-12T08:50:10Z)
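Mutual cross-modal attention, where each stream attends to the other, can be sketched with stock attention layers. The module below is a bare-bones illustration under assumed token shapes; it omits the selective and contrastive components the paper proposes.

```python
import torch
import torch.nn as nn

class MutualAttention(nn.Module):
    """Cross-modal attention in both directions: RGB queries attend to
    depth keys/values and vice versa, so each stream is re-weighted by
    the other's context. `dim` must be divisible by `heads`."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.rgb_from_d = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.d_from_rgb = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, rgb, depth):
        # rgb, depth: (B, N, dim) token sequences from the two streams.
        rgb_ctx, _ = self.rgb_from_d(rgb, depth, depth)  # query=rgb, key/value=depth
        d_ctx, _ = self.d_from_rgb(depth, rgb, rgb)      # query=depth, key/value=rgb
        return rgb + rgb_ctx, depth + d_ctx              # residual fusion
```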
- Cross-Resolution Adversarial Dual Network for Person Re-Identification and Beyond [59.149653740463435]
Person re-identification (re-ID) aims at matching images of the same person across camera views.
Due to varying distances between cameras and persons of interest, resolution mismatch can be expected.
We propose a novel generative adversarial network to address cross-resolution person re-ID.
arXiv Detail & Related papers (2020-02-19T07:21:38Z)
- Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities.
Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning (see the sketch after this entry).
Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)
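A modality adaptation block of the kind mentioned above can be pictured as a small residual adapter that maps auxiliary-modality features toward the source-modality feature space, so both can share downstream layers. Layer sizes and the residual form below are assumptions for illustration, not the MCN design.

```python
import torch
import torch.nn as nn

class ModalityAdaptationBlock(nn.Module):
    """Residual adapter: keep the auxiliary-modality signal and add a
    learned correction toward the source-modality representation."""

    def __init__(self, dim):
        super().__init__()
        self.adapt = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(inplace=True), nn.Linear(dim, dim)
        )

    def forward(self, aux_feats):
        # aux_feats: (B, dim) features from the auxiliary modality.
        return aux_feats + self.adapt(aux_feats)
```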
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.