Diverse Embedding Expansion Network and Low-Light Cross-Modality
Benchmark for Visible-Infrared Person Re-identification
- URL: http://arxiv.org/abs/2303.14481v1
- Date: Sat, 25 Mar 2023 14:24:56 GMT
- Title: Diverse Embedding Expansion Network and Low-Light Cross-Modality
Benchmark for Visible-Infrared Person Re-identification
- Authors: Yukang Zhang, Hanzi Wang
- Abstract summary: We propose a novel augmentation network in the embedding space, called the diverse embedding expansion network (DEEN).
The proposed DEEN can effectively generate diverse embeddings to learn informative feature representations.
We provide a low-light cross-modality (LLCM) dataset, which contains 46,767 bounding boxes of 1,064 identities captured by 9 RGB/IR cameras.
- Score: 26.71900654115498
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For the visible-infrared person re-identification (VIReID) task, one of the
major challenges is the modality gap between visible (VIS) and infrared (IR)
images. However, training samples are usually limited while the modality gap
is large, so existing methods cannot effectively mine diverse cross-modality
clues. To handle this limitation, we propose a novel augmentation network in
the embedding space, called the diverse embedding expansion network (DEEN).
The proposed DEEN effectively generates diverse embeddings to learn
informative feature representations and reduce the modality discrepancy
between VIS and IR images. Moreover, a VIReID model may be seriously affected
by drastic illumination changes, yet all existing VIReID datasets were
captured under sufficient illumination without significant light changes.
Thus, we provide a low-light cross-modality (LLCM) dataset, which contains
46,767 bounding boxes of 1,064 identities captured by 9 RGB/IR cameras.
Extensive experiments on the SYSU-MM01, RegDB and LLCM datasets show the
superiority of the proposed DEEN over several other state-of-the-art methods.
The code and dataset are released at: https://github.com/ZYK100/LLCM
Related papers
- Cross-Modality Perturbation Synergy Attack for Person Re-identification [66.48494594909123]
The main challenge in cross-modality ReID lies in effectively dealing with visual differences between different modalities.
Existing attack methods have primarily focused on the characteristics of the visible image modality.
This study proposes a universal perturbation attack specifically designed for cross-modality ReID.
arXiv Detail & Related papers (2024-01-18T15:56:23Z)
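The summary above only names the mechanism; as a rough illustration, a universal (image-agnostic) perturbation against a ReID feature extractor can be optimized as below. The input size, optimizer, and cosine-similarity objective are assumptions for this sketch, not the paper's cross-modality synergy attack.

```python
# A rough sketch of a universal perturbation: one shared delta is optimized
# to push adversarial features away from clean ones. All hyperparameters and
# the objective are assumptions; the paper's attack is more specific.
import torch
import torch.nn.functional as F

def universal_perturbation(model, loader, eps=8 / 255, lr=0.01, epochs=1):
    delta = torch.zeros(1, 3, 256, 128, requires_grad=True)  # shared across images
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for imgs, _ in loader:                       # imgs: (B, 3, 256, 128)
            clean = model(imgs).detach()             # clean features
            adv = model((imgs + delta).clamp(0, 1))  # perturbed features
            loss = F.cosine_similarity(adv, clean, dim=-1).mean()
            opt.zero_grad()
            loss.backward()                          # minimize similarity
            opt.step()
            delta.data.clamp_(-eps, eps)             # keep delta imperceptible
    return delta.detach()
```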
- VI-Diff: Unpaired Visible-Infrared Translation Diffusion Model for Single Modality Labeled Visible-Infrared Person Re-identification [14.749167141971952]
Cross-modality data annotation is costly and error-prone for Visible-Infrared person re-identification.
We propose VI-Diff, a diffusion model that effectively addresses the task of Visible-Infrared person image translation.
Our approach can be a promising solution to the VI-ReID task with single-modality labeled data and serves as a good starting point for future study.
arXiv Detail & Related papers (2023-10-06T09:42:12Z)
- Dynamic Enhancement Network for Partial Multi-modality Person Re-identification [52.70235136651996]
We design a novel dynamic enhancement network (DENet), which allows missing arbitrary modalities while maintaining the representation ability of multiple modalities.
Since the missing state may change, we design a dynamic enhancement module that adaptively enhances modality features according to which modalities are missing.
arXiv Detail & Related papers (2023-05-25T06:22:01Z)
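The DENet summary gives only the control flow ("enhance according to the missing state"); the sketch below shows one way such a module could look, with an availability mask gating between enhancing present modalities and substituting missing ones. All module details here are assumptions, not DENet's actual design.

```python
# A hedged sketch of the "adapt to the missing state" control flow only.
# feats holds one feature per modality; mask marks which are present.
import torch
import torch.nn as nn

class DynamicEnhancement(nn.Module):
    def __init__(self, dim=512, num_modalities=3):
        super().__init__()
        self.recover = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_modalities)])

    def forward(self, feats, mask):
        # feats: (B, M, dim); mask: (B, M) with 1 = modality present.
        avail = mask.unsqueeze(-1)                                # (B, M, 1)
        context = (feats * avail).sum(1) / avail.sum(1).clamp(min=1)
        out = []
        for m, proj in enumerate(self.recover):
            est = proj(context)                  # estimate from what is present
            out.append(torch.where(mask[:, m:m + 1] > 0,
                                   feats[:, m] + est,  # enhance if present
                                   est))               # substitute if missing
        return torch.stack(out, dim=1)
```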
- Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle Re-identification [29.48387524901101]
In harsh environments, the discriminative cues in the RGB and NIR modalities are often lost due to strong flares from vehicle lamps or sunlight.
We propose a Flare-Aware Cross-modal Enhancement Network that adaptively restores flare-corrupted RGB and NIR features with guidance from the flare-immunized thermal infrared spectrum.
arXiv Detail & Related papers (2023-05-23T04:04:24Z)
- Physically-Based Face Rendering for NIR-VIS Face Recognition [165.54414962403555]
Near infrared (NIR) to Visible (VIS) face matching is challenging due to the significant domain gaps.
We propose a novel method for paired NIR-VIS facial image generation.
To facilitate identity feature learning, we propose an IDentity-based Maximum Mean Discrepancy (ID-MMD) loss.
arXiv Detail & Related papers (2022-11-11T18:48:16Z)
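MMD itself is standard; the sketch below computes an RBF-kernel MMD and applies it to per-identity feature centroids across the NIR and VIS modalities, which is one plausible reading of "identity-based" MMD. The paper's exact ID-MMD formulation may differ.

```python
# Hedged sketch: align per-identity prototypes across modalities with MMD.
# Assumes every identity in the batch appears in both modalities.
import torch

def mmd_rbf(x, y, sigma=1.0):
    """Squared MMD with an RBF kernel between feature sets x:(N,d), y:(M,d)."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def id_mmd(nir_feats, vis_feats, nir_labels, vis_labels):
    ids = torch.unique(nir_labels)
    c_nir = torch.stack([nir_feats[nir_labels == i].mean(0) for i in ids])
    c_vis = torch.stack([vis_feats[vis_labels == i].mean(0) for i in ids])
    return mmd_rbf(c_nir, c_vis)   # align identity prototypes across modalities
```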
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with an Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
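The AGM idea above maps both modalities into a single grayscale spectrum; a minimal sketch of that unification step follows, assuming a fixed ITU-R BT.601 luminance conversion for the VIS images. The paper's actual AGM transform may be learned rather than fixed.

```python
# A minimal sketch of grayscale unification: cross-modality matching becomes
# single-modality matching once both inputs live in one gray spectrum.
import torch

def to_aligned_gray(img, is_infrared):
    # img: (B, 3, H, W) in [0, 1]. IR images are already single-spectrum,
    # so only VIS images need the RGB -> luminance conversion.
    if is_infrared:
        return img
    r, g, b = img[:, 0:1], img[:, 1:2], img[:, 2:3]
    gray = 0.299 * r + 0.587 * g + 0.114 * b   # BT.601 luminance weights
    return gray.repeat(1, 3, 1, 1)             # keep 3 channels for the CNN
```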
- RGB-D Saliency Detection via Cascaded Mutual Information Minimization [122.8879596830581]
Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning.
We introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
arXiv Detail & Related papers (2021-09-15T12:31:27Z)
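The paper's cascaded scheme is not spelled out in the summary; as one common way to penalize mutual information between two feature streams, the sketch below implements a CLUB-style variational upper bound (Cheng et al., 2020) with a unit-variance Gaussian q(y|x). This only illustrates "explicit" MI minimization and is not claimed to be the paper's method.

```python
# CLUB-style MI upper bound: I(x; y) <= E[log q(y|x)] - E_marginal[log q(y|x)].
# Minimizing this estimate w.r.t. the encoders discourages shared information.
import torch
import torch.nn as nn

class CLUBEstimator(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Variational network predicting the mean of q(y|x).
        self.mu = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                nn.Linear(dim, dim))

    def forward(self, x, y):
        # x, y: (B, dim) paired features from the two modalities.
        mu = self.mu(x)
        pos = -0.5 * ((y - mu) ** 2).sum(-1)   # log q(y_i | x_i) + const
        neg = (-0.5 * ((y.unsqueeze(0) - mu.unsqueeze(1)) ** 2)
               .sum(-1)).mean(1)               # avg log q(y_j | x_i) + const
        return (pos - neg).mean()              # MI upper-bound estimate
```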
- SFANet: A Spectrum-aware Feature Augmentation Network for Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
Learning with grayscale-spectrum images, our model can markedly reduce the modality discrepancy and detect inner structure relations.
At the feature level, we improve the conventional two-stream network by balancing the number of modality-specific and shareable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z)
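The SFANet summary refers to the conventional VI-ReID two-stream design and to rebalancing its specific vs. shared blocks; below is a minimal sketch of that design, with modality-specific stems through ResNet-50 layer1 and shared deeper layers. Placing the split after layer1 is an assumption, not SFANet's actual choice.

```python
# Conventional two-stream VI-ReID backbone sketch: each modality gets its own
# shallow blocks, then both share the deep blocks and the embedding head.
import torch.nn as nn
from torchvision.models import resnet50

class TwoStreamReID(nn.Module):
    def __init__(self):
        super().__init__()
        def stem():
            r = resnet50(weights=None)
            return nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool, r.layer1)
        self.vis_stem, self.ir_stem = stem(), stem()   # modality-specific
        r = resnet50(weights=None)
        self.shared = nn.Sequential(r.layer2, r.layer3, r.layer4,
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, x, modality):                    # modality: "vis" | "ir"
        stem = self.vis_stem if modality == "vis" else self.ir_stem
        return self.shared(stem(x))                    # (B, 2048) embedding
```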
- Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle collects 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)
- Cross-Spectrum Dual-Subspace Pairing for RGB-infrared Cross-Modality Person Re-Identification [15.475897856494583]
Conventional person re-identification can only handle RGB color images, which fail in dark conditions.
RGB-infrared ReID (also known as Infrared-Visible ReID or Visible-Thermal ReID) has therefore been proposed.
In this paper, a novel multi-spectrum image generation method is proposed, and the generated samples are utilized to help the network find discriminative information.
arXiv Detail & Related papers (2020-02-29T09:01:39Z)
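The multi-spectrum generation method is not described in the summary above; the sketch below shows one simple augmentation in that spirit, treating each color channel (plus grayscale) as a pseudo-spectrum and sampling one per image. This is an illustrative stand-in, not the paper's generation method.

```python
# Illustrative cross-spectrum augmentation: randomly present each image in
# one of four pseudo-spectra (R, G, B, or grayscale).
import torch

def random_spectrum(img):
    # img: (B, 3, H, W) RGB batch in [0, 1].
    b = img.size(0)
    gray = img.mean(1, keepdim=True)                 # (B, 1, H, W)
    candidates = torch.cat([img, gray], dim=1)       # R, G, B, gray "spectra"
    idx = torch.randint(0, 4, (b,), device=img.device)
    chosen = candidates[torch.arange(b, device=img.device), idx]
    return chosen.unsqueeze(1).repeat(1, 3, 1, 1)    # back to 3 channels
```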