Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle
Re-identification
- URL: http://arxiv.org/abs/2305.13659v1
- Date: Tue, 23 May 2023 04:04:24 GMT
- Title: Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle
Re-identification
- Authors: Aihua Zheng, Zhiqi Ma, Zi Wang, Chenglong Li
- Abstract summary: In harsh environments, the discriminative cues in RGB and NIR modalities are often lost due to strong flares from vehicle lamps or sunlight.
We propose a Flare-Aware Cross-modal Enhancement Network that adaptively restores flare-corrupted RGB and NIR features with guidance from the flare-immunized thermal infrared spectrum.
- Score: 29.48387524901101
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-spectral vehicle re-identification aims to address the challenge of
identifying vehicles in complex lighting conditions by incorporating
complementary visible and infrared information. However, in harsh environments,
the discriminative cues in RGB and NIR modalities are often lost due to strong
flares from vehicle lamps or sunlight, and existing multi-modal fusion methods
are limited in their ability to recover these important cues. To address this
problem, we propose a Flare-Aware Cross-modal Enhancement Network that
adaptively restores flare-corrupted RGB and NIR features with guidance from the
flare-immunized thermal infrared spectrum. First, to reduce the influence of
locally degraded appearance due to intense flare, we propose a Mutual Flare
Mask Prediction module to jointly obtain flare-corrupted masks in RGB and NIR
modalities in a self-supervised manner. Second, to use the flare-immunized TI
information to enhance the masked RGB and NIR, we propose a Flare-Aware
Cross-modal Enhancement module that adaptively guides feature extraction of
masked RGB and NIR spectra with prior flare-immunized knowledge from the TI
spectrum. Third, to extract common informative semantic information from RGB
and NIR, we propose an Inter-modality Consistency loss that enforces semantic
consistency between the two modalities. Finally, to evaluate the proposed
FACENet in handling intense flare, we introduce a new multi-spectral vehicle
re-ID dataset, called WMVEID863, with additional challenges such as motion
blur, significant background changes, and particularly intense flare
degradation. Comprehensive experiments on both the newly collected dataset and
public benchmark multi-spectral vehicle re-ID datasets demonstrate the superior
performance of the proposed FACENet compared to state-of-the-art methods,
especially in handling strong flares. The code and dataset will be released
soon.
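To make the flare-handling idea concrete, the following is a minimal PyTorch sketch of TI-guided enhancement in the spirit of the abstract's Mutual Flare Mask Prediction and Flare-Aware Cross-modal Enhancement modules: a soft flare mask is predicted per modality, and thermal-infrared features are used to re-synthesize the masked regions. The module names, shapes, and the simple gating scheme are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only; layer choices and gating are assumptions for exposition.
import torch
import torch.nn as nn


class FlareMaskHead(nn.Module):
    """Predicts a soft flare mask from a single-modality feature map (assumed design)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Values near 1 mark flare-corrupted regions.
        return torch.sigmoid(self.conv(feat))


class TIGuidedEnhancement(nn.Module):
    """Replaces masked RGB/NIR responses with features modulated by the TI spectrum."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, feat, ti_feat, mask):
        # TI features act as flare-immunized guidance inside the masked regions.
        guided = self.fuse(torch.cat([feat, ti_feat], dim=1))
        return feat * (1.0 - mask) + guided * mask


if __name__ == "__main__":
    rgb, nir, ti = (torch.randn(2, 256, 16, 16) for _ in range(3))
    mask_head = FlareMaskHead(256)
    enhance = TIGuidedEnhancement(256)
    rgb_enhanced = enhance(rgb, ti, mask_head(rgb))
    nir_enhanced = enhance(nir, ti, mask_head(nir))
    print(rgb_enhanced.shape, nir_enhanced.shape)  # torch.Size([2, 256, 16, 16]) x2
```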
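The Inter-modality Consistency loss is described only at a high level in the abstract; a plausible reading is a penalty on semantic disagreement between RGB and NIR embeddings of the same vehicle. The cosine-distance form below is an assumption, shown purely to illustrate the kind of objective involved.

```python
# Hedged sketch of an inter-modality consistency objective; the paper may use a
# different formulation (e.g., KL divergence over class predictions).
import torch
import torch.nn.functional as F


def inter_modality_consistency(rgb_emb: torch.Tensor, nir_emb: torch.Tensor) -> torch.Tensor:
    """rgb_emb, nir_emb: (batch, dim) embeddings of the same identities."""
    rgb_emb = F.normalize(rgb_emb, dim=1)
    nir_emb = F.normalize(nir_emb, dim=1)
    # 1 - cosine similarity, averaged over the batch.
    return (1.0 - (rgb_emb * nir_emb).sum(dim=1)).mean()


if __name__ == "__main__":
    rgb = torch.randn(8, 512)
    nir = torch.randn(8, 512)
    print(inter_modality_consistency(rgb, nir).item())
```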
Related papers
- NIR-Assisted Image Denoising: A Selective Fusion Approach and A Real-World Benchmark Dataset [53.79524776100983]
Leveraging near-infrared (NIR) images to assist visible RGB image denoising shows the potential to address this issue.
Existing works still struggle with taking advantage of NIR information effectively for real-world image denoising.
We propose an efficient Selective Fusion Module (SFM) that can be plugged into advanced denoising networks in a plug-and-play manner.
arXiv Detail & Related papers (2024-04-12T14:54:26Z) - Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection [20.12812979315803]
Object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention.
Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks.
We propose a novel coarse-to-fine perspective to purify and fuse features from both modalities.
arXiv Detail & Related papers (2024-01-19T14:49:42Z) - Frequency Domain Nuances Mining for Visible-Infrared Person
Re-identification [75.87443138635432]
Existing methods mainly exploit the spatial information while ignoring the discriminative frequency information.
We propose a novel Frequency Domain Nuances Mining (FDNM) method to explore the cross-modality frequency domain information.
Our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset.
arXiv Detail & Related papers (2024-01-04T09:19:54Z) - Frequency Domain Modality-invariant Feature Learning for
Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phrase-Preserving Normalization (PPNorm).
arXiv Detail & Related papers (2024-01-03T17:11:27Z) - Hypergraph-Guided Disentangled Spectrum Transformer Networks for
Near-Infrared Facial Expression Recognition [31.783671943393344]
We make the first attempt at deep NIR facial expression recognition and propose a novel method called the near-infrared facial expression transformer (NFER-Former).
NFER-Former disentangles the expression information and spectrum information from the input image, so that the expression features can be extracted without the interference of spectrum variation.
We have constructed a large NIR-VIS Facial Expression dataset that includes 360 subjects to better validate the efficiency of NFER-Former.
arXiv Detail & Related papers (2023-12-10T15:15:50Z) - Diverse Embedding Expansion Network and Low-Light Cross-Modality
Benchmark for Visible-Infrared Person Re-identification [26.71900654115498]
We propose a novel augmentation network in the embedding space, called the diverse embedding expansion network (DEEN).
The proposed DEEN can effectively generate diverse embeddings to learn the informative feature representations.
We provide a low-light cross-modality (LLCM) dataset, which contains 46,767 bounding boxes of 1,064 identities captured by 9 RGB/IR cameras.
arXiv Detail & Related papers (2023-03-25T14:24:56Z) - Unsupervised Misaligned Infrared and Visible Image Fusion via
Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared images and visible images, we present a Feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z) - SFANet: A Spectrum-aware Feature Augmentation Network for
Visible-Infrared Person Re-Identification [12.566284647658053]
We propose a novel spectrum-aware feature augmentation network named SFANet for the cross-modality matching problem.
By learning with grayscale-spectrum images, our model can noticeably reduce modality discrepancy and capture inner structural relations.
At the feature level, we improve the conventional two-stream network by balancing the number of modality-specific and shareable convolutional blocks.
arXiv Detail & Related papers (2021-02-24T08:57:32Z) - Learning Selective Mutual Attention and Contrast for RGB-D Saliency
Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models use the feature fusion strategy but are limited by the low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities.
arXiv Detail & Related papers (2020-10-12T08:50:10Z) - Drone-based RGB-Infrared Cross-Modality Vehicle Detection via
Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle collects 28,439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)