Multi-Energy Guided Image Translation with Stochastic Differential
Equations for Near-Infrared Facial Expression Recognition
- URL: http://arxiv.org/abs/2312.05908v1
- Date: Sun, 10 Dec 2023 15:17:42 GMT
- Authors: Bingjun Luo, Zewen Wang, Jinpeng Wang, Junjie Zhu, Xibin Zhao, Yue Gao
- Abstract summary: We present NFER-SDE, which transforms facial expression appearance between heterogeneous modalities to tackle the overfitting problem on small-scale NIR data.
NFER-SDE significantly improves the performance of NIR FER and achieves state-of-the-art results on the only two available NIR FER datasets.
- Score: 32.34873680472637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Illumination variation has been a long-term challenge in real-world facial
expression recognition (FER). Under uncontrolled or non-visible light
conditions, near-infrared (NIR) imaging can provide a simple and alternative solution
to obtain high-quality images and supplement the geometric and texture details
that are missing in the visible domain. Due to the lack of existing large-scale
NIR facial expression datasets, directly extending VIS FER methods to the NIR
spectrum may be ineffective. Additionally, previous heterogeneous image
synthesis methods are restricted by low controllability without prior task
knowledge. To tackle these issues, we present the first approach, called
NIR-FER Stochastic Differential Equations (NFER-SDE), which transforms facial
expression appearance between heterogeneous modalities to address the overfitting
problem on small-scale NIR data. NFER-SDE is able to take the whole VIS source
image as input and, together with domain-specific knowledge, guide the
preservation of modality-invariant information in the high-frequency content of
the image. Extensive experiments and ablation studies show that NFER-SDE
significantly improves the performance of NIR FER and achieves state-of-the-art
results on the only two available NIR FER datasets, Oulu-CASIA and Large-HFE.
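The abstract gives no implementation detail, but the core mechanism it describes, steering a score-based (SDE-style) sampler with an extra task-specific guidance term, can be sketched generically. The snippet below is a toy Langevin-dynamics sampler, not the authors' NFER-SDE; `score_fn`, `guidance_fn`, the step size, and the target statistics are all illustrative assumptions.

```python
import numpy as np

def guided_langevin_sample(x0, score_fn, guidance_fn, n_steps=500,
                           step=0.1, guidance_scale=1.0, seed=0):
    """Langevin dynamics with an additional guidance term.

    Each update follows the combined score of the data distribution
    (score_fn) and a task-specific guidance signal (guidance_fn);
    this is the general pattern guided diffusion/SDE samplers use
    to steer generation toward a target domain.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_steps):
        grad = score_fn(x) + guidance_scale * guidance_fn(x)
        x = x + step * grad + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Toy target: drift samples toward mean 2.0 (a stand-in for
# "NIR-like" statistics); guidance is switched off here.
target_mean = 2.0
x = guided_langevin_sample(
    np.zeros(1000),
    score_fn=lambda x: target_mean - x,   # score of N(target_mean, I)
    guidance_fn=lambda x: np.zeros_like(x),
)
```

In a real guided-translation setting, `guidance_fn` would encode the domain-specific knowledge (e.g. gradients that preserve high-frequency content) rather than return zeros.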
Related papers
- Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution [54.293362972473595]
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts.
Current approaches to SR tasks are either dedicated to extracting RGB image features or assume similar degradation patterns.
We propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.
arXiv Detail & Related papers (2024-11-19T14:24:03Z)
- RN-SDEs: Limited-Angle CT Reconstruction with Residual Null-Space Diffusion Stochastic Differential Equations [11.83356524790835]
We propose Residual Null-Space Diffusion Stochastic Differential Equations (RN-SDEs).
RN-SDEs are a variant of diffusion models that characterize the diffusion process with mean-reverting stochastic differential equations.
We show that by leveraging learned Mean-Reverting SDEs as a prior, RN-SDEs can restore high-quality images from severe degradation and achieve state-of-the-art performance in most LACT tasks.
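Mean-reverting SDEs of the kind RN-SDEs build on are easy to simulate with Euler-Maruyama; the Ornstein-Uhlenbeck process below is a generic illustration, with the drift rate `theta`, noise level `sigma`, and scalar-per-pixel state chosen for the example, not taken from the paper.

```python
import numpy as np

def simulate_mean_reverting_sde(x0, mu, theta=2.0, sigma=0.3,
                                n_steps=1000, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dx = theta*(mu - x)*dt + sigma*dW.

    The drift pulls the state back toward mu, hence "mean-reverting":
    trajectories settle around mu (in RN-SDE-like models, a prior
    state such as a degraded image) regardless of where they start.
    """
    rng = np.random.default_rng(seed)
    x = np.full_like(mu, x0, dtype=float)
    for _ in range(n_steps):
        dw = np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + theta * (mu - x) * dt + sigma * dw
    return x

# Start far from the target mean; the process relaxes back to it.
final = simulate_mean_reverting_sde(x0=5.0, mu=np.zeros(1000))
```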
arXiv Detail & Related papers (2024-09-20T22:33:36Z)
- NIR-Assisted Image Denoising: A Selective Fusion Approach and A Real-World Benchmark Dataset [53.79524776100983]
Leveraging near-infrared (NIR) images to assist visible RGB image denoising shows the potential to address this issue.
Existing works still struggle with taking advantage of NIR information effectively for real-world image denoising.
We propose an efficient Selective Fusion Module (SFM), which can be plugged into advanced denoising networks.
arXiv Detail & Related papers (2024-04-12T14:54:26Z)
- Hypergraph-Guided Disentangled Spectrum Transformer Networks for Near-Infrared Facial Expression Recognition [31.783671943393344]
We make the first attempt at deep NIR facial expression recognition and propose a novel method called the near-infrared facial expression transformer (NFER-Former).
NFER-Former disentangles the expression information and spectrum information from the input image, so that the expression features can be extracted without the interference of spectrum variation.
We have constructed a large NIR-VIS Facial Expression dataset that includes 360 subjects to better validate the efficiency of NFER-Former.
arXiv Detail & Related papers (2023-12-10T15:15:50Z)
- Rethinking the Domain Gap in Near-infrared Face Recognition [65.7871950460781]
Heterogeneous face recognition (HFR) involves the intricate task of matching face images across the visual domains of visible (VIS) and near-infrared (NIR).
Much of the existing literature on HFR identifies the domain gap as a primary challenge and directs efforts towards bridging it at either the input or feature level.
We observe that large neural networks, unlike their smaller counterparts, when pre-trained on large scale homogeneous VIS data, demonstrate exceptional zero-shot performance in HFR.
arXiv Detail & Related papers (2023-12-01T14:43:28Z)
- Denoising Diffusion Models for Plug-and-Play Image Restoration [135.6359475784627]
This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework.
Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models.
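The plug-and-play pattern DiffPIR builds on alternates a denoising (prior) step with a closed-form data-consistency step. The sketch below substitutes a crude moving-average "denoiser" and a masked least-squares fidelity step for a 1-D inpainting toy; it is a minimal stand-in for the pattern, not DiffPIR's diffusion denoiser.

```python
import numpy as np

def pnp_inpaint(y, mask, n_iters=50, rho=1.0):
    """Plug-and-play restoration: alternate a denoiser (prior) step
    with a closed-form data-consistency step.

    y    : observed signal with missing entries zeroed out
    mask : 1 where y is observed, 0 where it is missing
    rho  : weight balancing the prior against the observations
    """
    x = y.copy()
    kernel = np.ones(5) / 5.0                            # toy smoothing "denoiser"
    for _ in range(n_iters):
        z = np.convolve(x, kernel, mode="same")          # prior step
        x = (mask * y + rho * z) / (mask + rho)          # data-consistency step
    return x

# Recover a smooth signal with half its samples missing.
t = np.linspace(0.0, 1.0, 100)
truth = np.sin(2 * np.pi * t)
mask = (np.arange(100) % 2 == 0).astype(float)
y = truth * mask
x_hat = pnp_inpaint(y, mask)
```

Swapping the smoothing kernel for a learned denoiser (Gaussian or generative, as in DiffPIR) changes only the prior step; the data-consistency step stays the same.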
arXiv Detail & Related papers (2023-05-15T20:24:38Z)
- Physically-Based Face Rendering for NIR-VIS Face Recognition [165.54414962403555]
Near infrared (NIR) to Visible (VIS) face matching is challenging due to the significant domain gaps.
We propose a novel method for paired NIR-VIS facial image generation.
To facilitate the identity feature learning, we propose an IDentity-based Maximum Mean Discrepancy (ID-MMD) loss.
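The ID-MMD loss builds on the standard Maximum Mean Discrepancy. A minimal RBF-kernel squared-MMD estimator, without the identity-based weighting the abstract does not detail, might look like this (the kernel bandwidth `gamma` and sample shapes are illustrative):

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0 / 16):
    """Biased squared-MMD estimate with an RBF kernel.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    where k(a, b) = exp(-gamma * ||a - b||^2). A small value means
    the two samples look alike under the kernel embedding, so
    minimizing MMD pulls two feature distributions together.
    """
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))        # e.g. NIR feature batch
Y_same = rng.standard_normal((200, 8))   # matched distribution
Y_far = rng.standard_normal((200, 8)) + 3.0  # shifted distribution
```

`mmd_rbf(X, Y_same)` is near zero while `mmd_rbf(X, Y_far)` is large, which is the property a domain-alignment loss exploits.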
arXiv Detail & Related papers (2022-11-11T18:48:16Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- Infrared Image Super-Resolution via Heterogeneous Convolutional WGAN [4.6667021835430145]
We present a framework that employs heterogeneous kernel-based super-resolution Wasserstein GAN (HetSRWGAN) for IR image super-resolution.
HetSRWGAN achieves consistently better performance in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2021-09-02T14:01:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.