Multi-Energy Guided Image Translation with Stochastic Differential
Equations for Near-Infrared Facial Expression Recognition
- URL: http://arxiv.org/abs/2312.05908v1
- Date: Sun, 10 Dec 2023 15:17:42 GMT
- Authors: Bingjun Luo, Zewen Wang, Jinpeng Wang, Junjie Zhu, Xibin Zhao, Yue Gao
- Abstract summary: We present NFER-SDE, which transforms facial expression appearance between heterogeneous modalities to tackle the overfitting problem on small-scale NIR data.
NFER-SDE significantly improves the performance of NIR FER and achieves state-of-the-art results on the only two available NIR FER datasets.
- Score: 32.34873680472637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Illumination variation has been a long-term challenge in real-world facial
expression recognition (FER). Under uncontrolled or non-visible light
conditions, near-infrared (NIR) imaging can provide a simple and alternative solution
to obtain high-quality images and supplement the geometric and texture details
that are missing in the visible domain. Due to the lack of existing large-scale
NIR facial expression datasets, directly extending VIS FER methods to the NIR
spectrum may be ineffective. Additionally, previous heterogeneous image
synthesis methods are restricted by low controllability without prior task
knowledge. To tackle these issues, we present the first approach, called
NIR-FER Stochastic Differential Equations (NFER-SDE), which transforms facial
expression appearance between heterogeneous modalities to address the overfitting
problem on small-scale NIR data. NFER-SDE is able to take the whole VIS source
image as input and, together with domain-specific knowledge, guide the
preservation of modality-invariant information in the high-frequency content of
the image. Extensive experiments and ablation studies show that NFER-SDE
significantly improves the performance of NIR FER and achieves state-of-the-art
results on the only two available NIR FER datasets, Oulu-CASIA and Large-HFE.
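The abstract gives no implementation detail, but the core mechanism it describes, steering a score-based (SDE-style) sampler with an extra task-specific guidance term, can be sketched generically. The snippet below is a toy Langevin-dynamics sampler, not the authors' NFER-SDE; `score_fn`, `guidance_fn`, the step size, and the target statistics are all illustrative assumptions.

```python
import numpy as np

def guided_langevin_sample(x0, score_fn, guidance_fn, n_steps=500,
                           step=0.1, guidance_scale=1.0, seed=0):
    """Langevin dynamics with an additional guidance term.

    Each update follows the combined score of the data distribution
    (score_fn) and a task-specific guidance signal (guidance_fn);
    this is the general pattern guided diffusion/SDE samplers use
    to steer generation toward a target domain.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_steps):
        grad = score_fn(x) + guidance_scale * guidance_fn(x)
        x = x + step * grad + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Toy target: drift samples toward mean 2.0 (a stand-in for
# "NIR-like" statistics); guidance is switched off here.
target_mean = 2.0
x = guided_langevin_sample(
    np.zeros(1000),
    score_fn=lambda x: target_mean - x,   # score of N(target_mean, I)
    guidance_fn=lambda x: np.zeros_like(x),
)
```

In a real guided-translation setting, `guidance_fn` would encode the domain-specific knowledge (e.g. gradients that preserve high-frequency content) rather than return zeros.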
Related papers
- Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution [54.293362972473595]
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts.
Current approaches to SR tasks are either dedicated to extracting RGB image features or assume similar degradation patterns.
We propose a Contourlet refinement gate framework to restore infrared modal-specific features while preserving spectral distribution fidelity.
arXiv Detail & Related papers (2024-11-19T14:24:03Z)
- RN-SDEs: Limited-Angle CT Reconstruction with Residual Null-Space Diffusion Stochastic Differential Equations [11.83356524790835]
We propose Residual Null-Space Diffusion Stochastic Differential Equations (RN-SDEs).
RN-SDEs are a variant of diffusion models that characterize the diffusion process with mean-reverting stochastic differential equations.
We show that by leveraging learned Mean-Reverting SDEs as a prior, RN-SDEs can restore high-quality images from severe degradation and achieve state-of-the-art performance in most LACT tasks.
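Mean-reverting SDEs of the kind RN-SDEs build on are easy to simulate with Euler-Maruyama; the Ornstein-Uhlenbeck process below is a generic illustration, with the drift rate `theta`, noise level `sigma`, and scalar-per-pixel state chosen for the example, not taken from the paper.

```python
import numpy as np

def simulate_mean_reverting_sde(x0, mu, theta=2.0, sigma=0.3,
                                n_steps=1000, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dx = theta*(mu - x)*dt + sigma*dW.

    The drift pulls the state back toward mu, hence "mean-reverting":
    trajectories settle around mu (in RN-SDE-like models, a prior
    state such as a degraded image) regardless of where they start.
    """
    rng = np.random.default_rng(seed)
    x = np.full_like(mu, x0, dtype=float)
    for _ in range(n_steps):
        dw = np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + theta * (mu - x) * dt + sigma * dw
    return x

# Start far from the target mean; the process relaxes back to it.
final = simulate_mean_reverting_sde(x0=5.0, mu=np.zeros(1000))
```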
arXiv Detail & Related papers (2024-09-20T22:33:36Z)
- NIR-Assisted Image Denoising: A Selective Fusion Approach and A Real-World Benchmark Dataset [53.79524776100983]
Leveraging near-infrared (NIR) images to assist visible RGB image denoising shows the potential to address this issue.
Existing works still struggle with taking advantage of NIR information effectively for real-world image denoising.
We propose an efficient Selective Fusion Module (SFM), which can be plugged into advanced denoising networks.
arXiv Detail & Related papers (2024-04-12T14:54:26Z)
- Hypergraph-Guided Disentangled Spectrum Transformer Networks for Near-Infrared Facial Expression Recognition [31.783671943393344]
We make the first attempt at deep NIR facial expression recognition and propose a novel method called the near-infrared facial expression transformer (NFER-Former).
NFER-Former disentangles the expression information and spectrum information from the input image, so that the expression features can be extracted without the interference of spectrum variation.
We have constructed a large NIR-VIS Facial Expression dataset that includes 360 subjects to better validate the efficiency of NFER-Former.
arXiv Detail & Related papers (2023-12-10T15:15:50Z)
- Rethinking the Domain Gap in Near-infrared Face Recognition [65.7871950460781]
Heterogeneous face recognition (HFR) involves the intricate task of matching face images across the visual domains of visible (VIS) and near-infrared (NIR).
Much of the existing literature on HFR identifies the domain gap as a primary challenge and directs efforts towards bridging it at either the input or feature level.
We observe that large neural networks, unlike their smaller counterparts, when pre-trained on large scale homogeneous VIS data, demonstrate exceptional zero-shot performance in HFR.
arXiv Detail & Related papers (2023-12-01T14:43:28Z)
- Denoising Diffusion Models for Plug-and-Play Image Restoration [135.6359475784627]
This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework.
Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models.
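The plug-and-play pattern DiffPIR builds on alternates a denoising (prior) step with a closed-form data-consistency step. The sketch below substitutes a crude moving-average "denoiser" and a masked least-squares fidelity step for a 1-D inpainting toy; it is a minimal stand-in for the pattern, not DiffPIR's diffusion denoiser.

```python
import numpy as np

def pnp_inpaint(y, mask, n_iters=50, rho=1.0):
    """Plug-and-play restoration: alternate a denoiser (prior) step
    with a closed-form data-consistency step.

    y    : observed signal with missing entries zeroed out
    mask : 1 where y is observed, 0 where it is missing
    rho  : weight balancing the prior against the observations
    """
    x = y.copy()
    kernel = np.ones(5) / 5.0                            # toy smoothing "denoiser"
    for _ in range(n_iters):
        z = np.convolve(x, kernel, mode="same")          # prior step
        x = (mask * y + rho * z) / (mask + rho)          # data-consistency step
    return x

# Recover a smooth signal with half its samples missing.
t = np.linspace(0.0, 1.0, 100)
truth = np.sin(2 * np.pi * t)
mask = (np.arange(100) % 2 == 0).astype(float)
y = truth * mask
x_hat = pnp_inpaint(y, mask)
```

Swapping the smoothing kernel for a learned denoiser (Gaussian or generative, as in DiffPIR) changes only the prior step; the data-consistency step stays the same.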
arXiv Detail & Related papers (2023-05-15T20:24:38Z)
- Physically-Based Face Rendering for NIR-VIS Face Recognition [165.54414962403555]
Near infrared (NIR) to Visible (VIS) face matching is challenging due to the significant domain gaps.
We propose a novel method for paired NIR-VIS facial image generation.
To facilitate the identity feature learning, we propose an IDentity-based Maximum Mean Discrepancy (ID-MMD) loss.
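The ID-MMD loss builds on the standard Maximum Mean Discrepancy. A minimal RBF-kernel squared-MMD estimator, without the identity-based weighting the abstract does not detail, might look like this (the kernel bandwidth `gamma` and sample shapes are illustrative):

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0 / 16):
    """Biased squared-MMD estimate with an RBF kernel.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    where k(a, b) = exp(-gamma * ||a - b||^2). A small value means
    the two samples look alike under the kernel embedding, so
    minimizing MMD pulls two feature distributions together.
    """
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))        # e.g. NIR feature batch
Y_same = rng.standard_normal((200, 8))   # matched distribution
Y_far = rng.standard_normal((200, 8)) + 3.0  # shifted distribution
```

`mmd_rbf(X, Y_same)` is near zero while `mmd_rbf(X, Y_far)` is large, which is the property a domain-alignment loss exploits.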
arXiv Detail & Related papers (2022-11-11T18:48:16Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared and visible images, we present a feature Interaction Fusion Module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Towards Homogeneous Modality Learning and Multi-Granularity Information Exploration for Visible-Infrared Person Re-Identification [16.22986967958162]
Visible-infrared person re-identification (VI-ReID) is a challenging and essential task, which aims to retrieve a set of person images over visible and infrared camera views.
Previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
In this work, we address the cross-modality matching problem with Aligned Grayscale Modality (AGM), a unified dark-line spectrum that reformulates visible-infrared dual-mode learning as a gray-gray single-mode learning problem.
arXiv Detail & Related papers (2022-04-11T03:03:19Z)
- Infrared Image Super-Resolution via Heterogeneous Convolutional WGAN [4.6667021835430145]
We present a framework that employs heterogeneous kernel-based super-resolution Wasserstein GAN (HetSRWGAN) for IR image super-resolution.
HetSRWGAN achieves consistently better performance in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2021-09-02T14:01:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.