LRDif: Diffusion Models for Under-Display Camera Emotion Recognition
- URL: http://arxiv.org/abs/2402.00250v1
- Date: Thu, 1 Feb 2024 00:19:57 GMT
- Title: LRDif: Diffusion Models for Under-Display Camera Emotion Recognition
- Authors: Zhifeng Wang and Kaihao Zhang and Ramesh Sankaranarayana
- Abstract summary: This study introduces LRDif, a novel diffusion-based framework designed specifically for facial expression recognition (FER)
To address the inherent challenges posed by UDC's image degradation, LRDif employs a two-stage training strategy that integrates a condensed preliminary extraction network (FPEN) and an agile transformer network (UDCformer)
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study introduces LRDif, a novel diffusion-based framework designed
specifically for facial expression recognition (FER) within the context of
under-display cameras (UDC). To address the inherent challenges posed by UDC's
image degradation, such as reduced sharpness and increased noise, LRDif employs
a two-stage training strategy that integrates a condensed preliminary
extraction network (FPEN) and an agile transformer network (UDCformer) to
effectively identify emotion labels from UDC images. By harnessing the robust
distribution mapping capabilities of Diffusion Models (DMs) and the spatial
dependency modeling strength of transformers, LRDif effectively overcomes the
obstacles of noise and distortion inherent in UDC environments. In comprehensive
experiments on standard FER datasets, including RAF-DB, KDEF, and FERPlus,
LRDif demonstrates state-of-the-art performance, underscoring its potential in
advancing FER applications. This work not only addresses a significant gap in
the literature by tackling the UDC challenge in FER but also sets a new
benchmark for future research in the field.
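As a concrete illustration of the "distribution mapping" that diffusion models provide, the sketch below implements the closed-form DDPM forward noising process for a single scalar value under a linear noise schedule. This is a generic illustration, not LRDif's actual code; the schedule parameters (`beta_start`, `beta_end`, 1000 steps) are common defaults assumed here.

```python
import math
import random

def make_alpha_bars(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bar = 1.0
    bars = []
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        alpha_bar *= 1.0 - beta
        bars.append(alpha_bar)
    return bars

def forward_noise(x0: float, t: int, alpha_bars, eps: float) -> float:
    """Closed-form forward process: x_t = sqrt(a_bar)*x0 + sqrt(1-a_bar)*eps."""
    a = alpha_bars[t]
    return math.sqrt(a) * x0 + math.sqrt(1.0 - a) * eps

bars = make_alpha_bars(1000)
x0 = 0.8  # a scalar stand-in for a clean pixel value
xt = forward_noise(x0, 999, bars, eps=random.gauss(0.0, 1.0))
```

At large `t`, `alpha_bars[t]` approaches zero, so `x_t` is dominated by the Gaussian noise term; the learned reverse process then maps that noise distribution back toward the data distribution.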
Related papers
- TIR-Diffusion: Diffusion-based Thermal Infrared Image Denoising via Latent and Wavelet Domain Optimization [11.970228442183476]
We propose a diffusion-based TIR image denoising framework. Our method fine-tunes the model via a novel loss function combining latent-space and discrete wavelet transform (DWT) / dual-tree complex wavelet transform (DTCWT) losses. Experiments on benchmark datasets demonstrate superior performance of our approach compared to state-of-the-art denoising methods.
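The summary above describes combining a latent-space loss with wavelet-domain losses. As a rough sketch of how such a combined objective can be formed, the following pure-Python snippet pairs a plain MSE with a one-level Haar DWT term on 1-D signals. The weighting `lam` and the exact sub-band terms are assumptions, and the paper's DTCWT loss is not reproduced here.

```python
import math

def haar_dwt1d(signal):
    """One-level orthonormal Haar transform: average/detail sub-bands."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def combined_loss(pred, target, lam=0.5):
    """Signal-domain MSE plus a wavelet-domain term on both sub-bands."""
    pa, pd = haar_dwt1d(pred)
    ta, td = haar_dwt1d(target)
    return mse(pred, target) + lam * (mse(pa, ta) + mse(pd, td))
```

The wavelet term penalizes errors separately in smooth (approximation) and edge-like (detail) content, which is the intuition behind optimizing denoisers in the wavelet domain.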
arXiv Detail & Related papers (2025-07-30T06:27:32Z) - Controllable Reference-Based Real-World Remote Sensing Image Super-Resolution with Generative Diffusion Priors [13.148815217684277]
Super-resolution (SR) techniques can enhance the spatial resolution of remote sensing images by utilizing low-resolution (LR) images to reconstruct high-resolution (HR) images. Existing reference-based SR (RefSR) methods struggle with real-world complexities, such as the cross-sensor resolution gap and significant land cover changes. We propose CRefDiff, a novel controllable reference-based diffusion model for real-world remote sensing image SR.
arXiv Detail & Related papers (2025-06-30T12:45:28Z) - InstaRevive: One-Step Image Enhancement via Dynamic Score Matching [66.97989469865828]
InstaRevive is an image enhancement framework that employs score-based diffusion distillation to harness potent generative capability.
Our framework delivers high-quality and visually appealing results across a diverse array of challenging tasks and datasets.
arXiv Detail & Related papers (2025-04-22T01:19:53Z) - FreSca: Scaling in Frequency Space Enhances Diffusion Models [55.75504192166779]
This paper explores frequency-based control within latent diffusion models. We introduce FreSca, a novel framework that decomposes the noise difference into low- and high-frequency components. FreSca operates without any model retraining or architectural change, offering model- and task-agnostic control.
arXiv Detail & Related papers (2025-04-02T22:03:11Z) - Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual [47.141811103506036]
We propose a novel zero-shot image restoration scheme dubbed Reconciling Model in Dual (RDMD).
RDMD uses only a single pre-trained diffusion model to construct two regularizers.
Our proposed method could achieve superior results compared to existing approaches on both the FFHQ and ImageNet datasets.
arXiv Detail & Related papers (2025-03-03T08:25:22Z) - C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation [23.63992950769041]
C-DiffSET is a framework leveraging a pretrained Latent Diffusion Model (LDM) extensively trained on natural images.
Remarkably, we find that the pretrained VAE encoder aligns SAR and EO images in the same latent space, even with varying noise levels in SAR inputs.
arXiv Detail & Related papers (2024-11-16T12:28:40Z) - Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z) - Retinex-RAWMamba: Bridging Demosaicing and Denoising for Low-Light RAW Image Enhancement [71.13353154514418]
Low-light image enhancement, particularly in cross-domain tasks such as mapping from the raw domain to the sRGB domain, remains a significant challenge.
We present a novel Mamba scanning mechanism, called RAWMamba, to effectively handle raw images with different color filter arrays (CFAs).
We also present a Retinex Decomposition Module (RDM) grounded in Retinex prior, which decouples illumination from reflectance to facilitate more effective denoising and automatic non-linear exposure correction.
arXiv Detail & Related papers (2024-09-11T06:12:03Z) - LLDif: Diffusion Models for Low-light Emotion Recognition [15.095166627983566]
This paper introduces LLDif, a novel diffusion-based facial expression recognition (FER) framework tailored for extremely low-light (LL) environments.
Images captured under such conditions often suffer from low brightness and significantly reduced contrast, presenting challenges to conventional methods.
LLDif addresses these issues with a novel two-stage training process that combines a Label-aware CLIP (LA-CLIP), an embedding prior network (PNET) and a transformer-based network adept at handling the noise of low-light images.
arXiv Detail & Related papers (2024-08-08T05:41:09Z) - AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error [15.46508882889489]
A key enabler for generating high-resolution images with low computational cost has been the development of latent diffusion models (LDMs).
LDMs perform the denoising process in the low-dimensional latent space of a pre-trained autoencoder (AE) instead of the high-dimensional image space.
We propose a novel detection method which exploits an inherent component of LDMs: the AE used to transform images between image and latent space.
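AEROBLADE's core observation, that an LDM's autoencoder reconstructs its own generations with unusually low error, can be sketched with a toy quantizing "autoencoder" standing in for the learned VAE. The toy AE, the threshold value, and the 1-D pixel lists are illustrative assumptions, not the paper's actual components.

```python
def toy_autoencoder(image, levels=9):
    """Stand-in AE: snaps each pixel to a k/8 grid; real LDMs use a learned VAE."""
    return [round(p * (levels - 1)) / (levels - 1) for p in image]

def reconstruction_error(image):
    """Mean squared error between an image and its AE reconstruction."""
    rec = toy_autoencoder(image)
    return sum((a - b) ** 2 for a, b in zip(image, rec)) / len(image)

def looks_generated(image, threshold=1e-4):
    """Detection rule: the AE reproduces images from its own output
    manifold almost perfectly, so a tiny error flags a likely LDM image."""
    return reconstruction_error(image) < threshold
```

Here an image lying exactly on the AE's output grid (analogous to an LDM generation) reconstructs with zero error, while an off-grid (analogous to real) image does not.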
arXiv Detail & Related papers (2024-01-31T14:36:49Z) - Reinforcement Learning for SAR View Angle Inversion with Differentiable SAR Renderer [7.112962861847319]
This study aims to reverse radar view angles in synthetic aperture radar (SAR) images given a target model.
An electromagnetic simulator named differentiable SAR renderer (DSR) is embedded to facilitate the interaction between the agent and the environment.
arXiv Detail & Related papers (2024-01-02T11:47:58Z) - LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models [54.93010869546011]
We propose to leverage the pre-trained latent diffusion model to perform the neural ISP for enhancing extremely low-light images.
Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules.
We observe different roles of UNet denoising and decoder reconstruction in the latent diffusion model, which inspires us to decompose the low-light image enhancement task into latent-space low-frequency content generation and decoding-phase high-frequency detail maintenance.
arXiv Detail & Related papers (2023-12-02T04:31:51Z) - Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z) - DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion [2.458437232470188]
Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques.
We propose a novel approach for class-conditional image generation using GANs called DuDGAN, which incorporates a dual diffusion-based noise injection process.
Our method outperforms state-of-the-art conditional GAN models for image generation in terms of performance.
arXiv Detail & Related papers (2023-05-24T07:59:44Z) - Denoising Diffusion Models for Plug-and-Play Image Restoration [135.6359475784627]
This paper proposes DiffPIR, which integrates the traditional plug-and-play method into the diffusion sampling framework.
Compared to plug-and-play IR methods that rely on discriminative Gaussian denoisers, DiffPIR is expected to inherit the generative ability of diffusion models.
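The plug-and-play alternation that DiffPIR builds on can be sketched as iterating a denoising (prior) step with a data-consistency (fidelity) step. In the snippet below, a 3-tap median filter stands in for the diffusion denoiser and the forward operator is the identity; both are simplifying assumptions, not DiffPIR's actual components.

```python
def median3(x):
    """Toy 'denoiser' prior: 3-tap median filter (stand-in for a diffusion denoiser)."""
    out = list(x)
    for i in range(1, len(x) - 1):
        out[i] = sorted(x[i - 1:i + 2])[1]
    return out

def data_consistency(x, y, rho):
    """Proximal (fidelity) step for A = identity: pull x toward the measurement y."""
    return [(xi + rho * yi) / (1.0 + rho) for xi, yi in zip(x, y)]

def plug_and_play(y, iters=20, rho=1.0):
    """Alternate prior and fidelity steps, as in half-quadratic splitting."""
    x = list(y)
    for _ in range(iters):
        x = median3(x)                    # prior / denoising step
        x = data_consistency(x, y, rho)   # data-fidelity step
    return x
```

On a measurement with an impulsive outlier, the prior step suppresses the spike while the fidelity step keeps the estimate anchored to the data, so the outlier is attenuated rather than fully trusted.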
arXiv Detail & Related papers (2023-05-15T20:24:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.