CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using
Score-Based Diffusion Models
- URL: http://arxiv.org/abs/2402.06106v1
- Date: Thu, 8 Feb 2024 23:51:49 GMT
- Title: CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using
Score-Based Diffusion Models
- Authors: Maitreya Suin, Rama Chellappa
- Abstract summary: Recent generative-prior-based methods have shown promising blind face restoration performance.
Generating fine-grained facial details faithful to inputs remains a challenging problem.
We introduce a diffusion-based-prior inside a VQGAN architecture that focuses on learning the distribution over uncorrupted latent embeddings.
- Score: 57.9771859175664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent generative-prior-based methods have shown promising blind face
restoration performance. They usually project the degraded images to the latent
space and then decode high-quality faces either by single-stage latent
optimization or directly from the encoding. Generating fine-grained facial
details faithful to inputs remains a challenging problem. Most existing methods
produce either overly smooth outputs or alter the identity as they attempt to
balance between generation and reconstruction. This may be attributed to the
typical trade-off between quality and resolution in the latent space. If the
latent space is highly compressed, the decoded output is more robust to
degradations but shows worse fidelity. On the other hand, a more flexible
latent space can capture intricate facial details better, but is extremely
difficult to optimize for highly degraded faces using existing techniques. To
address these issues, we introduce a diffusion-based-prior inside a VQGAN
architecture that focuses on learning the distribution over uncorrupted latent
embeddings. With such knowledge, we iteratively recover the clean embedding
conditioned on the degraded counterpart. Furthermore, to ensure the reverse
diffusion trajectory does not deviate from the underlying identity, we train a
separate Identity Recovery Network and use its output to constrain the reverse
diffusion process. Specifically, using a learnable latent mask, we add
gradients from a face-recognition network to a subset of latent features that
correlates with the finer identity-related details in the pixel space, leaving
the other features untouched. Disentanglement between perception and fidelity
in the latent space allows us to achieve the best of both worlds. We perform
extensive evaluations on multiple real and synthetic datasets to validate the
superiority of our approach.
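To make the pipeline described in the abstract concrete, below is a minimal PyTorch-style sketch (not the authors' released code) of the sampling loop it outlines: a conditional diffusion prior iteratively denoises a VQGAN latent conditioned on the degraded embedding, and gradients from a face-recognition network are injected only through a learnable latent mask so that the remaining features are left untouched. All module names (encoder, denoiser, decoder, id_recovery_net, face_rec_net), the guidance weight, and the DDPM step details are illustrative assumptions, not the paper's actual interfaces.

```python
# Illustrative sketch of identity-guided conditional latent refinement.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentRefiner(nn.Module):
    def __init__(self, encoder, denoiser, decoder, id_recovery_net, face_rec_net,
                 betas, latent_channels, id_weight=0.1):
        super().__init__()
        self.encoder = encoder                  # VQGAN encoder: degraded image -> latent
        self.denoiser = denoiser                # conditional noise predictor eps(z_t, t, z_cond)
        self.decoder = decoder                  # VQGAN decoder: latent -> image
        self.id_recovery_net = id_recovery_net  # predicts an identity-faithful reference face
        self.face_rec_net = face_rec_net        # frozen face-recognition feature extractor
        self.register_buffer("betas", betas)                               # (T,)
        self.register_buffer("alpha_bars", torch.cumprod(1.0 - betas, 0))  # (T,)
        # Learnable mask selecting the identity-correlated subset of latent channels.
        self.latent_mask_logits = nn.Parameter(torch.zeros(latent_channels))
        self.id_weight = id_weight

    def _ddpm_step(self, z_t, eps_pred, t):
        # Standard DDPM reverse-step mean (posterior variance simplified for brevity).
        beta_t, alpha_bar_t = self.betas[t], self.alpha_bars[t]
        mean = (z_t - beta_t / torch.sqrt(1.0 - alpha_bar_t) * eps_pred) / torch.sqrt(1.0 - beta_t)
        if t > 0:
            mean = mean + torch.sqrt(beta_t) * torch.randn_like(z_t)
        return mean

    def restore(self, degraded_img, num_steps):
        with torch.no_grad():
            z_cond = self.encoder(degraded_img)          # conditioning latent from the degraded input
            id_ref = self.face_rec_net(self.id_recovery_net(degraded_img))  # target identity embedding
        mask = torch.sigmoid(self.latent_mask_logits).view(1, -1, 1, 1)     # soft mask in [0, 1]
        z_t = torch.randn_like(z_cond)                   # start the reverse trajectory from noise
        for t in reversed(range(num_steps)):
            with torch.no_grad():
                eps_pred = self.denoiser(z_t, t, z_cond)
                z_t = self._ddpm_step(z_t, eps_pred, t)
            # Identity guidance: nudge only the masked latent channels toward the
            # identity predicted by the recovery network; other channels stay untouched.
            z_t = z_t.detach().requires_grad_(True)
            id_pred = self.face_rec_net(self.decoder(z_t))
            id_loss = 1.0 - F.cosine_similarity(id_pred, id_ref, dim=-1).mean()
            grad = torch.autograd.grad(id_loss, z_t)[0]
            z_t = (z_t - self.id_weight * mask * grad).detach()
        return self.decoder(z_t)
```

Restricting the guidance gradient to the masked channels is what the abstract's perception/fidelity disentanglement amounts to in this sketch: the diffusion prior is responsible for perceptual quality everywhere, while identity gradients only touch the features that correlate with fine identity details.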
Related papers
- Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Our approach is relatively unified, which makes it resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z) - DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration [62.44659039265439]
We propose a Diffusion-Information-Diffusion framework to tackle blind face restoration.
DiffMAC achieves high-generalization face restoration in diverse degraded scenes and heterogeneous domains.
Results demonstrate the superiority of DiffMAC over state-of-the-art methods.
arXiv Detail & Related papers (2024-03-15T08:44:15Z) - SARGAN: Spatial Attention-based Residuals for Facial Expression
Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations of existing facial expression manipulation methods from three perspectives.
We exploit a symmetric encoder-decoder network to attend to facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z) - DifFace: Blind Face Restoration with Diffused Error Contraction [62.476329680424975]
DifFace is capable of coping with unseen and complex degradations more gracefully without complicated loss designs.
It is superior to current state-of-the-art methods, especially in cases with severe degradations.
arXiv Detail & Related papers (2022-12-13T11:52:33Z) - High-resolution Face Swapping via Latent Semantics Disentanglement [50.23624681222619]
We present a novel high-resolution, hallucination-free face swapping method using the inherent prior knowledge of a pre-trained GAN model.
We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator.
We extend our method to video face swapping by enforcing two spatio-temporal constraints on the latent space and the image space.
arXiv Detail & Related papers (2022-03-30T00:33:08Z) - SuperFront: From Low-resolution to High-resolution Frontal Face
Synthesis [65.35922024067551]
We propose a generative adversarial network (GAN)-based model to generate high-quality, identity-preserving frontal faces.
Specifically, we propose SuperFront-GAN to synthesize a high-resolution (HR), frontal face from one-to-many LR faces with various poses.
We integrate a super-resolution side-view module into SF-GAN to preserve identity information and fine details of the side-views in HR space.
arXiv Detail & Related papers (2020-12-07T23:30:28Z)