CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using
Score-Based Diffusion Models
- URL: http://arxiv.org/abs/2402.06106v1
- Date: Thu, 8 Feb 2024 23:51:49 GMT
- Title: CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using
Score-Based Diffusion Models
- Authors: Maitreya Suin, Rama Chellappa
- Abstract summary: Recent generative-prior-based methods have shown promising blind face restoration performance.
Generating fine-grained facial details faithful to inputs remains a challenging problem.
We introduce a diffusion-based prior inside a VQGAN architecture that focuses on learning the distribution over uncorrupted latent embeddings.
- Score: 57.9771859175664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent generative-prior-based methods have shown promising blind face
restoration performance. They usually project the degraded images to the latent
space and then decode high-quality faces either by single-stage latent
optimization or directly from the encoding. Generating fine-grained facial
details faithful to inputs remains a challenging problem. Most existing methods
produce either overly smooth outputs or alter the identity as they attempt to
balance between generation and reconstruction. This may be attributed to the
typical trade-off between quality and resolution in the latent space. If the
latent space is highly compressed, the decoded output is more robust to
degradations but shows worse fidelity. On the other hand, a more flexible
latent space can capture intricate facial details better, but is extremely
difficult to optimize for highly degraded faces using existing techniques. To
address these issues, we introduce a diffusion-based prior inside a VQGAN
architecture that focuses on learning the distribution over uncorrupted latent
embeddings. With this prior, we iteratively recover the clean embedding
conditioned on its degraded counterpart. Furthermore, to ensure the reverse
diffusion trajectory does not deviate from the underlying identity, we train a
separate Identity Recovery Network and use its output to constrain the reverse
diffusion process. Specifically, using a learnable latent mask, we add
gradients from a face-recognition network to a subset of latent features that
correlates with the finer identity-related details in the pixel space, leaving
the other features untouched. Disentanglement between perception and fidelity
in the latent space allows us to achieve the best of both worlds. We perform
extensive evaluations on multiple real and synthetic datasets to validate the
superiority of our approach.
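The abstract describes a reverse-diffusion loop over latent embeddings, conditioned on the degraded latent, with identity gradients applied only to a masked subset of features. The following toy sketch illustrates that control flow under loud assumptions: the noise estimate is a crude stand-in for the paper's learned conditional network, `id_grad` stands in for gradients from a face-recognition loss, and `mask` plays the role of the learnable latent mask; the schedule and update rule are deliberately simplified (per-step `alpha` rather than a cumulative product). It is not the authors' implementation.

```python
import math
import random

def masked_guided_reverse_step(z_t, z_degraded, mask, id_grad, t, betas,
                               guidance_scale=0.1):
    """One toy reverse-diffusion step over a latent vector (list of floats).

    Hypothetical stand-ins for the paper's components:
      - the noise estimate below is just the deviation from the degraded
        latent (the real method uses a learned conditional network);
      - id_grad mimics gradients from an identity (face-recognition) loss;
      - mask is the learnable latent mask in [0, 1]: identity gradients
        touch only the features it selects, leaving the rest untouched.
    """
    beta = betas[t]
    alpha = 1.0 - beta
    # Toy conditional noise estimate: deviation from the degraded latent.
    eps_hat = [(z - c) for z, c in zip(z_t, z_degraded)]
    # Simplified DDPM-style mean update using the noise estimate.
    mean = [(z - beta / math.sqrt(1.0 - alpha) * e) / math.sqrt(alpha)
            for z, e in zip(z_t, eps_hat)]
    # Identity guidance gated by the latent mask: only the masked subset
    # of features is pushed along the face-recognition gradient.
    guided = [m - guidance_scale * g * w
              for m, g, w in zip(mean, id_grad, mask)]
    if t > 0:  # add noise at every step except the final one
        sigma = math.sqrt(beta)
        guided = [g + sigma * random.gauss(0.0, 1.0) for g in guided]
    return guided

# Toy usage: refine a 4-dim latent over 10 steps.
random.seed(0)
betas = [0.02] * 10
z = [random.gauss(0.0, 1.0) for _ in range(4)]   # start from noise
z_lq = [0.5, -0.2, 0.1, 0.3]    # latent of the degraded input (conditioning)
mask = [1.0, 0.0, 0.0, 1.0]     # identity-relevant features only
id_grad = [0.3, 0.3, 0.3, 0.3]  # toy identity-loss gradient
for t in reversed(range(len(betas))):
    z = masked_guided_reverse_step(z, z_lq, mask, id_grad, t, betas)
```

Note how the mask disentangles the two objectives: unmasked features follow only the degradation-conditioned trajectory (fidelity), while masked features additionally receive identity gradients (perception), mirroring the "best of both worlds" claim.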
Related papers
- DynFaceRestore: Balancing Fidelity and Quality in Diffusion-Guided Blind Face Restoration with Dynamic Blur-Level Mapping and Guidance [23.898755072744425]
Blind Face Restoration aims to recover high-fidelity, detail-rich facial images from unknown degraded inputs.
We propose DynFaceRestore, a novel blind face restoration approach that learns to map any blindly degraded input to blurry images.
DynFaceRestore achieves state-of-the-art performance in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2025-07-18T10:16:08Z)
- LAFR: Efficient Diffusion-based Blind Face Restoration via Latent Codebook Alignment Adapter [52.93785843453579]
Blind face restoration from low-quality (LQ) images is a challenging task that requires high-fidelity image reconstruction and the preservation of facial identity.
We propose LAFR, a novel codebook-based latent space adapter that aligns the latent distribution of LQ images with that of HQ counterparts.
We show that lightweight finetuning of a diffusion prior on just 0.9% of the FFHQ dataset is sufficient to achieve results comparable to state-of-the-art methods.
arXiv Detail & Related papers (2025-05-29T14:11:16Z)
- DiffusionReward: Enhancing Blind Face Restoration through Reward Feedback Learning [40.641049729447175]
We introduce a ReFL framework, named DiffusionReward, to the Blind Face Restoration task for the first time.
The core of our framework is the Face Reward Model (FRM), which is trained using carefully annotated data.
Experiments on synthetic and wild datasets demonstrate that our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-05-23T13:53:23Z)
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
- DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration [62.44659039265439]
We propose a Diffusion-Information-Diffusion framework to tackle blind face restoration.
DiffMAC achieves high-generalization face restoration in diverse degraded scenes and heterogeneous domains.
Results demonstrate the superiority of DiffMAC over state-of-the-art methods.
arXiv Detail & Related papers (2024-03-15T08:44:15Z)
- SARGAN: Spatial Attention-based Residuals for Facial Expression Manipulation [1.7056768055368383]
We present a novel method named SARGAN that addresses the limitations from three perspectives.
We exploited a symmetric encoder-decoder network to attend facial features at multiple scales.
Our proposed model performs significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2023-03-30T08:15:18Z)
- DifFace: Blind Face Restoration with Diffused Error Contraction [62.476329680424975]
DifFace is capable of coping with unseen and complex degradations more gracefully without complicated loss designs.
It is superior to current state-of-the-art methods, especially in cases with severe degradations.
arXiv Detail & Related papers (2022-12-13T11:52:33Z)
- High-resolution Face Swapping via Latent Semantics Disentanglement [50.23624681222619]
We present a novel high-resolution hallucination face swapping method using the inherent prior knowledge of a pre-trained GAN model.
We explicitly disentangle the latent semantics by utilizing the progressive nature of the generator.
We extend our method to video face swapping by enforcing two-temporal constraints on the latent space and the image space.
arXiv Detail & Related papers (2022-03-30T00:33:08Z)
- SuperFront: From Low-resolution to High-resolution Frontal Face Synthesis [65.35922024067551]
We propose a generative adversarial network (GAN) -based model to generate high-quality, identity preserving frontal faces.
Specifically, we propose SuperFront-GAN to synthesize a high-resolution (HR), frontal face from one-to-many LR faces with various poses.
We integrate a super-resolution side-view module into SF-GAN to preserve identity information and fine details of the side-views in HR space.
arXiv Detail & Related papers (2020-12-07T23:30:28Z)