DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration
- URL: http://arxiv.org/abs/2305.04517v2
- Date: Tue, 8 Aug 2023 15:50:11 GMT
- Title: DiffBFR: Bootstrapping Diffusion Model Towards Blind Face Restoration
- Authors: Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li, Tiande Guo,
Xuecheng Nie
- Abstract summary: We propose DiffBFR to introduce the Diffusion Probabilistic Model (DPM) for Blind Face Restoration (BFR).
DPM avoids training collapse and can generate samples from long-tail distributions.
It restores identity information from low-quality images and then enhances texture details according to the distribution of real faces.
- Score: 8.253458555695767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blind face restoration (BFR) is important yet challenging. Prior works
prefer GAN-based frameworks for this task because of their balance of quality and
efficiency. However, these methods suffer from poor stability and poor adaptability
to long-tail distributions, failing to simultaneously retain source identity and
restore detail. We propose DiffBFR, which introduces the Diffusion Probabilistic
Model (DPM) into BFR to tackle this problem, given its superiority over GANs in
avoiding training collapse and modeling long-tail distributions. DiffBFR uses a
two-step design that first restores identity information from low-quality (LQ)
images and then enhances texture details according to the distribution of real
faces. This design is implemented with two key components: 1) an Identity
Restoration Module (IRM) that preserves face details in the results. Instead of
denoising from a pure Gaussian distribution with LQ images as the condition during
the reverse process, we propose a novel truncated sampling method that starts from
LQ images with partial noise added. We theoretically prove that this change shrinks
the evidence lower bound of the DPM and thus restores more of the original details.
Building on this proof, two cascaded conditional DPMs with different input sizes are
introduced to strengthen the sampling effect and reduce the training difficulty of
directly generating high-resolution images. 2) A Texture Enhancement Module (TEM)
that polishes the texture of the image. Here an unconditional DPM, an LQ-free model,
is introduced to further force the restorations to appear realistic. We
theoretically prove that this unconditional DPM, trained on pure HQ images, helps
pull the inference images output by the IRM toward the correct distribution in
pixel-level space. Truncated sampling with fractional time steps is used to polish
pixel-level textures while preserving identity information.
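The truncated sampling idea above can be illustrated with a short, self-contained sketch (not the authors' code): the reverse diffusion starts from the LQ image with partial noise added instead of from pure Gaussian noise (IRM), and a second, much shorter truncated pass with an unconditional model polishes texture (TEM). The noise schedule, the toy denoisers, and the truncation steps tau_irm and tau_tem are hypothetical stand-ins chosen for illustration; the paper's cascaded conditional DPMs and LQ conditioning are omitted for brevity.

```python
# Minimal sketch of DiffBFR-style truncated sampling (illustrative, not the authors' code).
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)


def q_sample(x0, t, noise):
    """Forward process: add noise to x0 up to step t (0-indexed)."""
    ab = alpha_bar[t]
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise


def truncated_reverse(x_t, t_start, eps_model):
    """Reverse diffusion from step t_start down to 0.

    Instead of starting at t = T - 1 from pure Gaussian noise, sampling starts
    from a partially noised image at t_start < T - 1, which is the core of the
    truncated sampling idea described in the abstract.
    """
    x = x_t
    for t in reversed(range(t_start + 1)):
        eps = eps_model(x, t)                        # predicted noise
        coef = betas[t] / (1 - alpha_bar[t]).sqrt()
        mean = (x - coef * eps) / alphas[t].sqrt()   # posterior mean
        if t > 0:
            x = mean + betas[t].sqrt() * torch.randn_like(x)
        else:
            x = mean
    return x


# Toy stand-ins for the learned networks (the real IRM denoiser is conditioned
# on the LQ image and cascaded over two resolutions; omitted here).
irm_eps = lambda x, t: torch.zeros_like(x)    # stands in for the IRM's conditional DPM
tem_eps = lambda x, t: torch.zeros_like(x)    # stands in for the TEM's unconditional, HQ-only DPM

lq = torch.rand(1, 3, 64, 64)                 # low-quality input image

# IRM: start the reverse process from the LQ image with partial noise added.
tau_irm = 300                                 # hypothetical truncation step
x_tau = q_sample(lq, tau_irm, torch.randn_like(lq))
identity_restored = truncated_reverse(x_tau, tau_irm, irm_eps)

# TEM: re-noise only up to a small fractional time step and polish texture
# with the unconditional DPM, so identity information is largely preserved.
tau_tem = 50                                  # hypothetical fractional time step
x_tau2 = q_sample(identity_restored, tau_tem, torch.randn_like(identity_restored))
restored = truncated_reverse(x_tau2, tau_tem, tem_eps)
```

Because the second pass re-noises only up to a small fractional time step, the unconditional model can make only local, texture-level changes, which is consistent with the abstract's claim that identity information from the IRM output is preserved.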
Related papers
- LAFR: Efficient Diffusion-based Blind Face Restoration via Latent Codebook Alignment Adapter [52.93785843453579]
Blind face restoration from low-quality (LQ) images is a challenging task that requires high-fidelity image reconstruction and the preservation of facial identity.
We propose LAFR, a novel codebook-based latent space adapter that aligns the latent distribution of LQ images with that of HQ counterparts.
We show that lightweight finetuning of the diffusion prior on just 0.9% of the FFHQ dataset is sufficient to achieve results comparable to state-of-the-art methods.
arXiv Detail & Related papers (2025-05-29T14:11:16Z)
- DR-BFR: Degradation Representation with Diffusion Models for Blind Face Restoration [7.521850476177286]
We equip diffusion models with the capability to decouple various degradations as a degradation prompt from low-quality (LQ) face images.
Our novel restoration scheme, named DR-BFR, guides the denoising of Latent Diffusion Models (LDM) by incorporating Degradation Representation (DR) and content features from LQ images.
DR-BFR significantly outperforms state-of-the-art methods quantitatively and qualitatively across various datasets.
arXiv Detail & Related papers (2024-11-15T15:24:42Z)
- One-step Generative Diffusion for Realistic Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called One-Step Image Rescaling Diffusion (OSIRDiff) for extreme image rescaling.
OSIRDiff performs rescaling operations in the latent space of a pre-trained autoencoder.
It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- Improved Distribution Matching Distillation for Fast Image Synthesis [54.72356560597428]
We introduce DMD2, a set of techniques that lift this limitation and improve DMD training.
First, we eliminate the regression loss and the need for expensive dataset construction.
Second, we integrate a GAN loss into the distillation procedure, discriminating between generated samples and real images.
arXiv Detail & Related papers (2024-05-23T17:59:49Z)
- BlindDiff: Empowering Degradation Modelling in Diffusion Models for Blind Image Super-Resolution [52.47005445345593]
BlindDiff is a DM-based blind SR method to tackle the blind degradation settings in SISR.
BlindDiff seamlessly integrates the MAP-based optimization into DMs.
Experiments on both synthetic and real-world datasets show that BlindDiff achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-03-15T11:21:34Z)
- BFRFormer: Transformer-based generator for Real-World Blind Face Restoration [37.77996097891398]
We propose a Transformer-based blind face restoration method, named BFRFormer, to reconstruct images with more identity-preserved details in an end-to-end manner.
Our method outperforms state-of-the-art methods on a synthetic dataset and four real-world datasets.
arXiv Detail & Related papers (2024-02-29T02:31:54Z)
- Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion [35.21106030549071]
Diffusion Probabilistic Models (DPMs) are a dominant force in text-to-image generation tasks.
We propose an alternative view of state-of-the-art DPMs as a way of inverting advanced Vision-Language Models (VLMs).
By directly optimizing images with the supervision of discriminative VLMs, the proposed method can potentially achieve a better text-image alignment.
arXiv Detail & Related papers (2024-02-26T05:08:40Z)
- Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs).
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z)
- DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior [70.46245698746874]
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks.
DiffBIR decouples the blind image restoration problem into two stages: 1) degradation removal: removing image-independent content; 2) information regeneration: generating the lost image content.
In the first stage, we use restoration modules to remove degradations and obtain high-fidelity restored results.
For the second stage, we propose IRControlNet that leverages the generative ability of latent diffusion models to generate realistic details.
arXiv Detail & Related papers (2023-08-29T07:11:52Z)
- A Unified Conditional Framework for Diffusion-based Image Restoration [39.418415473235235]
We present a unified conditional framework based on diffusion models for image restoration.
We leverage a lightweight UNet to predict initial guidance and the diffusion model to learn the residual of the guidance.
To handle high-resolution images, we propose a simple yet effective inter-step patch-splitting strategy.
arXiv Detail & Related papers (2023-05-31T17:22:24Z)
- Unsupervised Representation Learning from Pre-trained Diffusion Probabilistic Models [83.75414370493289]
Diffusion Probabilistic Models (DPMs) have shown a powerful capacity of generating high-quality image samples.
Diff-AE has been proposed to explore DPMs for representation learning via autoencoding.
We propose Pre-trained DPM AutoEncoding (PDAE) to adapt existing pre-trained DPMs to the decoders for image reconstruction.
arXiv Detail & Related papers (2022-12-26T02:37:38Z)
- DifFace: Blind Face Restoration with Diffused Error Contraction [62.476329680424975]
DifFace is capable of coping with unseen and complex degradations more gracefully without complicated loss designs.
It is superior to current state-of-the-art methods, especially in cases with severe degradations.
arXiv Detail & Related papers (2022-12-13T11:52:33Z)