Towards Robust Blind Face Restoration with Codebook Lookup Transformer
- URL: http://arxiv.org/abs/2206.11253v1
- Date: Wed, 22 Jun 2022 17:58:01 GMT
- Title: Towards Robust Blind Face Restoration with Codebook Lookup Transformer
- Authors: Shangchen Zhou, Kelvin C.K. Chan, Chongyi Li, Chen Change Loy
- Abstract summary: Blind face restoration is a highly ill-posed problem that often requires auxiliary guidance.
We show that a learned discrete codebook prior in a small proxy space casts blind face restoration as a code prediction task.
We propose a Transformer-based prediction network, named CodeFormer, to model global composition and context of the low-quality faces.
- Score: 94.48731935629066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blind face restoration is a highly ill-posed problem that often requires
auxiliary guidance to 1) improve the mapping from degraded inputs to desired
outputs, or 2) complement high-quality details lost in the inputs. In this
paper, we demonstrate that a learned discrete codebook prior in a small proxy
space largely reduces the uncertainty and ambiguity of restoration mapping by
casting blind face restoration as a code prediction task, while providing rich
visual atoms for generating high-quality faces. Under this paradigm, we propose
a Transformer-based prediction network, named CodeFormer, to model global
composition and context of the low-quality faces for code prediction, enabling
the discovery of natural faces that closely approximate the target faces even
when the inputs are severely degraded. To enhance the adaptiveness for
different degradation, we also propose a controllable feature transformation
module that allows a flexible trade-off between fidelity and quality. Thanks to
the expressive codebook prior and global modeling, CodeFormer outperforms the
state of the art in both quality and fidelity, showing superior robustness to
degradation. Extensive experimental results on synthetic and real-world
datasets verify the effectiveness of our method.
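The core idea of the abstract, replacing continuous feature regression with prediction of discrete indices into a learned codebook of high-quality "visual atoms", can be illustrated with a toy nearest-code lookup. This is a minimal sketch of vector quantization, not CodeFormer's actual architecture; the codebook values and the `quantize` helper are illustrative assumptions.

```python
import numpy as np

# Toy discrete codebook: K visual "atoms", each a d-dimensional feature vector.
# In CodeFormer-style methods the codebook is learned (e.g. via a VQ autoencoder).
K, d = 8, 4
rng = np.random.default_rng(0)
codebook = rng.normal(size=(K, d))

def quantize(z: np.ndarray) -> int:
    """Map a continuous feature vector to the index of its nearest code entry."""
    dists = np.linalg.norm(codebook - z, axis=1)
    return int(np.argmin(dists))

# Restoration as code prediction: rather than regressing continuous features
# from a degraded input (ill-posed), predict a discrete index and decode the
# stored high-quality atom, which constrains the output to the prior.
z_degraded = codebook[3] + 0.1 * rng.normal(size=d)  # noisy feature from a degraded face
idx = quantize(z_degraded)
restored = codebook[idx]  # high-quality atom retrieved from the codebook prior
```

In the paper's setting, a Transformer predicts these indices from global context of the whole face rather than by per-vector nearest-neighbor lookup, which is what makes the mapping robust under severe degradation.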
Related papers
- Towards Unsupervised Blind Face Restoration using Diffusion Prior [12.69610609088771]
Blind face restoration methods have shown remarkable performance when trained on large-scale synthetic datasets with supervised learning.
These datasets are often generated by simulating low-quality face images with a handcrafted image degradation pipeline.
In this paper, we address this issue by using only a set of input images, with unknown degradations and without ground truth targets, to fine-tune a restoration model.
Our best model also achieves state-of-the-art results on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-10-06T20:38:14Z)
- DaLPSR: Leverage Degradation-Aligned Language Prompt for Real-World Image Super-Resolution [19.33582308829547]
This paper proposes to leverage degradation-aligned language prompt for accurate, fine-grained, and high-fidelity image restoration.
The proposed method achieves a new state-of-the-art perceptual quality level.
arXiv Detail & Related papers (2024-06-24T09:30:36Z)
- BFRFormer: Transformer-based Generator for Real-World Blind Face Restoration [37.77996097891398]
We propose a Transformer-based blind face restoration method, named BFRFormer, to reconstruct images with more identity-preserved details in an end-to-end manner.
Our method outperforms state-of-the-art methods on a synthetic dataset and four real-world datasets.
arXiv Detail & Related papers (2024-02-29T02:31:54Z)
- CLR-Face: Conditional Latent Refinement for Blind Face Restoration Using Score-Based Diffusion Models [57.9771859175664]
Recent generative-prior-based methods have shown promising blind face restoration performance.
Generating fine-grained facial details faithful to inputs remains a challenging problem.
We introduce a diffusion-based prior inside a VQGAN architecture that focuses on learning the distribution over uncorrupted latent embeddings.
arXiv Detail & Related papers (2024-02-08T23:51:49Z)
- ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration [51.205673783866146]
We present ENTED, a new framework for blind face restoration that aims to restore high-quality and realistic portrait images.
We utilize a texture extraction and distribution framework to transfer high-quality texture features between the degraded input and reference image.
The StyleGAN-like architecture in our framework requires high-quality latent codes to generate realistic images.
arXiv Detail & Related papers (2024-01-13T04:54:59Z)
- RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs [63.991802204929485]
Blind face restoration aims at recovering high-quality face images from those with unknown degradations.
Current algorithms mainly introduce priors to complement high-quality details and achieve impressive progress.
We propose RestoreFormer++, which introduces fully-spatial attention mechanisms to model the contextual information and the interplay with the priors.
We show that RestoreFormer++ outperforms state-of-the-art algorithms on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-08-14T16:04:53Z)
- Exploiting Diffusion Prior for Real-World Image Super-Resolution [75.5898357277047]
We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution.
By employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model.
arXiv Detail & Related papers (2023-05-11T17:55:25Z)
- RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors [14.432465539590481]
Existing dehazing approaches struggle to process real-world hazy images owing to the lack of paired real data and robust priors.
We present a new paradigm for real image dehazing from the perspective of synthesizing more realistic hazy data.
arXiv Detail & Related papers (2023-04-08T12:12:24Z)
- High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.