RefSTAR: Blind Facial Image Restoration with Reference Selection, Transfer, and Reconstruction
- URL: http://arxiv.org/abs/2507.10470v1
- Date: Mon, 14 Jul 2025 16:50:29 GMT
- Title: RefSTAR: Blind Facial Image Restoration with Reference Selection, Transfer, and Reconstruction
- Authors: Zhicun Yin, Junjie Chen, Ming Liu, Zhixin Wang, Fan Li, Renjing Pei, Xiaoming Li, Rynson W. H. Lau, Wangmeng Zuo
- Abstract summary: We present a novel blind facial image restoration method that considers reference selection, transfer, and reconstruction. Experiments on various backbone models demonstrate superior performance, showing better identity preservation ability and reference feature transfer quality.
- Score: 75.00967931348409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blind facial image restoration is highly challenging due to unknown complex degradations and the sensitivity of humans to faces. Although existing methods introduce auxiliary information from generative priors or high-quality reference images, they still struggle with identity preservation, mainly due to improper feature introduction on detailed textures. In this paper, we focus on effectively incorporating appropriate features from high-quality reference images, presenting a novel blind facial image restoration method that considers reference selection, transfer, and reconstruction (RefSTAR). For selection, we construct a reference selection (RefSel) module; to train it, we build the RefSel-HQ dataset through a mask generation pipeline, which contains annotated masks for 10,000 ground truth-reference pairs. For transfer, since vanilla cross-attention admits a trivial solution that ignores the reference, a feature fusion paradigm is designed to force the reference features to be integrated. Finally, we propose a reference image reconstruction mechanism that further ensures the presence of reference image features in the output image, and the cycle consistency loss is redesigned in conjunction with the mask. Extensive experiments on various backbone models demonstrate superior performance, showing better identity preservation ability and reference feature transfer quality. Source code, dataset, and pre-trained models are available at https://github.com/yinzhicun/RefSTAR.
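Two of the mechanisms in the abstract lend themselves to a short illustration: the fusion step that keeps cross-attention from collapsing to a trivial solution that ignores the reference, and the cycle consistency loss restricted to the selection mask. Below is a minimal PyTorch sketch of one plausible realization; it is not the authors' implementation, and every name in it (`MaskedReferenceFusion`, `masked_cycle_loss`) is hypothetical.

```python
# Illustrative sketch only -- one way to realize the ideas described in the
# abstract, not RefSTAR's actual code.
import torch
import torch.nn as nn


class MaskedReferenceFusion(nn.Module):
    """Cross-attend restoration features to reference features, then blend the
    attended result back in under the selection mask, so the reference
    contribution cannot be attenuated to zero."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feat, ref_feat, mask):
        # feat, ref_feat: (B, N, C) token sequences; mask: (B, N, 1) in [0, 1],
        # 1 where the selected reference region (e.g. eyes, mouth) applies.
        attended, _ = self.attn(self.norm(feat), ref_feat, ref_feat)
        # Hard blend under the mask: inside the selected region the output is
        # forced to carry the attended reference features; outside, the
        # original restoration features are kept.
        return mask * attended + (1.0 - mask) * feat


def masked_cycle_loss(restored, reference, mask):
    # Penalize the restored output only inside the mask, encouraging the
    # selected reference regions to survive into the final image.
    return (mask * (restored - reference).abs()).sum() / mask.sum().clamp(min=1.0)


# Usage with dummy tensors:
fusion = MaskedReferenceFusion(dim=256)
feat = torch.randn(2, 1024, 256)       # degraded-image tokens
ref = torch.randn(2, 1024, 256)        # reference-image tokens
mask = torch.rand(2, 1024, 1).round()  # binary selection mask (RefSel-style)
out = fusion(feat, ref, mask)
```

The hard mask blend is the key design choice in this sketch: since the attended reference features are written into the masked region by construction, the network cannot minimize its reconstruction objective while driving the reference contribution to zero.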
Related papers
- Reference-Guided Identity Preserving Face Restoration [54.10295747851343]
Preserving face identity is a critical yet persistent challenge in diffusion-based image restoration. This paper introduces a novel approach that maximizes reference face utility for improved face restoration and identity preservation.
arXiv Detail & Related papers (2025-05-28T02:46:34Z)
- Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration [16.67947885664477]
Blind face restoration aims to recover high-quality facial images from various unidentified sources of degradation. Prior knowledge-based methods, leveraging geometric priors and facial features, have led to advancements in face restoration but often fall short of capturing fine details. We introduce a visual style prompt learning framework that utilizes diffusion probabilistic models to explicitly generate visual prompts.
arXiv Detail & Related papers (2024-12-30T16:05:40Z)
- Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment [40.112548587906005]
We present Refine-by-Align, a first-of-its-kind model that employs a diffusion-based framework to address this challenge. We show that our pipeline greatly pushes the boundary of fine details in image synthesis models.
arXiv Detail & Related papers (2024-11-30T01:26:04Z)
- Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model [55.46927355649013]
We introduce a novel Multi-modal Guided Real-World Face Restoration (MGFR) technique. MGFR can mitigate the generation of false facial attributes and identities. We present the Reface-HQ dataset, comprising over 21,000 high-resolution facial images across 4,800 identities.
arXiv Detail & Related papers (2024-10-05T13:46:56Z)
- ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration [51.205673783866146]
We present ENTED, a new framework for blind face restoration that aims to restore high-quality and realistic portrait images.
We utilize a texture extraction and distribution framework to transfer high-quality texture features between the degraded input and reference image.
The StyleGAN-like architecture in our framework requires high-quality latent codes to generate realistic images.
arXiv Detail & Related papers (2024-01-13T04:54:59Z)
- CoSeR: Bridging Image and Language for Cognitive Super-Resolution [74.24752388179992]
We introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images.
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding.
To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention".
arXiv Detail & Related papers (2023-11-27T16:33:29Z)
- TransRef: Multi-Scale Reference Embedding Transformer for Reference-Guided Image Inpainting [45.31389892299325]
We propose a transformer-based encoder-decoder network, named TransRef, for reference-guided image inpainting. For precise utilization of the reference features for guidance, a reference-patch alignment (Ref-PA) module is proposed to align the patch features of the reference and corrupted images. We construct a publicly accessible benchmark dataset containing 50K pairs of input and reference images.
arXiv Detail & Related papers (2023-06-20T13:31:33Z)
- Mask Reference Image Quality Assessment [8.087355843192109]
Mask Reference IQA (MR-IQA) is a method that masks specific patches of a distorted image and supplements missing patches with the reference image patches.
Our method achieves state-of-the-art performance on the KADID-10k, LIVE, and CSIQ benchmark datasets.
arXiv Detail & Related papers (2023-02-27T13:52:38Z)
- Blind Face Restoration via Deep Multi-scale Component Dictionaries [75.02640809505277]
We propose a deep face dictionary network (termed DFDNet) to guide the restoration process of degraded observations.
DFDNet generates deep dictionaries for perceptually significant face components from high-quality images.
Component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features (see the sketch after this list).
arXiv Detail & Related papers (2020-08-02T07:02:07Z)
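For reference, the component AdaIN operation mentioned in the DFDNet entry above follows the standard adaptive instance normalization of Huang & Belongie (2017): content features are re-normalized to match the channel-wise statistics of the style (here, dictionary) features. A minimal sketch of that general operation, not DFDNet's own code:

```python
import torch


def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5):
    # Standard AdaIN: shift and scale the content features so their
    # per-channel mean/std match those of the style features.
    # content, style: (B, C, H, W)
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    return s_std * (content - c_mean) / c_std + s_mean
```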