HeadsUp! High-Fidelity Portrait Image Super-Resolution
- URL: http://arxiv.org/abs/2510.09924v1
- Date: Fri, 10 Oct 2025 23:48:50 GMT
- Title: HeadsUp! High-Fidelity Portrait Image Super-Resolution
- Authors: Renjie Li, Zihao Zhu, Xiaoyu Wang, Zhengzhong Tu
- Abstract summary: We study the portrait image super-resolution (PortraitISR) problem, and propose HeadsUp, a single-step diffusion model. Specifically, we build our model on top of a single-step diffusion model and develop a face supervision mechanism. We then integrate a reference-based mechanism to help with identity restoration, reducing face ambiguity in low-quality face restoration.
- Score: 25.264194345148365
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Portrait pictures, which typically feature both human subjects and natural backgrounds, are one of the most prevalent forms of photography on social media. Existing image super-resolution (ISR) techniques generally focus either on generic real-world images or strictly aligned facial images (i.e., face super-resolution). In practice, separate models are blended to handle portrait photos: the face specialist model handles the face region, and the general model processes the rest. However, these blending approaches inevitably introduce blending or boundary artifacts around the facial regions due to different model training recipes, while human perception is particularly sensitive to facial fidelity. To overcome these limitations, we study the portrait image super-resolution (PortraitISR) problem, and propose HeadsUp, a single-step diffusion model that is capable of seamlessly restoring and upscaling portrait images in an end-to-end manner. Specifically, we build our model on top of a single-step diffusion model and develop a face supervision mechanism to guide the model in focusing on the facial region. We then integrate a reference-based mechanism to help with identity restoration, reducing face ambiguity in low-quality face restoration. Additionally, we have built a high-quality 4K portrait image ISR dataset dubbed PortraitSR-4K, to support model training and benchmarking for portrait images. Extensive experiments show that HeadsUp achieves state-of-the-art performance on the PortraitISR task while maintaining comparable or higher performance on both general image and aligned face datasets.
Related papers
- High-Quality 3D Head Reconstruction from Any Single Portrait Image [18.035517064261168]
We introduce a novel high-fidelity 3D head reconstruction method from a single portrait image, regardless of perspective, expression, or accessories. Our method demonstrates robust performance across challenging scenarios, including side-face angles and complex accessories.
arXiv Detail & Related papers (2025-03-11T15:08:37Z)
- Towards Consistent and Controllable Image Synthesis for Face Editing [18.646961062736207]
RigFace is a novel approach to control the lighting, facial expression and head pose of a portrait photo. Our model achieves comparable or even superior performance in both identity preservation and photorealism compared to existing face editing models.
arXiv Detail & Related papers (2025-02-04T16:36:07Z)
- AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior [13.27748226506837]
Blind face restoration (BFR) is a fundamental and challenging problem in computer vision.
Recent research endeavors rely on facial image priors from the powerful pretrained text-to-image (T2I) diffusion models.
We propose AuthFace, which achieves highly authentic face restoration results by exploring a face-oriented generative diffusion prior.
arXiv Detail & Related papers (2024-10-13T14:56:13Z)
- Single Image, Any Face: Generalisable 3D Face Generation [59.9369171926757]
We propose a novel model, Gen3D-Face, which generates 3D human faces with unconstrained single image input. To the best of our knowledge, this is the first attempt and benchmark for creating photorealistic 3D human face avatars from single images.
arXiv Detail & Related papers (2024-09-25T14:56:37Z)
- SPARK: Self-supervised Personalized Real-time Monocular Face Capture [6.093606972415841]
Current state-of-the-art approaches can regress parametric 3D face models in real time across a wide range of identities.
We propose a method for high-precision 3D face capture taking advantage of a collection of unconstrained videos of a subject as prior information.
arXiv Detail & Related papers (2024-09-12T12:30:04Z)
- EFHQ: Multi-purpose ExtremePose-Face-HQ dataset [1.8194090162317431]
This work introduces a novel dataset named Extreme Pose Face High-Quality dataset (EFHQ), which includes a maximum of 450k high-quality images of faces at extreme poses.
To produce such a massive dataset, we utilize a novel and meticulous dataset processing pipeline to curate two publicly available datasets.
Our dataset can complement existing datasets on various facial-related tasks, such as facial synthesis with 2D/3D-aware GAN, diffusion-based text-to-image face generation, and face reenactment.
arXiv Detail & Related papers (2023-12-28T18:40:31Z)
- AvatarMe++: Facial Shape and BRDF Inference with Photorealistic Rendering-Aware GANs [119.23922747230193]
We introduce the first method that is able to reconstruct render-ready 3D facial geometry and BRDF from a single "in-the-wild" image.
Our method outperforms existing methods by a significant margin and reconstructs high-resolution 3D faces from a single low-resolution image.
arXiv Detail & Related papers (2021-12-11T11:36:30Z)
- Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails [53.080403912727604]
We propose a multi-stage Progressive Upsampling and Inpainting Generative Adversarial Network, dubbed Pro-UIGAN.
It exploits facial geometry priors to replenish and upsample (8×) the occluded and tiny faces.
Pro-UIGAN achieves visually pleasing HR faces, reaching superior performance in downstream tasks.
arXiv Detail & Related papers (2021-08-02T02:29:24Z)
- Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
- Joint Face Image Restoration and Frontalization for Recognition [79.78729632975744]
In real-world scenarios, many factors may harm face recognition performance, e.g., large pose, bad illumination, low resolution, blur and noise.
Previous efforts usually first restore the low-quality faces to high-quality ones and then perform face recognition.
We propose a Multi-Degradation Face Restoration model to restore frontalized high-quality faces from the given low-quality ones.
arXiv Detail & Related papers (2021-05-12T03:52:41Z)
- Face Hallucination with Finishing Touches [65.14864257585835]
We present a novel Vivid Face Hallucination Generative Adversarial Network (VividGAN) for simultaneously super-resolving and frontalizing tiny non-frontal face images.
VividGAN consists of coarse-level and fine-level Face Hallucination Networks (FHnet) and two discriminators, i.e., Coarse-D and Fine-D.
Experiments demonstrate that our VividGAN achieves photo-realistic frontal HR faces, reaching superior performance in downstream tasks.
arXiv Detail & Related papers (2020-02-09T07:33:48Z)