A Reference-Based 3D Semantic-Aware Framework for Accurate Local Facial Attribute Editing
- URL: http://arxiv.org/abs/2407.18392v2
- Date: Mon, 29 Jul 2024 03:09:02 GMT
- Title: A Reference-Based 3D Semantic-Aware Framework for Accurate Local Facial Attribute Editing
- Authors: Yu-Kai Huang, Yutong Zheng, Yen-Shuo Su, Anudeepsekhar Bolimera, Han Zhang, Fangyi Chen, Marios Savvides
- Abstract summary: We introduce a novel framework that merges latent-based and reference-based editing methods.
Our approach employs a 3D GAN inversion technique to embed attributes from the reference image into a tri-plane space.
A coarse-to-fine inpainting strategy is then applied to preserve the integrity of untargeted areas.
- Score: 19.21301510545666
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Facial attribute editing plays a crucial role in synthesizing realistic faces with specific characteristics while maintaining realistic appearances. Despite advancements, challenges persist in achieving precise, 3D-aware attribute modifications, which are crucial for consistent and accurate representations of faces from different angles. Current methods struggle with semantic entanglement and lack effective guidance for incorporating attributes while maintaining image integrity. To address these issues, we introduce a novel framework that merges the strengths of latent-based and reference-based editing methods. Our approach employs a 3D GAN inversion technique to embed attributes from the reference image into a tri-plane space, ensuring 3D consistency and realistic viewing from multiple perspectives. We utilize blending techniques and predicted semantic masks to locate precise edit regions, merging them with the contextual guidance from the reference image. A coarse-to-fine inpainting strategy is then applied to preserve the integrity of untargeted areas, significantly enhancing realism. Our evaluations demonstrate superior performance across diverse editing tasks, validating our framework's effectiveness in realistic and applicable facial attribute editing.
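To make the blending step described in the abstract concrete, below is a minimal NumPy sketch of compositing a reference-rendered attribute into the source image under a predicted semantic mask. The function name, the Gaussian feathering, and all array shapes are illustrative assumptions, not the authors' implementation; the paper's coarse-to-fine inpainting would further refine the composited region.

```python
# A minimal sketch of mask-guided blending, assuming the reference attribute
# has already been rendered into the source camera pose via tri-plane
# inversion. Names and the feathering heuristic are assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def blend_edit_region(source: np.ndarray,
                      reference_render: np.ndarray,
                      attribute_mask: np.ndarray,
                      feather_sigma: float = 3.0) -> np.ndarray:
    """Composite the reference-rendered attribute into the source image.

    source, reference_render: float32 images in [0, 1], shape (H, W, 3).
    attribute_mask: binary semantic mask of the edit region, shape (H, W).
    """
    # Soften the mask boundary so the coarse composite has no hard seam.
    soft = gaussian_filter(attribute_mask.astype(np.float32), feather_sigma)
    soft = np.clip(soft, 0.0, 1.0)[..., None]          # (H, W, 1)
    return soft * reference_render + (1.0 - soft) * source

if __name__ == "__main__":
    h, w = 256, 256
    src = np.random.rand(h, w, 3).astype(np.float32)
    ref = np.random.rand(h, w, 3).astype(np.float32)
    mask = np.zeros((h, w), dtype=np.float32)
    mask[90:170, 80:180] = 1.0                         # e.g. a predicted "nose" region
    coarse = blend_edit_region(src, ref, mask)
    print(coarse.shape, coarse.min() >= 0.0, coarse.max() <= 1.0)
```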
Related papers
- Revealing Directions for Text-guided 3D Face Editing [52.85632020601518]
3D face editing is a significant task in multimedia, aimed at the manipulation of 3D face models across various control signals.
We present Face Clan, a text-general approach for generating and manipulating 3D faces based on arbitrary attribute descriptions.
Our method offers a precisely controllable manipulation method, allowing users to intuitively customize regions of interest with the text description.
arXiv Detail & Related papers (2024-10-07T12:04:39Z)
- Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning [40.6806832534633]
We propose an efficient, plug-and-play, 3D-aware face editing framework based on attribute-specific prompt learning.
Our proposed framework generates high-quality images with 3D awareness and view consistency while maintaining attribute-specific features.
arXiv Detail & Related papers (2024-06-06T18:01:30Z)
- Reference-Based 3D-Aware Image Editing with Triplanes [15.222454412573455]
Generative Adversarial Networks (GANs) have emerged as powerful tools for high-quality image generation and real image editing by manipulating their latent spaces.
Recent advancements in GANs include 3D-aware models such as EG3D, which feature efficient triplane-based architectures capable of reconstructing 3D geometry from single images.
This study explores and demonstrates the effectiveness of the triplane space for advanced reference-based edits; the core tri-plane lookup is sketched after this entry.
arXiv Detail & Related papers (2024-04-04T17:53:33Z)
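The entry above builds on EG3D-style tri-plane representations. The following sketch illustrates the core tri-plane lookup: project a 3D point onto three axis-aligned feature planes, sample each bilinearly, and sum the results. Plane resolution and channel count are arbitrary assumptions.

```python
# A minimal sketch of EG3D-style tri-plane feature lookup; not the paper's code.
import numpy as np

def bilinear_sample(plane: np.ndarray, u: float, v: float) -> np.ndarray:
    """Sample plane of shape (R, R, C) at continuous coords u, v in [-1, 1]."""
    r = plane.shape[0]
    # Map [-1, 1] to pixel coordinates [0, R - 1].
    x = (u + 1.0) * 0.5 * (r - 1)
    y = (v + 1.0) * 0.5 * (r - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, r - 1), min(y0 + 1, r - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * plane[y0, x0] +
            wx * (1 - wy) * plane[y0, x1] +
            (1 - wx) * wy * plane[y1, x0] +
            wx * wy * plane[y1, x1])

def triplane_feature(planes: dict, p: np.ndarray) -> np.ndarray:
    """Aggregate features for a 3D point p = (x, y, z) in [-1, 1]^3."""
    x, y, z = p
    return (bilinear_sample(planes["xy"], x, y) +
            bilinear_sample(planes["xz"], x, z) +
            bilinear_sample(planes["yz"], y, z))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    planes = {k: rng.standard_normal((256, 256, 32)).astype(np.float32)
              for k in ("xy", "xz", "yz")}
    feat = triplane_feature(planes, np.array([0.1, -0.3, 0.5]))
    print(feat.shape)   # (32,) -- fed to a small MLP decoder in EG3D
```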
- DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage, highly efficient diffusion-based framework tailored for high-fidelity facial appearance editing.
For high-fidelity query attribute transfer, we adopt Space-sensitive Physical Customization (SPC), which ensures fidelity and generalization.
To preserve source attributes, we introduce Region-responsive Semantic Composition (RSC), a module guided to learn decoupled source-related features, thereby better preserving identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z)
- SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields [92.14328581392633]
We introduce a novel fine-grained interactive 3D segmentation and editing algorithm with radiance fields, which we refer to as SERF.
Our method entails creating a neural mesh representation by integrating multi-view algorithms with pre-trained 2D models.
Building upon this representation, we introduce a novel surface rendering technique that preserves local information and is robust to deformation.
arXiv Detail & Related papers (2023-12-26T02:50:42Z)
- IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis [38.517819699560945]
Our system consists of three major components: (1) a 3D-semantics-aware generative model that produces view-consistent, disentangled face images and semantic masks; (2) a hybrid GAN inversion approach that initializes latent codes from the semantic and texture encoders and further optimizes them for faithful reconstruction (sketched after this entry); and (3) a canonical editor that enables efficient manipulation of semantic masks in the canonical view and produces high-quality editing results.
arXiv Detail & Related papers (2022-05-31T03:35:44Z)
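The hybrid GAN inversion idea in the IDE-3D entry above can be sketched as an encoder prediction followed by per-image latent optimization. Both networks below are toy stand-ins, assumed for illustration; the real system uses semantic/texture encoders and a 3D-aware generator.

```python
# A minimal sketch of hybrid inversion: fast encoder initialization, then
# slower gradient-based refinement of the latent code. Toy networks only.
import torch

torch.manual_seed(0)
generator = torch.nn.Sequential(torch.nn.Linear(64, 256), torch.nn.Tanh())
encoder = torch.nn.Sequential(torch.nn.Linear(256, 64))
target = torch.rand(1, 256)                      # "real image" stand-in

# Stage 1: encoder initialization (fast, approximate).
with torch.no_grad():
    latent = encoder(target)

# Stage 2: per-image latent optimization (slower, more faithful).
latent = latent.clone().requires_grad_(True)
opt = torch.optim.Adam([latent], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(generator(latent), target)
    loss.backward()
    opt.step()
print(f"final reconstruction loss: {loss.item():.4f}")
```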
- Pixel Sampling for Style Preserving Face Pose Editing [53.14006941396712]
We present a novel two-stage approach that casts face pose manipulation as a face inpainting task.
By selectively sampling pixels from the input face and slightly adjusting their relative locations, the editing result faithfully keeps both the identity information and the image style unchanged.
With 3D facial landmarks as guidance, our method can manipulate face pose in three degrees of freedom, i.e., yaw, pitch, and roll, enabling more flexible face pose editing; a rotation-matrix sketch follows this entry.
arXiv Detail & Related papers (2021-06-14T11:29:29Z)
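The three-degree-of-freedom pose control named in the entry above reduces to composing yaw, pitch, and roll rotations applied to the 3D landmarks. A minimal sketch follows, assuming a common Z-Y-X Euler convention rather than the paper's exact parameterization.

```python
# A minimal sketch of yaw/pitch/roll pose parameterization for 3D landmarks.
import numpy as np

def pose_rotation(yaw: float, pitch: float, roll: float) -> np.ndarray:
    """Rotation matrix from Euler angles in radians: R = Rz(roll) @ Ry(yaw) @ Rx(pitch)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])   # pitch about x
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])   # yaw about y
    rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll about z
    return rz @ ry @ rx

if __name__ == "__main__":
    landmarks = np.random.rand(68, 3)            # 68-point 3D landmark stand-in
    r = pose_rotation(np.deg2rad(15), np.deg2rad(-5), 0.0)
    rotated = landmarks @ r.T
    print(rotated.shape, np.allclose(r @ r.T, np.eye(3)))   # orthonormality check
```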
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with a few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
Unlike prior work, which primarily focuses on qualitative evaluation, we propose quantitative evaluation strategies for measuring controllable editing performance; a toy scoring protocol is sketched after this entry.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
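A toy version of the kind of quantitative protocol the entry above argues for: score an edit by (a) the change a pretrained attribute classifier detects and (b) identity-embedding similarity between the original and edited images. Both models below are random stand-ins, assumed for illustration; a real protocol would plug in a trained attribute predictor and a face-recognition network.

```python
# A minimal sketch of a quantitative editing-evaluation protocol. Toy models.
import numpy as np

H, W = 64, 64
_rng = np.random.default_rng(0)
_PROJ = _rng.standard_normal((128, H * W * 3)).astype(np.float32)  # fixed "identity net"

def attribute_score(image: np.ndarray) -> float:
    """Stand-in attribute classifier: probability the target attribute is present."""
    return float(1.0 / (1.0 + np.exp(-(image.mean() - 0.5) * 10)))

def identity_embedding(image: np.ndarray) -> np.ndarray:
    """Stand-in identity encoder: fixed linear projection, L2-normalized."""
    e = _PROJ @ image.reshape(-1)
    return e / np.linalg.norm(e)

def evaluate_edit(original: np.ndarray, edited: np.ndarray) -> dict:
    return {
        # Did the intended attribute actually change?
        "attribute_delta": attribute_score(edited) - attribute_score(original),
        # Was identity preserved? (cosine similarity of embeddings)
        "identity_sim": float(identity_embedding(original) @ identity_embedding(edited)),
    }

if __name__ == "__main__":
    orig = _rng.random((H, W, 3)).astype(np.float32)
    edit = np.clip(orig + 0.1, 0.0, 1.0)                 # toy "edit"
    print(evaluate_edit(orig, edit))
```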
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity; a minimal version of such a term is sketched after this entry.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
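One simple way to realize an identity preservation energy in the spirit of the PIE entry above is to penalize the drop in cosine similarity between face embeddings before and after editing. The embedding network below is a toy stand-in for a pretrained recognition model, assumed for illustration.

```python
# A minimal sketch of an identity-preservation energy term. Toy embedding net.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
id_net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))

def identity_energy(original: torch.Tensor, edited: torch.Tensor) -> torch.Tensor:
    """1 - cosine similarity of identity embeddings; 0 when identity is intact."""
    e0 = F.normalize(id_net(original), dim=-1)
    e1 = F.normalize(id_net(edited), dim=-1)
    return 1.0 - (e0 * e1).sum(dim=-1).mean()

orig = torch.rand(1, 3, 32, 32)
edit = (orig + 0.05 * torch.randn_like(orig)).clamp(0, 1)
print(float(identity_energy(orig, edit)))   # small value -> identity preserved
```

In a full optimization, this term would be added to the reconstruction and edit losses with a weighting coefficient, pulling the edited latent back toward the subject's identity.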
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.