Semantics-Guided Object Removal for Facial Images: with Broad
Applicability and Robust Style Preservation
- URL: http://arxiv.org/abs/2209.14479v1
- Date: Thu, 29 Sep 2022 00:09:12 GMT
- Title: Semantics-Guided Object Removal for Facial Images: with Broad
Applicability and Robust Style Preservation
- Authors: Jookyung Song, Yeonjin Chang, Seonguk Park, Nojun Kwak
- Abstract summary: Object removal and image inpainting in facial images is a task in which objects that occlude a facial image are specifically targeted, removed, and replaced by a properly reconstructed facial image.
Two different approaches, one based on a U-net and the other on a modulated generator, have been widely endorsed for this task, each for its unique advantages and despite its innate disadvantages.
Here, we propose the Semantics-Guided Inpainting Network (SGIN), a modification of the modulated generator, aiming to take advantage of its advanced generative capability while preserving the high-fidelity details of the original image.
- Score: 29.162655333387452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object removal and image inpainting in facial images is a task in which
objects that occlude a facial image are specifically targeted, removed, and
replaced by a properly reconstructed facial image. Two different approaches,
one based on a U-net and the other on a modulated generator, have been widely
endorsed for this task, each for its unique advantages and despite its innate
disadvantages. The U-net, a conventional approach for conditional GANs,
retains the fine details of unmasked regions, but the style of the
reconstructed region is inconsistent with the rest of the original image, and
it works robustly only when the occluding object is small enough. In
contrast, the modulated generative approach can deal with a larger occluded
area and provides a more consistent style, yet it usually misses most of the
detailed features. This trade-off calls for a model that can be applied to a
mask of any size while maintaining a consistent style and preserving the
minute details of facial features. Here, we propose the Semantics-Guided
Inpainting Network (SGIN), a modification of the modulated generator that
aims to exploit its advanced generative capability while preserving the
high-fidelity details of the original image. By using the guidance of a
semantic map, our model can also manipulate facial features, which gives
direction to the one-to-many inpainting problem and improves practicability.
Related papers
- ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification [60.73617868629575]
The misuse of deep learning-based facial manipulation poses a potential threat to civil rights.
To prevent this fraud at its source, proactive defense technologies have been proposed to disrupt the manipulation process.
We propose a novel universal framework for combating facial manipulation, called ID-Guard.
arXiv Detail & Related papers (2024-09-20T09:30:08Z)
- Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Ours is a relatively unified approach and so it is resilient to errors in other off-the-shelf models.
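As a minimal sketch of what the described mask-shuffling augmentation could look like during inpainting training (the function name and tensor layout are assumptions, not the authors' code):

```python
# Hypothetical sketch: re-pair masks with other images in the batch so the
# inpainting model cannot rely on a fixed mask/source correspondence.
import torch

def shuffle_masks(images: torch.Tensor, masks: torch.Tensor):
    """images: (B, 3, H, W); masks: (B, 1, H, W) with 1 marking the region to inpaint."""
    perm = torch.randperm(masks.size(0))     # re-pair each image with another sample's mask
    shuffled = masks[perm]
    masked_inputs = images * (1 - shuffled)  # hide the region the model must fill
    return masked_inputs, shuffled
```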
arXiv Detail & Related papers (2024-09-11T13:43:53Z)
- Obtaining Favorable Layouts for Multiple Object Generation [50.616875565173274]
Large-scale text-to-image models can generate high-quality and diverse images based on textual prompts.
However, the existing state-of-the-art diffusion models face difficulty when generating images that involve multiple subjects.
We propose a novel approach based on a guiding principle. We allow the diffusion model to initially propose a layout, and then we rearrange the layout grid.
This is achieved by enforcing cross-attention maps (XAMs) to adhere to proposed masks and by migrating pixels from latent maps to new locations determined by us.
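A rough, hypothetical sketch of the general technique of making a cross-attention map (XAM) adhere to a proposed mask is to damp attention outside the mask and renormalize; the shapes and damping scheme below are assumptions, not the paper's exact procedure.

```python
# Illustrative only: suppress a token's cross-attention outside its target mask.
import torch

def constrain_xam(attn: torch.Tensor, masks: torch.Tensor, strength: float = 0.8):
    """attn: (heads, H*W, tokens) attention weights; masks: (tokens, H, W) binary floats."""
    n_tokens = masks.shape[0]
    flat = masks.reshape(n_tokens, -1).T.unsqueeze(0)         # (1, H*W, tokens)
    damped = attn * (flat + (1.0 - flat) * (1.0 - strength))  # damp attention off-mask
    return damped / damped.sum(dim=-1, keepdim=True).clamp_min(1e-8)  # renormalize per query
```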
arXiv Detail & Related papers (2024-05-01T18:07:48Z)
- When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation [60.305112612629465]
Text-to-image diffusion models have excelled in producing diverse, high-quality, and photo-realistic images.
We present a novel use of the extended StyleGAN embedding space $\mathcal{W}_+$ to achieve enhanced identity preservation and disentanglement for diffusion models.
Our method adeptly generates personalized text-to-image outputs that are not only compatible with prompt descriptions but also amenable to common StyleGAN editing directions.
arXiv Detail & Related papers (2023-11-29T09:05:14Z)
- DIFAI: Diverse Facial Inpainting using StyleGAN Inversion [18.400846952014188]
We propose a novel framework for diverse facial inpainting exploiting the embedding space of StyleGAN.
Our framework employs pSp encoder and SeFa algorithm to identify semantic components of the StyleGAN embeddings and feed them into our proposed SPARN decoder.
arXiv Detail & Related papers (2023-01-20T06:51:34Z)
- Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN [88.62422914645066]
We present an algorithm for re-rendering a person from a single image under arbitrary poses.
Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image.
We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
arXiv Detail & Related papers (2021-09-13T17:59:33Z)
- One-shot domain adaptation for semantic face editing of real world images using StyleALAE [7.541747299649292]
StyleALAE is a latent-space-based autoencoder that can generate photo-realistic images of high quality.
Our work ensures that the identity of the reconstructed image is the same as the given input image.
We further generate semantic modifications over the reconstructed image by using the latent space of the pre-trained StyleALAE model.
arXiv Detail & Related papers (2021-08-31T14:32:18Z)
- Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection [65.92058628082322]
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
This paper presents a novel Learning to Aggregate and Personalize framework for unsupervised robust 3D face modeling.
arXiv Detail & Related papers (2021-06-15T03:10:17Z)
- S2FGAN: Semantically Aware Interactive Sketch-to-Face Translation [11.724779328025589]
This paper proposes a sketch-to-image generation framework called S2FGAN.
We employ two latent spaces to control the face appearance and adjust the desired attributes of the generated face.
Our method successfully outperforms state-of-the-art methods on attribute manipulation by exploiting greater control of attribute intensity.
arXiv Detail & Related papers (2020-11-30T13:42:39Z)
- Enhanced Residual Networks for Context-based Image Outpainting [0.0]
Deep models struggle to understand context and extrapolation through retained information.
Current models use generative adversarial networks to generate results which lack localized image feature consistency and appear fake.
We propose two methods to improve this issue: the use of a local and global discriminator, and the addition of residual blocks within the encoding section of the network.
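A hedged sketch of the local-plus-global discriminator idea (the residual-block change is omitted): a global critic scores the full outpainted image while a local critic scores only the newly generated region. The architecture and loss below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: one global and one local critic sharing a small PatchGAN design.
import torch
import torch.nn as nn
import torch.nn.functional as F

def patch_critic(in_ch: int = 3) -> nn.Sequential:
    """Tiny PatchGAN-style critic reused for both the global and local branches."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(128, 1, 4, padding=1))

def generator_adv_loss(global_d, local_d, fake_full, fake_region):
    """Non-saturating GAN loss: the generator tries to fool both critics."""
    g = F.softplus(-global_d(fake_full)).mean()
    l = F.softplus(-local_d(fake_region)).mean()
    return g + l  # relative weighting is a free hyperparameter
```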
arXiv Detail & Related papers (2020-05-14T05:14:26Z)
- Domain Embedded Multi-model Generative Adversarial Networks for Image-based Face Inpainting [44.598234654270584]
We present a domain embedded multi-model generative adversarial model for inpainting of face images with large cropped regions.
Experiments on both CelebA and CelebA-HQ face datasets demonstrate that our proposed approach achieved state-of-the-art performance.
arXiv Detail & Related papers (2020-02-05T17:36:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.