TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable
Facial Editing
- URL: http://arxiv.org/abs/2203.17266v1
- Date: Thu, 31 Mar 2022 17:58:13 GMT
- Title: TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable
Facial Editing
- Authors: Yanbo Xu, Yueqin Yin, Liming Jiang, Qianyi Wu, Chengyao Zheng, Chen
Change Loy, Bo Dai, Wayne Wu
- Abstract summary: We propose TransEditor, a novel Transformer-based framework to enhance interaction in a dual-space GAN for more controllable editing.
Experiments demonstrate the superiority of the proposed framework in image quality and editing capability, suggesting the effectiveness of TransEditor for highly controllable facial editing.
- Score: 110.82128064489237
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances like StyleGAN have promoted the growth of controllable facial
editing. To address its core challenge of attribute decoupling in a single
latent space, attempts have been made to adopt dual-space GANs for better
disentanglement of style and content representations. Nonetheless, these
methods still struggle to produce plausible editing results with high
controllability, especially for complicated attributes. In this study, we
highlight the importance of interaction in a dual-space GAN for more
controllable editing. We propose TransEditor, a novel Transformer-based
framework to enhance such interaction. In addition, we develop a new
dual-space editing and inversion strategy that provides additional editing
flexibility.
Extensive experiments demonstrate the superiority of the proposed framework in
image quality and editing capability, suggesting the effectiveness of
TransEditor for highly controllable facial editing.
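The core architectural idea — two latent spaces whose codes interact through Transformer attention before conditioning the generator — can be sketched compactly. The PyTorch module below is a minimal illustration of that interaction pattern; the class name `CrossSpaceInteraction`, the residual wiring, and all dimensions are assumptions for the sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossSpaceInteraction(nn.Module):
    """Hypothetical sketch: let a content code attend to a style code
    (and vice versa) with standard multi-head attention, so the two
    latent spaces interact before conditioning a generator."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.c_from_s = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.s_from_c = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, content, style):
        # content, style: (batch, tokens, dim) sequences of latent codes
        c, _ = self.c_from_s(query=content, key=style, value=style)
        s, _ = self.s_from_c(query=style, key=content, value=content)
        return content + c, style + s  # residual interaction

content = torch.randn(2, 16, 512)  # e.g. per-layer content codes
style = torch.randn(2, 16, 512)    # e.g. per-layer style codes
c_out, s_out = CrossSpaceInteraction()(content, style)
print(c_out.shape, s_out.shape)    # torch.Size([2, 16, 512]) twice
```

In the actual framework the interacted codes would go on to modulate a StyleGAN-like synthesis network; the sketch stops at the interaction step.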
Related papers
- Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing [60.730661748555214]
We introduce Task-Oriented Diffusion Inversion (TODInv), a novel framework that inverts and edits real images tailored to specific editing tasks.
TODInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability.
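The summary names only "reciprocal optimization" between inversion and editing; one plausible reading is alternating a reconstruction term with an editability regularizer on the inverted latent. The loop below is a generic, hypothetical sketch of that pattern — the toy generator, the prior-distance proxy for editability, and all hyperparameters are stand-ins, not TODInv's actual procedure.

```python
import torch

def invert_with_editability(generator, target, prior_mean, steps=200, lam=0.1):
    """Hypothetical alternating objective: fit the latent to the image
    (inversion) while pulling it toward a well-behaved prior region
    (a crude proxy for preserving editability)."""
    z = prior_mean.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=0.01)
    for _ in range(steps):
        opt.zero_grad()
        recon = ((generator(z) - target) ** 2).mean()   # fidelity term
        editable = ((z - prior_mean) ** 2).mean()       # editability proxy
        (recon + lam * editable).backward()
        opt.step()
    return z.detach()

# toy stand-ins: a linear "generator" and a random target image
gen = torch.nn.Linear(64, 3 * 8 * 8)
target = torch.randn(1, 3 * 8 * 8)
z = invert_with_editability(lambda z: gen(z), target, prior_mean=torch.zeros(1, 64))
print(z.shape)  # torch.Size([1, 64])
```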
arXiv Detail & Related papers (2024-08-23T22:16:34Z)
- The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing [3.58736715327935]
We introduce StyleFeatureEditor, a novel method that enables editing in both w-latents and F-latents.
We also present a new training pipeline specifically designed to train our model to accurately edit F-latents.
Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality.
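As a rough picture of what editing in both spaces means — a global latent w for coarse attributes plus an intermediate feature tensor F for fine detail — consider the toy two-latent generator below. All module names, shapes, and edit magnitudes are hypothetical; a real StyleGAN-based pipeline is far larger.

```python
import torch
import torch.nn as nn

class ToyTwoLatentGenerator(nn.Module):
    """Hypothetical sketch: synthesis conditioned on a global latent w
    and an intermediate feature map F (as in w-/F-latent editing)."""
    def __init__(self, w_dim=512, feat_ch=64):
        super().__init__()
        self.to_feat = nn.Linear(w_dim, feat_ch * 4 * 4)
        self.head = nn.Conv2d(feat_ch, 3, 3, padding=1)

    def forward(self, w, f_override=None):
        f = self.to_feat(w).view(-1, 64, 4, 4)
        if f_override is not None:   # F-latent path: inject edited features
            f = f_override
        return self.head(f)

gen = ToyTwoLatentGenerator()
w = torch.randn(1, 512)
img = gen(w)                                    # coarse edit: move w only
w_edit = w + 0.5 * torch.randn_like(w)
f = gen.to_feat(w_edit).view(-1, 64, 4, 4)
img_detail = gen(w_edit, f_override=f + 0.1 * torch.randn_like(f))  # detail edit in F
print(img.shape, img_detail.shape)  # torch.Size([1, 3, 4, 4]) twice
```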
arXiv Detail & Related papers (2024-06-15T11:28:32Z)
- DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing [66.43179841884098]
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years.
We propose DiffEditor to rectify two weaknesses in existing diffusion-based image editing.
Our method can efficiently achieve state-of-the-art performance on various fine-grained image editing tasks.
arXiv Detail & Related papers (2024-02-04T18:50:29Z)
- Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image Guidance [15.130419159003816]
We present a versatile image editing framework capable of executing both rigid and non-rigid edits.
We leverage a dual-path injection scheme to handle diverse editing scenarios.
We introduce an integrated self-attention mechanism for fusion of appearance and structural information.
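A common way to realize such fusion is cross-attention in which structure features form the queries and appearance features supply the keys and values. The module below sketches that general pattern under assumed shapes; it illustrates the technique, not the paper's exact mechanism.

```python
import torch
import torch.nn as nn

class AppearanceStructureFusion(nn.Module):
    """Hypothetical sketch of attention-based fusion: structure tokens
    query appearance tokens, so geometry comes from one path and
    texture/appearance from the other."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, structure, appearance):
        fused, _ = self.attn(query=structure, key=appearance, value=appearance)
        return structure + fused  # keep the layout, inject appearance

structure = torch.randn(1, 64, 256)   # e.g. flattened features of the edit layout
appearance = torch.randn(1, 64, 256)  # e.g. features of the reference image
print(AppearanceStructureFusion()(structure, appearance).shape)  # torch.Size([1, 64, 256])
```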
arXiv Detail & Related papers (2024-01-04T08:21:30Z)
- HyperEditor: Achieving Both Authenticity and Cross-Domain Capability in Image Editing via Hypernetworks [5.9189325968909365]
We propose an innovative image editing method called HyperEditor, which uses weight factors generated by hypernetworks to reassign the weights of the pretrained StyleGAN2 generator.
Guided by CLIP's cross-modal image-text semantic alignment, this innovative approach enables us to simultaneously accomplish authentic attribute editing and cross-domain style transfer.
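The stated mechanism — a hypernetwork emits factors that reassign pretrained generator weights — can be illustrated with per-output-channel multiplicative factors applied to one conv layer. Everything below (the MLP hypernetwork, the factor parameterization near 1, the stand-in CLIP direction) is a hypothetical sketch, not HyperEditor's released code.

```python
import torch
import torch.nn as nn

class WeightFactorHypernet(nn.Module):
    """Hypothetical sketch: map an edit embedding (e.g. a CLIP text
    direction) to multiplicative per-output-channel factors that
    reassign a pretrained conv layer's weights."""
    def __init__(self, embed_dim=512, out_ch=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(embed_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_ch))

    def forward(self, edit_embed, conv_weight):
        # factors start near 1 so a null edit leaves the generator intact
        factors = 1.0 + 0.1 * torch.tanh(self.mlp(edit_embed))
        return conv_weight * factors.view(-1, 1, 1, 1)

conv = nn.Conv2d(32, 64, 3, padding=1)   # stands in for one generator layer
edit = torch.randn(512)                  # stand-in for a CLIP-derived direction
new_w = WeightFactorHypernet()(edit, conv.weight)
print(new_w.shape)  # torch.Size([64, 32, 3, 3])
```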
arXiv Detail & Related papers (2023-12-21T02:39:53Z)
- Latent Space Editing in Transformer-Based Flow Matching [53.75073756305241]
Flow Matching with a transformer backbone offers the potential for scalable and high-quality generative modeling.
We introduce an editing space, $u$-space, that can be manipulated in a controllable, accumulative, and composable manner.
Lastly, we put forth a straightforward yet powerful method for achieving fine-grained and nuanced editing using text prompts.
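Accumulative and composable manipulation can be pictured as named edit directions that are scaled, summed onto a base code, and individually removable. The sketch below illustrates that contract in plain PyTorch; the `EditSpace` class and its API are hypothetical, not the paper's $u$-space implementation.

```python
import torch

class EditSpace:
    """Hypothetical sketch of an accumulative, composable edit space:
    named directions are scaled and summed onto a base code, so edits
    can be stacked and later removed independently."""
    def __init__(self, base):
        self.base = base
        self.edits = {}                      # name -> (direction, strength)

    def apply(self, name, direction, strength):
        self.edits[name] = (direction, strength)

    def remove(self, name):
        self.edits.pop(name, None)

    def code(self):
        u = self.base.clone()
        for direction, strength in self.edits.values():
            u = u + strength * direction     # accumulative composition
        return u

space = EditSpace(torch.zeros(512))
space.apply("smile", torch.randn(512), 1.5)
space.apply("age", torch.randn(512), -0.8)
space.remove("age")                          # edits compose and undo cleanly
print(space.code().shape)  # torch.Size([512])
```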
arXiv Detail & Related papers (2023-12-17T21:49:59Z)
- Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization [21.8454418337306]
We propose Plasticine3D, a novel text-guided controlled 3D editing pipeline that can perform 3D non-rigid editing.
Our work divides the editing process into a geometry editing stage and a texture editing stage to achieve separate control of structure and appearance.
For the purpose of fine-grained control, we propose Embedding-Fusion (EF) to blend the original characteristics with the editing objectives in the embedding space.
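At its simplest, blending original characteristics with an editing objective in embedding space is an interpolation between two embeddings. The snippet below sketches that idea; the shapes (a 77x768 text-encoder embedding) and the blend weight are assumptions, and the real EF module is likely more elaborate.

```python
import torch

def embedding_fusion(orig_embed, edit_embed, alpha=0.3):
    """Hypothetical sketch of Embedding-Fusion-style blending: keep the
    original characteristics while steering toward the edit objective
    by interpolating in embedding space."""
    return (1.0 - alpha) * orig_embed + alpha * edit_embed

orig = torch.randn(77, 768)   # e.g. text-encoder embedding of the source
edit = torch.randn(77, 768)   # embedding of the editing objective
fused = embedding_fusion(orig, edit, alpha=0.3)
print(fused.shape)  # torch.Size([77, 768])
```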
arXiv Detail & Related papers (2023-12-15T09:01:54Z)
- SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds [73.91114735118298]
Shap-Editor is a novel feed-forward 3D editing framework.
We demonstrate that direct 3D editing in this space is possible and efficient by building a feed-forward editor network.
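"Feed-forward" here means a single network pass maps an asset's latent plus an instruction embedding to the edited latent, with no per-asset optimization. The sketch below illustrates that residual-editor pattern; the dimensions and the Shap-E-style latent are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class FeedForwardLatentEditor(nn.Module):
    """Hypothetical sketch of a feed-forward editor: one pass maps
    (asset latent, instruction embedding) to an edited latent — no
    per-asset optimization loop."""
    def __init__(self, latent_dim=1024, text_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + text_dim, 1024), nn.ReLU(),
            nn.Linear(1024, latent_dim))

    def forward(self, latent, instruction):
        delta = self.net(torch.cat([latent, instruction], dim=-1))
        return latent + delta  # predict an edit residual in one shot

latent = torch.randn(1, 1024)       # stand-in for a 3D asset latent
instruction = torch.randn(1, 512)   # stand-in for an encoded text instruction
print(FeedForwardLatentEditor()(latent, instruction).shape)  # torch.Size([1, 1024])
```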
arXiv Detail & Related papers (2023-12-14T18:59:06Z)
- StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing [86.92711729969488]
We exploit the remarkable capacity of pretrained diffusion models for image editing.
Existing approaches either finetune the model or invert the image in the latent space of the pretrained model.
They suffer from two problems: unsatisfying results in selected regions, and unexpected changes in non-selected regions.
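Prompt-embedding inversion freezes the diffusion model and optimizes only the text-side embedding until the denoiser reconstructs the real image. The loop below sketches that idea with a toy linear "denoiser" and a toy noising schedule; none of it reflects StyleDiffusion's actual objective or schedule.

```python
import torch

def invert_prompt_embedding(denoiser, image_latent, init_embed, steps=100):
    """Hypothetical sketch of prompt-embedding inversion: freeze the
    diffusion model and optimize only the text embedding so the
    denoiser reconstructs the real image's latent."""
    embed = init_embed.clone().requires_grad_(True)
    opt = torch.optim.Adam([embed], lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        t = torch.rand(1)                           # random timestep in [0, 1)
        noise = torch.randn_like(image_latent)
        noisy = (1 - t) * image_latent + t * noise  # toy noising schedule
        loss = ((denoiser(noisy, t, embed) - noise) ** 2).mean()
        loss.backward()
        opt.step()
    return embed.detach()

# toy stand-ins: a linear "denoiser" conditioned on the embedding
lin = torch.nn.Linear(64 + 1 + 32, 64)
denoiser = lambda x, t, e: lin(torch.cat([x, t.expand(x.shape[0], 1), e], dim=-1))
embed = invert_prompt_embedding(denoiser, torch.randn(1, 64), torch.zeros(1, 32))
print(embed.shape)  # torch.Size([1, 32])
```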
arXiv Detail & Related papers (2023-03-28T00:16:45Z)
- What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion [3.9041061259639136]
Inversion methods have focused on adding high-rate information to the generator to refine inversion and editing results from embedded latent codes.
The vital crux is to refine inversion results without degrading editing capability.
We introduce Domain-Specific Hybrid Refinement, which weighs the advantages and disadvantages of two mainstream refinement techniques.
arXiv Detail & Related papers (2023-01-28T09:31:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.