HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
- URL: http://arxiv.org/abs/2310.10651v1
- Date: Mon, 16 Oct 2023 17:59:58 GMT
- Title: HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
- Authors: Tianyi Wei and Dongdong Chen and Wenbo Zhou and Jing Liao and Weiming Zhang and Gang Hua and Nenghai Yu
- Abstract summary: HairCLIP is the first work that enables hair editing based on text descriptions or reference images.
In this paper, we propose HairCLIPv2, aiming to support all the aforementioned interactions with one unified framework.
The key idea is to convert all the hair editing tasks into hair transfer tasks, with editing conditions converted into different proxies accordingly.
- Score: 94.39417893972262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hair editing has made tremendous progress in recent years. Early hair editing
methods use well-drawn sketches or masks to specify the editing conditions.
Even though they can enable very fine-grained local control, such interaction
modes are inefficient for the editing conditions that can be easily specified
by language descriptions or reference images. Thanks to the recent breakthrough
of cross-modal models (e.g., CLIP), HairCLIP is the first work that enables
hair editing based on text descriptions or reference images. However, such
text-driven and reference-driven interaction modes make HairCLIP unable to
support fine-grained controls specified by sketch or mask. In this paper, we
propose HairCLIPv2, aiming to support all the aforementioned interactions with
one unified framework. Simultaneously, it improves upon HairCLIP with better
preservation of irrelevant attributes (e.g., identity, background) and support
for unseen text descriptions. The key idea is to convert all the hair editing tasks
into hair transfer tasks, with editing conditions converted into different
proxies accordingly. The editing effects are added upon the input image by
blending the corresponding proxy features within the hairstyle or hair color
feature spaces. Besides the unprecedented user interaction mode support,
quantitative and qualitative experiments demonstrate the superiority of
HairCLIPv2 in terms of editing effects, irrelevant attribute preservation and
visual naturalness. Our code is available at
https://github.com/wty-ustc/HairCLIPv2.
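To make the proxy feature blending idea above concrete, here is a minimal sketch, assuming a StyleGAN-like generator whose intermediate feature maps can be mixed spatially under a soft hair mask. All names, shapes, and the random tensors are illustrative placeholders, not the actual HairCLIPv2 implementation (see the repository above for that).

```python
import torch
import torch.nn.functional as F

def blend_proxy_features(src_feat, proxy_feat, hair_mask):
    """Blend proxy features into the source features inside the hair region.

    src_feat, proxy_feat: (N, C, H, W) intermediate generator feature maps.
    hair_mask:            (N, 1, h, w) soft hair mask with values in [0, 1].
    """
    # Resize the mask to the spatial resolution of the feature maps.
    mask = F.interpolate(hair_mask, size=src_feat.shape[-2:],
                         mode="bilinear", align_corners=False)
    # Keep the source features outside the hair region and take the
    # proxy features inside it.
    return mask * proxy_feat + (1.0 - mask) * src_feat

# Hypothetical usage with random tensors standing in for features extracted
# from a StyleGAN-like generator for the input image and for a proxy
# (obtained from a sketch, mask, text, or reference condition).
src_feat = torch.randn(1, 512, 32, 32)
proxy_feat = torch.randn(1, 512, 32, 32)
hair_mask = torch.rand(1, 1, 256, 256)
print(blend_proxy_features(src_feat, proxy_feat, hair_mask).shape)
```

In this simplified picture, each editing condition is first converted into a proxy, the proxy's generator features are extracted, and only the masked region is replaced, which is also why identity and background outside the hair region stay untouched; hair color edits would blend in a color-related feature space in the same way.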
Related papers
- HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion [43.3852998245422]
We introduce Multi-stage Hairstyle Blend (MHB), which effectively separates control of hair color and hairstyle in the diffusion latent space.
We also train a warping module to align the hair color with the target region.
Our method not only tackles the complexity of multi-color hairstyles but also addresses the challenge of preserving original colors.
arXiv Detail & Related papers (2024-10-29T06:51:52Z)
- An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control [21.624984690721842]
D-Edit is a framework to disentangle the comprehensive image-prompt interaction into several item-prompt interactions.
It builds on pretrained diffusion models whose cross-attention layers are disentangled, and adopts a two-step optimization to build item-prompt associations.
We demonstrate state-of-the-art results in four types of editing operations including image-based, text-based, mask-based editing, and item removal.
arXiv Detail & Related papers (2024-03-07T20:06:29Z)
- Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing the influence of each loss function to be adjusted, we build a flexible editing solution that can be tailored to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits. (A minimal sketch of this weighted-loss setup follows this entry.)
arXiv Detail & Related papers (2023-11-28T15:31:11Z)
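As a rough illustration of the entry above (inference-time optimisation with adjustable loss influences), the sketch below updates a latent code by gradient descent on a user-weighted sum of per-condition losses. The generator and loss terms are toy placeholders, not the paper's actual objectives.

```python
import torch

def optimise_edit(latent, generator, losses, weights, steps=100, lr=0.05):
    """Inference-time editing: update a latent code by gradient descent on a
    user-weighted sum of per-condition losses."""
    latent = latent.clone().requires_grad_(True)
    opt = torch.optim.Adam([latent], lr=lr)
    for _ in range(steps):
        image = generator(latent)
        # Raising a weight strengthens that instruction type; lowering it
        # relaxes the corresponding edit.
        total = sum(weights[k] * fn(image) for k, fn in losses.items())
        opt.zero_grad()
        total.backward()
        opt.step()
    return latent.detach()

# Toy stand-ins so the sketch runs end to end; a real setup would use a
# pretrained image generator and losses derived from the text / pose /
# scribble conditions.
generator = torch.nn.Linear(64, 3 * 16 * 16)
target = torch.zeros(3 * 16 * 16)
losses = {
    "text":     lambda img: (img - target).pow(2).mean(),  # placeholder "match the condition" term
    "scribble": lambda img: img.abs().mean(),              # placeholder regulariser
}
weights = {"text": 1.0, "scribble": 0.1}
edited = optimise_edit(torch.randn(64), generator, losses, weights)
print(edited.shape)
```

The same loop works with any differentiable generator and any mix of condition losses, which is what makes the weighting scheme flexible.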
- DE-Net: Dynamic Text-guided Image Editing Adversarial Networks [82.67199573030513]
We propose a Dynamic Editing Block (DEBlock), which combines spatial- and channel-wise manipulations dynamically for various editing requirements.
Our DE-Net achieves excellent performance and manipulates source images more effectively and accurately.
arXiv Detail & Related papers (2022-06-02T17:20:52Z)
- HairCLIP: Design Your Hair by Text and Reference Image [100.85116679883724]
This paper proposes a new hair editing interaction mode, which enables manipulating hair attributes individually or jointly.
We encode the image and text conditions in a shared embedding space and propose a unified hair editing framework.
With the carefully designed network structures and loss functions, our framework can perform high-quality hair editing. (A sketch of such a shared CLIP embedding space follows this entry.)
arXiv Detail & Related papers (2021-12-09T18:59:58Z)
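Relating to the HairCLIP entry above: a minimal sketch of placing a text condition and a reference image in CLIP's shared embedding space, using the public openai/clip-vit-base-patch32 checkpoint from the transformers library. HairCLIP's mapper networks and losses operate on top of such embeddings and are not reproduced here; the blank reference image is a placeholder.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Public CLIP checkpoint; HairCLIP's own mapper networks sit on top of
# embeddings like these and are not shown here.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

text = ["curly red hair"]             # a text editing condition
image = Image.new("RGB", (224, 224))  # placeholder for a reference hairstyle image

text_inputs = processor(text=text, return_tensors="pt", padding=True)
image_inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)     # shape (1, 512)
    image_emb = model.get_image_features(**image_inputs)  # shape (1, 512)

# Both conditions now live in the same embedding space, so either one can be
# fed to the same downstream editing network.
print(torch.nn.functional.cosine_similarity(text_emb, image_emb).item())
```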
- Talk-to-Edit: Fine-Grained Facial Editing via Dialog [79.8726256912376]
Talk-to-Edit is an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.
Our key insight is to model a continual "semantic field" in the GAN latent space.
Our system generates language feedback by considering both the user request and the current state of the semantic field.
arXiv Detail & Related papers (2021-09-09T17:17:59Z)
- MichiGAN: Multi-Input-Conditioned Hair Image Generation for Portrait Editing [122.82964863607938]
MichiGAN is a novel conditional image generation method for interactive portrait hair manipulation.
We provide user control over every major hair visual factor, including shape, structure, appearance, and background.
We also build an interactive portrait hair editing system that enables straightforward manipulation of hair by projecting intuitive and high-level user inputs.
arXiv Detail & Related papers (2020-10-30T17:59:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.