Portrait Diffusion: Training-free Face Stylization with
Chain-of-Painting
- URL: http://arxiv.org/abs/2312.02212v1
- Date: Sun, 3 Dec 2023 06:48:35 GMT
- Title: Portrait Diffusion: Training-free Face Stylization with
Chain-of-Painting
- Authors: Jin Liu, Huaibo Huang, Chao Jin, Ran He
- Abstract summary: Face stylization refers to the transformation of a face into a specific portrait style.
Current methods require the use of example-based adaptation approaches to fine-tune pre-trained generative models.
This paper proposes a training-free face stylization framework, named Portrait Diffusion.
- Score: 64.43760427752532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face stylization refers to the transformation of a face into a specific
portrait style. However, current methods rely on example-based adaptation
approaches to fine-tune pre-trained generative models, which demands substantial
time and storage and fails to achieve detailed style transformation. This paper
proposes a training-free face stylization framework,
named Portrait Diffusion. This framework leverages off-the-shelf text-to-image
diffusion models, eliminating the need for fine-tuning on specific examples.
Specifically, the content and style images are first inverted into latent
codes. Then, during image reconstruction using the corresponding latent code,
the content and style features in the attention space are delicately blended
through a modified self-attention operation called Style Attention Control.
Additionally, a Chain-of-Painting method is proposed for the gradual redrawing
of unsatisfactory areas from rough adjustments to fine-tuning. Extensive
experiments validate the effectiveness of our Portrait Diffusion method and
demonstrate the superiority of Chain-of-Painting in achieving precise face
stylization. Code will be released at
https://github.com/liujin112/PortraitDiffusion.
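
To make the attention-space blending concrete, the sketch below shows one plausible reading of such a modified self-attention step in PyTorch: queries are taken from the content pathway, while keys and values are mixed between the content and style pathways inside a frozen attention layer. The function name, the blend weight, and the exact mixing rule are assumptions made for illustration and are not taken from the released Portrait Diffusion code.

    import torch
    import torch.nn.functional as F

    def blended_self_attention(content_feats, style_feats, w_q, w_k, w_v,
                               num_heads=8, style_weight=0.5):
        """Hypothetical attention-space blend of content and style features.

        content_feats / style_feats: (batch, tokens, dim) hidden states taken
        from the same self-attention layer while reconstructing the content
        and style latent codes. The mixing rule below is an assumption, not
        the paper's exact Style Attention Control formulation.
        """
        # Queries come from the content path so the spatial layout is preserved.
        q = content_feats @ w_q

        # Keys and values are a convex mix of content and style projections,
        # which injects style statistics without any fine-tuning of the model.
        k = (1.0 - style_weight) * (content_feats @ w_k) + style_weight * (style_feats @ w_k)
        v = (1.0 - style_weight) * (content_feats @ w_v) + style_weight * (style_feats @ w_v)

        def split_heads(x):
            b, t, d = x.shape
            return x.view(b, t, num_heads, d // num_heads).transpose(1, 2)

        q, k, v = (split_heads(x) for x in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v)  # standard attention
        b, h, t, dh = out.shape
        return out.transpose(1, 2).reshape(b, t, h * dh)

    if __name__ == "__main__":
        d = 320  # illustrative channel width
        content = torch.randn(1, 64, d)
        style = torch.randn(1, 64, d)
        w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
        print(blended_self_attention(content, style, w_q, w_k, w_v).shape)

Under the same reading, the Chain-of-Painting step would amount to repeating this blended reconstruction over user-selected regions, moving from coarse masks to finer ones; that outer loop is not shown here.
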
Related papers
- PS-StyleGAN: Illustrative Portrait Sketching using Attention-Based Style Adaptation [0.0]
Portrait sketching involves capturing identity-specific attributes of a real face with abstract lines and shades.
This paper introduces Portrait Sketching StyleGAN (PS-StyleGAN), a style transfer approach tailored for portrait sketch synthesis.
We leverage the semantic W+ latent space of StyleGAN to generate portrait sketches, allowing us to make meaningful edits, like pose and expression alterations, without compromising identity.
arXiv Detail & Related papers (2024-08-31T04:22:45Z) - ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z) - InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation [5.364489068722223]
The concept of style is inherently underdetermined, encompassing a multitude of elements such as color, material, atmosphere, design, and structure.
Inversion-based methods are prone to style degradation, often resulting in the loss of fine-grained details.
Adapter-based approaches frequently require meticulous weight tuning for each reference image to balance style intensity and text controllability.
arXiv Detail & Related papers (2024-04-03T13:34:09Z) - Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer [19.355744690301403]
We introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model without any optimization.
Our experimental results demonstrate that the proposed method surpasses both conventional and diffusion-based state-of-the-art style transfer baselines.
arXiv Detail & Related papers (2023-12-11T09:53:12Z) - Improving Diffusion-based Image Translation using Asymmetric Gradient
Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance.
Our model's adaptability allows it to be implemented with both image-fusion and latent-diffusion models.
Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
arXiv Detail & Related papers (2023-06-07T12:56:56Z) - Realtime Fewshot Portrait Stylization Based On Geometric Alignment [32.224973317381426]
This paper presents a portrait stylization method designed for real-time mobile applications with limited style examples available.
Previous learning-based stylization methods suffer from the geometric and semantic gaps between the portrait domain and the style domain.
Based on the geometric prior of human facial attributes, we propose to utilize geometric alignment to tackle this issue.
arXiv Detail & Related papers (2022-11-28T16:53:19Z) - DiffStyler: Controllable Dual Diffusion for Text-Driven Image
Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a content-image-based learnable noise on which the reverse denoising process is built, enabling the stylization results to better preserve the structural information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z) - Learning Diverse Tone Styles for Image Retouching [73.60013618215328]
We propose to learn diverse image retouching with normalizing flow-based architectures.
A joint-training pipeline is composed of a style encoder, a conditional RetouchNet, and the image tone style normalizing flow (TSFlow) module.
Our proposed method performs favorably against state-of-the-art methods and is effective in generating diverse results.
arXiv Detail & Related papers (2022-07-12T09:49:21Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.