Related papers: Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer

URL: http://arxiv.org/abs/2312.09008v2
Date: Wed, 20 Mar 2024 12:39:52 GMT
Title: Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer
Authors: Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo
Abstract summary: We introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model without any optimization. Our experimental results demonstrate that our proposed method surpasses state-of-the-art methods in both conventional and diffusion-based style transfer baselines.
Score: 19.355744690301403
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Despite the impressive generative capabilities of diffusion models, existing diffusion model-based style transfer methods either require inference-stage optimization (e.g. fine-tuning or textual inversion of style), which is time-consuming, or fail to leverage the generative ability of large-scale diffusion models. To address these issues, we introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model without any optimization. Specifically, we manipulate the features of self-attention layers in the way the cross-attention mechanism works: during generation, we substitute the key and value of the content with those of the style image. This approach provides several desirable characteristics for style transfer, including 1) preservation of content by transferring similar styles into similar image patches and 2) transfer of style based on similarity of local texture (e.g. edge) between content and style images. Furthermore, we introduce query preservation and attention temperature scaling to mitigate the disruption of the original content, and initial latent Adaptive Instance Normalization (AdaIN) to deal with disharmonious color (failure to transfer the colors of the style). Our experimental results demonstrate that our proposed method surpasses state-of-the-art methods in both conventional and diffusion-based style transfer baselines.
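
The mechanism described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the linear blend used for query preservation, the multiplicative form of the temperature scaling, and the values of `gamma` and `tau` are all assumptions chosen for illustration, since the abstract names these components without defining them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def style_injected_attention(q_content, q_stylized, k_style, v_style,
                             gamma=0.75, tau=1.5):
    """Self-attention where key/value come from the style image.

    q_content, q_stylized: (n_tokens, d) queries of the content image and
    of the image being generated. k_style, v_style: (m_tokens, d).
    gamma and tau are hypothetical illustrative values.
    """
    # Query preservation (assumed form): blend the content query back in.
    q = gamma * q_content + (1.0 - gamma) * q_stylized
    d = q.shape[-1]
    # Attention temperature scaling (assumed form): tau > 1 sharpens the map.
    logits = tau * (q @ k_style.T) / np.sqrt(d)
    attn = softmax(logits, axis=-1)
    # Each output token is a weighted mix of style values.
    return attn @ v_style

def adain_latent(content_latent, style_latent, eps=1e-5):
    """AdaIN on latents of shape (C, H, W): match per-channel mean and std
    of the content latent to those of the style latent."""
    mu_c = content_latent.mean(axis=(1, 2), keepdims=True)
    sd_c = content_latent.std(axis=(1, 2), keepdims=True)
    mu_s = style_latent.mean(axis=(1, 2), keepdims=True)
    sd_s = style_latent.std(axis=(1, 2), keepdims=True)
    return sd_s * (content_latent - mu_c) / (sd_c + eps) + mu_s
```

Because the attention weights in each row sum to one, every output token is a convex combination of style values, which is one way to see why similar content patches (similar queries) receive similar style.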

Related papers

Break Stylistic Sophon: Are We Really Meant to Confine the Imagination in Style Transfer? [12.2238770989173]
StyleWallfacer is a groundbreaking unified training and inference framework. It addresses various issues encountered in the style transfer process of traditional methods. It delivers artist-level style transfer and text-driven stylization.
arXiv Detail & Related papers (2025-06-18T00:24:29Z)
Training-free Stylized Text-to-Image Generation with Fast Inference [24.55785152141884]
We propose a novel stylized image generation method leveraging a pre-trained large-scale diffusion model. We exploit the self-consistency property of latent consistency models to extract the representative style statistics. We then introduce the norm mixture of self-attention, which enables the model to query the most relevant style patterns.
arXiv Detail & Related papers (2025-05-25T09:38:23Z)
AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models [4.364797586362505]
AttenST is a training-free attention-driven style transfer framework. We propose a style-guided self-attention mechanism that conditions self-attention on the reference style. We also introduce a dual-feature cross-attention mechanism to fuse content and style features.
arXiv Detail & Related papers (2025-03-10T13:28:36Z)
ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps. We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
DiffStyler: Diffusion-based Localized Image Style Transfer [0.0]
Image style transfer aims to imbue digital imagery with the distinctive attributes of style targets, such as colors, brushstrokes, shapes. Despite the advancements in arbitrary style transfer methods, a prevalent challenge remains the delicate equilibrium between content semantics and style attributes. This paper introduces DiffStyler, a novel approach that facilitates efficient and precise arbitrary image style transfer.
arXiv Detail & Related papers (2024-03-27T11:19:34Z)
Diffusion-based Human Motion Style Transfer with Semantic Guidance [23.600154466988073]
We propose a novel framework for few-shot style transfer learning based on the diffusion model. In the first stage, we pre-train a diffusion-based text-to-motion model as a generative prior. In the second stage, based on the single style example, we fine-tune the pre-trained diffusion model in a few-shot manner to make it capable of style transfer.
arXiv Detail & Related papers (2024-03-20T05:52:11Z)
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss. Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when t is large. Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z)
Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance. Our model's adaptability allows it to be implemented with both image diffusion and latent diffusion models. Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
arXiv Detail & Related papers (2023-06-07T12:56:56Z)
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer [38.957512116073616]
We propose a zero-shot contrastive loss for diffusion models that doesn't require additional fine-tuning or auxiliary networks. Our method can generate images with the same semantic content as the source image in a zero-shot manner.
arXiv Detail & Related papers (2023-03-15T13:47:02Z)
A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework. We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature. Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
Diffusion-based Image Translation using Disentangled Style and Content Representation [51.188396199083336]
Diffusion-based image translation guided by semantic texts or a single target image has enabled flexible style transfer. It is often difficult to maintain the original content of the image during the reverse diffusion. We present a novel diffusion-based unsupervised image translation method using disentangled style and content representation. Our experimental results show that the proposed method outperforms state-of-the-art baseline models in both text-guided and image-guided translation tasks.
arXiv Detail & Related papers (2022-09-30T06:44:37Z)
Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning. Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.