AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models
- URL: http://arxiv.org/abs/2503.07307v1
- Date: Mon, 10 Mar 2025 13:28:36 GMT
- Title: AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models
- Authors: Bo Huang, Wenlun Xu, Qizhuo Han, Haodong Jing, Ying Li
- Abstract summary: AttenST is a training-free attention-driven style transfer framework. We propose a style-guided self-attention mechanism that conditions self-attention on the reference style. We also introduce a dual-feature cross-attention mechanism to fuse content and style features.
- Score: 4.364797586362505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While diffusion models have achieved remarkable progress in style transfer tasks, existing methods typically rely on fine-tuning or optimizing pre-trained models during inference, leading to high computational costs and challenges in balancing content preservation with style integration. To address these limitations, we introduce AttenST, a training-free attention-driven style transfer framework. Specifically, we propose a style-guided self-attention mechanism that conditions self-attention on the reference style by retaining the query of the content image while substituting its key and value with those from the style image, enabling effective style feature integration. To mitigate style information loss during inversion, we introduce a style-preserving inversion strategy that refines inversion accuracy through multiple resampling steps. Additionally, we propose a content-aware adaptive instance normalization, which integrates content statistics into the normalization process to optimize style fusion while mitigating content degradation. Furthermore, we introduce a dual-feature cross-attention mechanism to fuse content and style features, ensuring a harmonious synthesis of structural fidelity and stylistic expression. Extensive experiments demonstrate that AttenST outperforms existing methods, achieving state-of-the-art performance on style transfer datasets.
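As a rough illustration of two of these components, the sketch below implements (i) self-attention in which the content image supplies the query while the style image supplies the key and value, and (ii) an AdaIN variant that blends content statistics into the normalization targets. The tensor layouts, the blending weight alpha, and the function names are illustrative assumptions, not the authors' implementation.

```python
import torch

def style_guided_self_attention(q_content: torch.Tensor,
                                k_style: torch.Tensor,
                                v_style: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with the content image's query and the
    style image's key/value. Shapes: q_content (B, N_c, d); k_style and
    v_style (B, N_s, d)."""
    d = q_content.shape[-1]
    attn = torch.softmax(q_content @ k_style.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v_style

def content_aware_adain(content: torch.Tensor,
                        style: torch.Tensor,
                        alpha: float = 0.5,
                        eps: float = 1e-5) -> torch.Tensor:
    """AdaIN variant that mixes content statistics into the target mean/std.
    The linear blend via alpha is an assumption; the abstract only states
    that content statistics enter the normalization.
    Shapes: content, style (B, C, H, W)."""
    c_mu = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True)
    s_mu = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True)
    target_mu = alpha * s_mu + (1 - alpha) * c_mu
    target_std = alpha * s_std + (1 - alpha) * c_std
    return (content - c_mu) / (c_std + eps) * target_std + target_mu
```

With alpha = 1 the second function reduces to standard AdaIN; lowering alpha retains more of the content's own statistics, which corresponds to the content-preservation trade-off the abstract describes.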
Related papers
- Balanced Image Stylization with Style Matching Score [36.542802101359705]
Style Matching Score (SMS) is a novel optimization method for image stylization with diffusion models.
SMS balances style alignment and content preservation, outperforming state-of-the-art approaches.
arXiv Detail & Related papers (2025-03-10T17:58:02Z)
- HSI: A Holistic Style Injector for Arbitrary Style Transfer [8.47567292281412]
Holistic Style Injector (HSI) is a novel attention-style transformation module that delivers the artistic expression of a target style.
HSI performs stylization based only on a global style representation, which is more in line with the characteristics of style transfer.
Our method outperforms state-of-the-art approaches in both effectiveness and efficiency.
arXiv Detail & Related papers (2025-02-05T09:36:24Z)
- UniVST: A Unified Framework for Training-free Localized Video Style Transfer [102.52552893495475]
This paper presents UniVST, a unified framework for localized video style transfer based on diffusion models. It operates without any training, offering a distinct advantage over existing diffusion methods that transfer style across entire videos.
arXiv Detail & Related papers (2024-10-26T05:28:02Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images.
We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z)
- HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image or video.
We propose HiCAST, which can explicitly customize the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z)
- Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer [19.355744690301403]
We introduce a novel artistic style transfer method based on a pre-trained large-scale diffusion model without any optimization.
Our experimental results demonstrate that the proposed method surpasses both conventional and diffusion-based state-of-the-art style transfer baselines.
arXiv Detail & Related papers (2023-12-11T09:53:12Z)
- A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature (a minimal sketch follows this entry).
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
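Below is a hedged sketch of the input-dependent temperature idea from the UCAST entry above: an InfoNCE-style contrastive loss whose temperature is predicted per sample by a small head on the anchor embedding. The head architecture, loss form, and class name are assumptions for illustration, not UCAST's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveTemperatureContrast(nn.Module):
    """InfoNCE-style loss with a per-sample temperature predicted from
    the anchor embedding (illustrative stand-in for an input-dependent
    temperature scheme)."""
    def __init__(self, dim: int):
        super().__init__()
        # Softplus keeps the predicted temperature strictly positive.
        self.temp_head = nn.Sequential(nn.Linear(dim, 1), nn.Softplus())

    def forward(self, anchor, positive, negatives):
        # anchor, positive: (B, d); negatives: (B, K, d)
        anchor = F.normalize(anchor, dim=-1)
        positive = F.normalize(positive, dim=-1)
        negatives = F.normalize(negatives, dim=-1)
        tau = self.temp_head(anchor) + 1e-4                  # (B, 1)
        pos = (anchor * positive).sum(dim=-1, keepdim=True)  # (B, 1)
        neg = torch.einsum('bd,bkd->bk', anchor, negatives)  # (B, K)
        logits = torch.cat([pos, neg], dim=1) / tau
        # The positive logit sits at index 0 of every row.
        target = torch.zeros(anchor.size(0), dtype=torch.long,
                             device=anchor.device)
        return F.cross_entropy(logits, target)
```

A smaller temperature sharpens the loss's focus on hard negatives for that sample, so letting the network predict it per input adapts the contrast strength to each style example.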
- Parameter-Free Style Projection for Arbitrary Style Transfer [64.06126075460722]
This paper proposes a new feature-level style transformation technique, named Style Projection, for parameter-free, fast, and effective content-style transformation.
This paper further presents a real-time feed-forward model to leverage Style Projection for arbitrary image style transfer.
arXiv Detail & Related papers (2020-03-17T13:07:41Z)