ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer
- URL: http://arxiv.org/abs/2503.10614v1
- Date: Thu, 13 Mar 2025 17:55:58 GMT
- Title: ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer
- Authors: Bolin Chen, Baoquan Zhao, Haoran Xie, Yi Cai, Qing Li, Xudong Mao
- Abstract summary: Style transfer involves transferring the style from a reference image to the content of a target image. Recent advancements in LoRA-based (Low-Rank Adaptation) methods have shown promise in effectively capturing the style of a single image. These approaches still face significant challenges such as content inconsistency, style misalignment, and content leakage.
- Score: 20.088714830700916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Style transfer involves transferring the style from a reference image to the content of a target image. Recent advancements in LoRA-based (Low-Rank Adaptation) methods have shown promise in effectively capturing the style of a single image. However, these approaches still face significant challenges such as content inconsistency, style misalignment, and content leakage. In this paper, we comprehensively analyze the limitations of the standard diffusion parameterization, which learns to predict noise, in the context of style transfer. To address these issues, we introduce ConsisLoRA, a LoRA-based method that enhances both content and style consistency by optimizing the LoRA weights to predict the original image rather than noise. We also propose a two-step training strategy that decouples the learning of content and style from the reference image. To effectively capture both the global structure and local details of the content image, we introduce a stepwise loss transition strategy. Additionally, we present an inference guidance method that enables continuous control over content and style strengths during inference. Through both qualitative and quantitative evaluations, our method demonstrates significant improvements in content and style consistency while effectively reducing content leakage.
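The abstract's core technical point is switching the diffusion training target from predicted noise to the predicted original image. A minimal numpy sketch of the two parameterizations under a standard DDPM-style forward process (variable names and the loss form are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def add_noise(x0, eps, alpha_bar_t):
    """Forward diffusion step: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def eps_loss(pred_eps, eps):
    """Standard parameterization: the network is trained to predict the added noise."""
    return np.mean((pred_eps - eps) ** 2)

def x0_loss(pred_x0, x0):
    """x0 parameterization: the network is trained to predict the original image."""
    return np.mean((pred_x0 - x0) ** 2)

def eps_to_x0(x_t, pred_eps, alpha_bar_t):
    """Recover the x0 estimate implied by a noise prediction, by inverting add_noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * pred_eps) / np.sqrt(alpha_bar_t)
```

The two targets are algebraically related through `eps_to_x0`, but as losses they weight timesteps differently: at high noise levels (small `alpha_bar_t`), a small noise-prediction error maps to a large error in the implied image, which is one plausible reading of why an x0-style objective can change what structure the LoRA weights capture.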
Related papers
- LLM-Enabled Style and Content Regularization for Personalized Text-to-Image Generation [14.508665960053643]
The personalized text-to-image generation has rapidly advanced with the emergence of Stable Diffusion.
Existing methods, which typically fine-tune models using embedded identifiers, often struggle with insufficient stylization and inaccurate image content.
We propose style refinement and content preservation strategies.
arXiv Detail & Related papers (2025-04-19T04:08:42Z) - Balanced Image Stylization with Style Matching Score [36.542802101359705]
Style Matching Score (SMS) is a novel optimization method for image stylization with diffusion models. SMS balances style alignment and content preservation, outperforming state-of-the-art approaches.
arXiv Detail & Related papers (2025-03-10T17:58:02Z) - Inversion-Free Video Style Transfer with Trajectory Reset Attention Control and Content-Style Bridging [5.501345898413532]
We introduce Trajectory Reset Attention Control (TRAC), a novel method that allows for high-quality style transfer. TRAC operates by resetting the denoising trajectory and enforcing attention control, thus enhancing content consistency. We present a tuning-free framework that offers a stable, flexible, and efficient solution for both image and video style transfer.
arXiv Detail & Related papers (2025-03-10T14:18:43Z) - ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z) - ArtWeaver: Advanced Dynamic Style Integration via Diffusion Model [73.95608242322949]
Stylized Text-to-Image Generation (STIG) aims to generate images from text prompts and style reference images.
We present ArtWeaver, a novel framework that leverages pretrained Stable Diffusion to address challenges such as misinterpreted styles and inconsistent semantics.
arXiv Detail & Related papers (2024-05-24T07:19:40Z) - DiffStyler: Diffusion-based Localized Image Style Transfer [0.0]
Image style transfer aims to imbue digital imagery with the distinctive attributes of style targets, such as colors, brushstrokes, and shapes.
Despite the advancements in arbitrary style transfer methods, a prevalent challenge remains the delicate equilibrium between content semantics and style attributes.
This paper introduces DiffStyler, a novel approach that facilitates efficient and precise arbitrary image style transfer.
arXiv Detail & Related papers (2024-03-27T11:19:34Z) - Implicit Style-Content Separation using B-LoRA [61.664293840163865]
We introduce B-LoRA, a method that implicitly separates the style and content components of a single image.
By analyzing the architecture of SDXL combined with LoRA, we find that jointly learning the LoRA weights of two specific blocks achieves style-content separation.
arXiv Detail & Related papers (2024-03-21T17:20:21Z) - DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z) - Arbitrary Style Transfer via Multi-Adaptation Network [109.6765099732799]
A desired style transfer, given a content image and referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting.
A new disentanglement loss function enables our network to extract main style patterns and exact content structures to adapt to various input images.
arXiv Detail & Related papers (2020-05-27T08:00:22Z) - Parameter-Free Style Projection for Arbitrary Style Transfer [64.06126075460722]
This paper proposes a new feature-level style transformation technique, named Style Projection, for parameter-free, fast, and effective content-style transformation.
This paper further presents a real-time feed-forward model to leverage Style Projection for arbitrary image style transfer.
arXiv Detail & Related papers (2020-03-17T13:07:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.