InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation
- URL: http://arxiv.org/abs/2407.00788v1
- Date: Sun, 30 Jun 2024 18:05:33 GMT
- Title: InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation
- Authors: Haofan Wang, Peng Xing, Renyuan Huang, Hao Ai, Qixun Wang, Xu Bai
- Abstract summary: Style transfer is an inventive process designed to create an image that maintains the essence of the original while embracing the visual style of another.
We introduce InstantStyle-Plus, an approach that prioritizes the integrity of the original content while seamlessly integrating the target style.
- Score: 4.1177497612346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Style transfer is an inventive process designed to create an image that maintains the essence of the original while embracing the visual style of another. Although diffusion models have demonstrated impressive generative power in personalized subject-driven or style-driven applications, existing state-of-the-art methods still encounter difficulties in achieving a seamless balance between content preservation and style enhancement. For example, amplifying the style's influence can often undermine the structural integrity of the content. To address these challenges, we deconstruct the style transfer task into three core elements: 1) Style, focusing on the image's aesthetic characteristics; 2) Spatial Structure, concerning the geometric arrangement and composition of visual elements; and 3) Semantic Content, which captures the conceptual meaning of the image. Guided by these principles, we introduce InstantStyle-Plus, an approach that prioritizes the integrity of the original content while seamlessly integrating the target style. Specifically, our method accomplishes style injection through an efficient, lightweight process, utilizing the cutting-edge InstantStyle framework. To reinforce content preservation, we initiate the process with inverted content latent noise and a versatile plug-and-play tile ControlNet that preserves the original image's intrinsic layout. We also incorporate a global semantic adapter to enhance the semantic content's fidelity. To safeguard against the dilution of style information, a style extractor is employed as a discriminator to provide supplementary style guidance. Code will be available at https://github.com/instantX-research/InstantStyle-Plus.
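As a rough illustration of the pipeline this abstract describes (a tile ControlNet for spatial layout, lightweight IP-Adapter style injection in the spirit of InstantStyle, and content-derived initial latents), here is a minimal sketch assuming the Hugging Face diffusers library. The model IDs, conditioning scale, and prompt are illustrative assumptions rather than the authors' settings, and the semantic adapter and style-extractor guidance are omitted:

```python
# Minimal sketch of the described pipeline, assuming Hugging Face diffusers.
# Model IDs and scales are illustrative, not the authors' settings.
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Plug-and-play tile ControlNet to preserve the content image's spatial layout.
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Lightweight style injection via IP-Adapter, as popularized by InstantStyle.
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
)
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})  # style blocks only

content_image = load_image("content.jpg")  # hypothetical local files
style_image = load_image("style.jpg")

# The paper additionally starts sampling from DDIM-inverted content latents;
# such latents could be passed via the `latents=` argument in place of the
# default random noise. The semantic adapter and the style-extractor
# discriminator guidance are not reproduced here.
result = pipe(
    prompt="a photo",                   # illustrative prompt
    image=content_image,                # tile ControlNet condition
    ip_adapter_image=style_image,       # style reference
    controlnet_conditioning_scale=0.6,  # illustrative layout strength
).images[0]
result.save("stylized.png")
```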
Related papers
- DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer [13.588643982359413]
Style transfer aims to fuse the artistic representation of a style image with the structural information of a content image.
Existing methods train specific networks or utilize pre-trained models to learn content and style features.
We propose a novel and training-free approach for style transfer, combining textual embedding with spatial features.
arXiv Detail & Related papers (2024-10-19T06:42:43Z)
- FAGStyle: Feature Augmentation on Geodesic Surface for Zero-shot Text-guided Diffusion Image Style Transfer [2.3293561091456283]
The goal of image style transfer is to render an image guided by a style reference while maintaining the original content.
We introduce FAGStyle, a zero-shot text-guided diffusion image style transfer method.
Our approach enhances inter-patch information interaction by incorporating the Sliding Window Crop technique.
arXiv Detail & Related papers (2024-08-20T04:20:11Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation [5.364489068722223]
The concept of style is inherently underdetermined, encompassing a multitude of elements such as color, material, atmosphere, design, and structure.
Inversion-based methods are prone to style degradation, often resulting in the loss of fine-grained details.
Adapter-based approaches frequently require meticulous weight tuning for each reference image to achieve a balance between style intensity and text controllability (a per-block scale sketch follows this list).
arXiv Detail & Related papers (2024-04-03T13:34:09Z)
- Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z)
- StyleAdapter: A Unified Stylized Image Generation Model [97.24936247688824]
StyleAdapter is a unified stylized image generation model capable of producing a variety of stylized images.
It can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet.
arXiv Detail & Related papers (2023-09-04T19:16:46Z)
- InfoStyler: Disentanglement Information Bottleneck for Artistic Style Transfer [22.29381866838179]
Artistic style transfer aims to transfer the style of an artwork to a photograph while maintaining its original overall content.
We propose a novel information disentanglement method, named InfoStyler, to capture the minimal sufficient information for both content and style representations.
arXiv Detail & Related papers (2023-07-30T13:38:56Z)
- StyleStegan: Leak-free Style Transfer Based on Feature Steganography [19.153040728118285]
Existing style transfer methods suffer from a serious content leakage issue.
We propose a leak-free style transfer method based on feature steganography.
The results demonstrate that StyleStegan successfully mitigates the content leakage issue in serial and reversible style transfer tasks.
arXiv Detail & Related papers (2023-07-01T05:00:19Z)
- DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization [66.42741426640633]
DiffStyler is a dual diffusion processing architecture to control the balance between the content and style of diffused results.
We propose a content image-based learnable noise on which the reverse denoising process is based, enabling the stylization results to better preserve the structure information of the content image.
arXiv Detail & Related papers (2022-11-19T12:30:44Z)
- Parameter-Free Style Projection for Arbitrary Style Transfer [64.06126075460722]
This paper proposes a new feature-level style transformation technique, named Style Projection, for parameter-free, fast, and effective content-style transformation.
This paper further presents a real-time feed-forward model to leverage Style Projection for arbitrary image style transfer.
arXiv Detail & Related papers (2020-03-17T13:07:41Z)
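For the block-targeted adapter weighting mentioned in the InstantStyle entry above, diffusers exposes per-block IP-Adapter scales, which avoids global per-image weight tuning. A minimal sketch with illustrative values, reusing `pipe` from the sketch under the abstract; the specific blocks that carry style versus layout are an empirical finding of the InstantStyle paper:

```python
# Illustrative per-block IP-Adapter scales (requires a recent diffusers).
# Zeroing most blocks suppresses content leakage from the style reference;
# the nonzero entries keep the attention layers that carry style.
style_only = {"up": {"block_0": [0.0, 1.0, 0.0]}}
style_and_layout = {
    "down": {"block_2": [0.0, 1.0]},   # layout-sensitive down block
    "up": {"block_0": [0.0, 1.0, 0.0]},
}
pipe.set_ip_adapter_scale(style_only)  # `pipe` from the earlier sketch
```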