A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
- URL: http://arxiv.org/abs/2504.06144v1
- Date: Tue, 08 Apr 2025 15:39:25 GMT
- Title: A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
- Authors: Jihun Park, Jongmin Gim, Kyoungmin Lee, Minseok Oh, Minwoo Choi, Jaeyeul Kim, Woo Chool Park, Sunghoon Im
- Abstract summary: We present a training-free style-aligned image generation method that leverages a scale-wise autoregressive model. We show that our method achieves generation quality comparable to competing approaches, significantly improves style alignment, and delivers inference speeds over six times faster than the fastest model.
- Score: 11.426771898890998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a training-free style-aligned image generation method that leverages a scale-wise autoregressive model. While large-scale text-to-image (T2I) models, particularly diffusion-based methods, have demonstrated impressive generation quality, they often suffer from style misalignment across generated image sets and slow inference speeds, limiting their practical usability. To address these issues, we propose three key components: initial feature replacement to ensure consistent background appearance, pivotal feature interpolation to align object placement, and dynamic style injection, which reinforces style consistency using a schedule function. Unlike previous methods requiring fine-tuning or additional training, our approach maintains fast inference while preserving individual content details. Extensive experiments show that our method achieves generation quality comparable to competing approaches, significantly improves style alignment, and delivers inference speeds over six times faster than the fastest model.
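To make the three components concrete, the following is a minimal, hypothetical sketch of how they could be applied to per-scale features in a scale-wise autoregressive sampler. The function and parameter names (`align_features`, `alpha_schedule`, `replace_until`, `pivot`, `beta`) and the linearly decaying schedule are illustrative assumptions, not the authors' implementation; the paper's actual schedule function may differ.

```python
import torch

def alpha_schedule(scale: int, num_scales: int, max_strength: float = 0.8) -> float:
    # Assumed schedule: inject style strongly at coarse scales and fade
    # toward fine scales so per-image content details are preserved.
    return max_strength * (1.0 - scale / (num_scales - 1))

def align_features(ref_feats, gen_feats, replace_until=1, pivot=2, beta=0.5):
    """Apply the three training-free operations, scale by scale (sketch).

    ref_feats / gen_feats: lists of per-scale feature tensors from the
    reference image and the image currently being generated.
    """
    num_scales = len(ref_feats)
    aligned = []
    for s, (ref, gen) in enumerate(zip(ref_feats, gen_feats)):
        if s < replace_until:
            # (1) initial feature replacement: consistent background
            aligned.append(ref.clone())
        elif s == pivot:
            # (2) pivotal feature interpolation: aligned object placement
            aligned.append(beta * ref + (1 - beta) * gen)
        else:
            # (3) dynamic style injection, weighted by the schedule
            lam = alpha_schedule(s, num_scales)
            aligned.append(lam * ref + (1 - lam) * gen)
    return aligned

# Toy usage: random tensors stand in for per-scale feature maps.
ref = [torch.randn(1, 16, 2 ** s, 2 ** s) for s in range(1, 6)]
gen = [torch.randn(1, 16, 2 ** s, 2 ** s) for s in range(1, 6)]
out = align_features(ref, gen)
```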
Related papers
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
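One plausible (assumed) reading of such a merge is token merging: near-duplicate feature tokens are greedily averaged before attention control, so the attention pass runs over a shorter sequence. A hypothetical sketch, not ZePo's actual strategy:

```python
import torch
import torch.nn.functional as F

def merge_redundant_tokens(feats: torch.Tensor, sim_threshold: float = 0.95) -> torch.Tensor:
    """Greedily average near-duplicate tokens (hypothetical sketch).

    feats: (num_tokens, dim). A token whose cosine similarity to an
    already-kept cluster center exceeds the threshold is folded into it.
    """
    sums, counts = [], []
    for tok in feats:
        n = F.normalize(tok, dim=-1)
        for i, s in enumerate(sums):
            center = F.normalize(s / counts[i], dim=-1)
            if torch.dot(n, center) > sim_threshold:
                sums[i] = s + tok       # merge into existing cluster
                counts[i] += 1
                break
        else:
            sums.append(tok.clone())    # start a new cluster
            counts.append(1)
    return torch.stack([s / c for s, c in zip(sums, counts)])
```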
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- Ada-Adapter: Fast Few-shot Style Personalization of Diffusion Model with Pre-trained Image Encoder [57.574544285878794]
Ada-Adapter is a novel framework for few-shot style personalization of diffusion models.
Our method enables efficient zero-shot style transfer utilizing a single reference image.
We demonstrate the effectiveness of our approach on various artistic styles, including flat art, 3D rendering, and logo design.
arXiv Detail & Related papers (2024-07-08T02:00:17Z)
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation [5.364489068722223]
The concept of style is inherently underdetermined, encompassing a multitude of elements such as color, material, atmosphere, design, and structure.
Inversion-based methods are prone to style degradation, often resulting in the loss of fine-grained details.
Adapter-based approaches frequently require meticulous weight tuning for each reference image to balance style intensity and text controllability.
arXiv Detail & Related papers (2024-04-03T13:34:09Z)
- Direct Consistency Optimization for Robust Customization of Text-to-Image Diffusion Models [67.68871360210208]
Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, can generate visuals with a high degree of consistency. We propose a novel fine-tuning objective, dubbed Direct Consistency Optimization, which controls the deviation between the fine-tuned and pretrained models. We show that our approach achieves better prompt fidelity and subject fidelity than those post-optimized by merging regular fine-tuned models.
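The deviation-control idea can be pictured as a regularized denoising loss; the stand-in below is a deliberately simplified illustration and is not DCO's actual objective, which is formulated differently in the paper.

```python
import torch.nn.functional as F

def deviation_controlled_loss(eps_pred, eps_pretrained, eps_true, beta=0.1):
    # Fit the personalization data while penalizing drift away from the
    # frozen pretrained model's prediction (simplified illustration).
    denoise = F.mse_loss(eps_pred, eps_true)
    deviation = F.mse_loss(eps_pred, eps_pretrained)
    return denoise + beta * deviation
```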
arXiv Detail & Related papers (2024-02-19T09:52:41Z)
- Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal 'attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high-quality synthesis and fidelity.
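The shared-attention mechanism can be sketched as letting each generated image's queries attend to its own keys and values concatenated with the reference image's; the actual method additionally normalizes queries and keys with AdaIN, which is omitted here for brevity.

```python
import torch

def shared_attention(q, k, v, k_ref, v_ref):
    """Simplified shared attention (sketch): target queries attend to the
    target's and the reference's keys/values, so style statistics from
    the reference propagate into every image of the set.

    q, k, v:       (tokens, dim) features of the image being generated
    k_ref, v_ref:  (tokens, dim) features of the reference image
    """
    k_all = torch.cat([k, k_ref], dim=0)
    v_all = torch.cat([v, v_ref], dim=0)
    attn = torch.softmax(q @ k_all.T / k.shape[-1] ** 0.5, dim=-1)
    return attn @ v_all
```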
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
- AdaDiff: Adaptive Step Selection for Fast Diffusion Models [82.78899138400435]
We introduce AdaDiff, a lightweight framework designed to learn instance-specific step usage policies. AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function. We conduct experiments on three image generation and two video generation benchmarks and demonstrate that our approach achieves visual quality similar to the baseline.
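A toy illustration of such a step-usage policy trained with REINFORCE follows; the class name, the discrete budget set, and the reward callable are assumptions of the sketch, not AdaDiff's actual design.

```python
import torch

class StepPolicy(torch.nn.Module):
    """Toy instance-conditioned policy over sampling-step budgets."""
    def __init__(self, feat_dim, budgets=(10, 20, 30, 50)):
        super().__init__()
        self.budgets = budgets
        self.head = torch.nn.Linear(feat_dim, len(budgets))

    def forward(self, feats):
        return torch.distributions.Categorical(logits=self.head(feats))

def reinforce_step(policy, feats, reward_fn, optimizer):
    # Sample a budget, observe a reward trading quality against compute,
    # and apply the REINFORCE gradient estimator.
    dist = policy(feats)
    action = dist.sample()
    reward = reward_fn(policy.budgets[action.item()])  # assumed callable
    loss = -dist.log_prob(action) * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```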
arXiv Detail & Related papers (2023-11-24T11:20:38Z)
- A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
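One way to picture the input-dependent temperature is an InfoNCE-style loss whose temperature is predicted from the anchor embedding; `temp_net` and the softplus parameterization below are assumptions of this sketch, not UCAST's exact formulation.

```python
import torch
import torch.nn.functional as F

def adaptive_contrastive_loss(anchor, positive, negatives, temp_net):
    """InfoNCE with a per-sample, input-dependent temperature (sketch).

    anchor, positive: (B, D); negatives: (N, D); temp_net maps D -> 1.
    """
    tau = F.softplus(temp_net(anchor)) + 1e-4       # positive temperature
    a = F.normalize(anchor, dim=-1)
    pos = (a * F.normalize(positive, dim=-1)).sum(-1, keepdim=True)
    neg = a @ F.normalize(negatives, dim=-1).T
    logits = torch.cat([pos, neg], dim=-1) / tau    # positive is class 0
    targets = torch.zeros(len(logits), dtype=torch.long)
    return F.cross_entropy(logits, targets)
```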
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
- Is This Loss Informative? Faster Text-to-Image Customization by Tracking Objective Dynamics [31.15864240403093]
We study the training dynamics of popular text-to-image personalization methods, aiming to speed them up.
We propose a simple drop-in early stopping criterion that only requires computing the regular training objective on a fixed set of inputs.
Our experiments on Stable Diffusion for 48 different concepts and three personalization methods demonstrate the competitive performance of our approach.
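The criterion itself is simple enough to sketch: evaluate the unchanged training objective on a fixed batch and stop once it plateaus. The names and thresholds below are assumptions, not the paper's exact procedure.

```python
import torch

def should_stop(model, fixed_batch, loss_fn, history, patience=5, tol=1e-3):
    """Drop-in early-stopping check (sketch): track the regular training
    objective on a fixed set of inputs and stop once it has moved by less
    than `tol` over the last `patience` evaluations.
    """
    with torch.no_grad():
        history.append(loss_fn(model, fixed_batch).item())
    if len(history) < patience:
        return False
    recent = history[-patience:]
    return max(recent) - min(recent) < tol
```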
arXiv Detail & Related papers (2023-02-09T18:49:13Z)