Related papers: DiSA: Diffusion Step Annealing in Autoregressive Image Generation

DiSA: Diffusion Step Annealing in Autoregressive Image Generation

URL: http://arxiv.org/abs/2505.20297v1
Date: Mon, 26 May 2025 17:59:57 GMT
Title: DiSA: Diffusion Step Annealing in Autoregressive Image Generation
Authors: Qinyu Zhao, Jaskirat Singh, Ming Xu, Akshay Asthana, Stephen Gould, Liang Zheng,
Abstract summary: An increasing number of autoregressive models, such as MAR, FlowAR, xAR, and Harmon, adopt diffusion sampling to improve the quality of image generation.<n>This paper explores how to effectively address this issue.<n>As more tokens are generated during the autoregressive process, subsequent tokens follow more constrained distributions and are easier to sample.
Score: 35.35184094233562
License: http://creativecommons.org/licenses/by/4.0/
Abstract: An increasing number of autoregressive models, such as MAR, FlowAR, xAR, and Harmon adopt diffusion sampling to improve the quality of image generation. However, this strategy leads to low inference efficiency, because it usually takes 50 to 100 steps for diffusion to sample a token. This paper explores how to effectively address this issue. Our key motivation is that as more tokens are generated during the autoregressive process, subsequent tokens follow more constrained distributions and are easier to sample. To intuitively explain, if a model has generated part of a dog, the remaining tokens must complete the dog and thus are more constrained. Empirical evidence supports our motivation: at later generation stages, the next tokens can be well predicted by a multilayer perceptron, exhibit low variance, and follow closer-to-straight-line denoising paths from noise to tokens. Based on our finding, we introduce diffusion step annealing (DiSA), a training-free method which gradually uses fewer diffusion steps as more tokens are generated, e.g., using 50 steps at the beginning and gradually decreasing to 5 steps at later stages. Because DiSA is derived from our finding specific to diffusion in autoregressive models, it is complementary to existing acceleration methods designed for diffusion alone. DiSA can be implemented in only a few lines of code on existing models, and albeit simple, achieves $5-10\times$ faster inference for MAR and Harmon and $1.4-2.5\times$ for FlowAR and xAR, while maintaining the generation quality.

Related papers

D-AR: Diffusion via Autoregressive Models [21.03363985989625]
Diffusion via Autoregressive models (D-AR) is a new paradigm recasting the image diffusion process as a vanilla autoregressive procedure.<n>Our method achieves 2.09 FID using a 775M Llama backbone with 256 discrete tokens.
arXiv Detail & Related papers (2025-05-29T17:09:25Z)
ProReflow: Progressive Reflow with Decomposed Velocity [52.249464542399636]
Flow matching aims to reflow the diffusion process of diffusion models into a straight line for a few-step and even one-step generation.<n>We introduce progressive reflow, which progressively reflows the diffusion models in local timesteps until the whole diffusion progresses.<n>We also introduce aligned v-prediction, which highlights the importance of direction matching in flow matching over magnitude matching.
arXiv Detail & Related papers (2025-03-05T04:50:53Z)
Distributional Diffusion Models with Scoring Rules [83.38210785728994]
Diffusion models generate high-quality synthetic data.<n> generating high-quality outputs requires many discretization steps.<n>We propose to accomplish sample generation by learning the posterior em distribution of clean data samples.
arXiv Detail & Related papers (2025-02-04T16:59:03Z)
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion [61.03681839276652]
Diffusion Forcing is a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels.<n>We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens.
arXiv Detail & Related papers (2024-07-01T15:43:25Z)
Accelerating Parallel Sampling of Diffusion Models [25.347710690711562]
We propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process. Applying these techniques, we introduce ParaTAA, a universal and training-free parallel sampling algorithm. Our experiments demonstrate that ParaTAA can decrease the inference steps required by common sequential sampling algorithms by a factor of 4$sim$14 times.
arXiv Detail & Related papers (2024-02-15T14:27:58Z)
Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner [84.97253871387028]
A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed. We propose a timestep aligner that helps find a more accurate integral direction for a particular interval at the minimum cost. Experiments show that our plug-in design can be trained efficiently and boost the inference performance of various state-of-the-art acceleration methods.
arXiv Detail & Related papers (2023-10-14T02:19:07Z)
UDPM: Upsampling Diffusion Probabilistic Models [33.51145642279836]
Denoising Diffusion Probabilistic Models (DDPM) have recently gained significant attention. DDPMs generate high-quality samples from complex data distributions by defining an inverse process. Unlike generative adversarial networks (GANs), the latent space of diffusion models is less interpretable. In this work, we propose to generalize the denoising diffusion process into an Upsampling Diffusion Probabilistic Model (UDPM)
arXiv Detail & Related papers (2023-05-25T17:25:14Z)
Pseudo Numerical Methods for Diffusion Models on Manifolds [77.40343577960712]
Denoising Diffusion Probabilistic Models (DDPMs) can generate high-quality samples such as image and audio samples. DDPMs require hundreds to thousands of iterations to produce final samples. We propose pseudo numerical methods for diffusion models (PNDMs) PNDMs can generate higher quality synthetic images with only 50 steps compared with 1000-step DDIMs (20x speedup)
arXiv Detail & Related papers (2022-02-20T10:37:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.