Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
- URL: http://arxiv.org/abs/2501.09732v1
- Date: Thu, 16 Jan 2025 18:30:37 GMT
- Title: Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
- Authors: Nanye Ma, Shangyuan Tong, Haolin Jia, Hexiang Hu, Yu-Chuan Su, Mingda Zhang, Xuan Yang, Yandong Li, Tommi Jaakkola, Xuhui Jia, Saining Xie
- Abstract summary: We explore the inference-time scaling behavior of diffusion models beyond increasing denoising steps. We consider a search problem aimed at identifying better noises for the diffusion sampling process. Our findings reveal that increasing inference-time compute leads to substantial improvements in the quality of samples generated by diffusion models.
- Score: 48.16416920913577
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models have made significant impacts across various domains, largely due to their ability to scale during training by increasing data, computational resources, and model size, a phenomenon characterized by the scaling laws. Recent research has begun to explore inference-time scaling behavior in Large Language Models (LLMs), revealing how performance can further improve with additional computation during inference. Unlike LLMs, diffusion models inherently possess the flexibility to adjust inference-time computation via the number of denoising steps, although the performance gains typically flatten after a few dozen. In this work, we explore the inference-time scaling behavior of diffusion models beyond increasing denoising steps and investigate how the generation performance can further improve with increased computation. Specifically, we consider a search problem aimed at identifying better noises for the diffusion sampling process. We structure the design space along two axes: the verifiers used to provide feedback, and the algorithms used to find better noise candidates. Through extensive experiments on class-conditioned and text-conditioned image generation benchmarks, our findings reveal that increasing inference-time compute leads to substantial improvements in the quality of samples generated by diffusion models, and, given the complicated nature of images, combinations of the components in the framework can be specifically chosen to suit different application scenarios.
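As a rough illustration of the search framing described in the abstract (not the paper's actual code), the simplest instantiation combines the two axes as a random search: draw several initial noises, run the full sampler on each, and keep the candidate the verifier scores highest. Below is a minimal Python sketch; `sample_with_noise` and `verifier_score` are hypothetical callables standing in for a diffusion sampler and a feedback verifier.

```python
import torch

def best_of_n_noise_search(sample_with_noise, verifier_score, prompt,
                           n_candidates=16, shape=(4, 64, 64),
                           device="cpu", generator=None):
    """Random search over initial noises: denoise each candidate noise
    and keep the sample the verifier prefers.

    `sample_with_noise(noise, prompt)` and `verifier_score(sample, prompt)`
    are assumed interfaces, not part of any specific library.
    """
    best_sample, best_score = None, float("-inf")
    for _ in range(n_candidates):
        # Each candidate is a fresh Gaussian noise for the sampler to start from.
        noise = torch.randn(shape, device=device, generator=generator)
        sample = sample_with_noise(noise, prompt)   # full denoising run
        score = verifier_score(sample, prompt)      # feedback signal
        if score > best_score:
            best_sample, best_score = sample, score
    return best_sample, best_score
```

Scaling inference compute here means increasing `n_candidates`; swapping in a different verifier or a more structured search algorithm (e.g., iterating locally around promising noises) corresponds to moving along the two design axes the paper describes.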
Related papers
- Critical Iterative Denoising: A Discrete Generative Model Applied to Graphs [52.50288418639075]
We propose a novel framework called Iterative Denoising, which simplifies discrete diffusion and circumvents the issue by assuming conditional independence across time.
Our empirical evaluations demonstrate that the proposed method significantly outperforms existing discrete diffusion baselines in graph generation tasks.
arXiv Detail & Related papers (2025-03-27T15:08:58Z)
- Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing [10.542645300983878]
We propose an inference-time scaling approach for pretrained flow models.
We show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves particle sampling methods for inference-time scaling in flow models.
arXiv Detail & Related papers (2025-03-25T06:30:45Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance.
Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws.
We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms [35.74919627230777]
We argue that an inference-first perspective can inspire novel generative pre-training algorithms.
We show how addressing limitations in diffusion models' inference process through targeted modifications yields a stable, single-stage algorithm.
arXiv Detail & Related papers (2025-03-10T10:27:30Z)
- Spatial Reasoning with Denoising Models [49.83744014336816]
We introduce a framework to perform reasoning over sets of continuous variables via denoising generative models.
We demonstrate, for the first time, that the order of generation can successfully be predicted by the denoising network itself.
arXiv Detail & Related papers (2025-02-28T14:08:30Z)
- Improved Noise Schedule for Diffusion Training [51.849746576387375]
We propose a novel approach to design the noise schedule for enhancing the training of diffusion models.
We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule.
arXiv Detail & Related papers (2024-07-03T17:34:55Z)
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- The Missing U for Efficient Diffusion Models [3.712196074875643]
Diffusion Probabilistic Models yield record-breaking performance in tasks such as image synthesis, video generation, and molecule design.
Despite their capabilities, their efficiency, especially in the reverse process, remains a challenge due to slow convergence rates and high computational costs.
We introduce an approach that leverages continuous dynamical systems to design a novel denoising network for diffusion models.
arXiv Detail & Related papers (2023-10-31T00:12:14Z)
- Realistic Noise Synthesis with Diffusion Models [44.404059914652194]
Deep denoising models require extensive real-world training data, which is challenging to acquire.
We propose a novel Realistic Noise Synthesis Diffusor (RNSD) method using diffusion models to address these challenges.
arXiv Detail & Related papers (2023-05-23T12:56:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.