S$^{2}$-DMs: Skip-Step Diffusion Models
- URL: http://arxiv.org/abs/2401.01520v2
- Date: Tue, 23 Jan 2024 02:59:04 GMT
- Title: S$^{2}$-DMs: Skip-Step Diffusion Models
- Authors: Yixuan Wang and Shuangyin Li
- Abstract summary: Diffusion models have emerged as powerful generative tools, rivaling GANs in sample quality and mirroring the likelihood scores of autoregressive models.
A subset of these models, exemplified by DDIMs, exhibits an inherent asymmetry: they are trained over $T$ steps but sample from only a subset of those steps during generation.
This selective sampling approach, though optimized for speed, inadvertently misses out on vital information from the unsampled steps, leading to potential compromises in sample quality.
We present the S$^{2}$-DMs, a new training method built around an innovative $L_{skip}$ loss, designed to reintegrate the information omitted during the selective sampling phase.
- Score: 10.269647566864247
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have emerged as powerful generative tools, rivaling GANs in
sample quality and mirroring the likelihood scores of autoregressive models. A
subset of these models, exemplified by DDIMs, exhibits an inherent asymmetry:
they are trained over $T$ steps but sample from only a subset of those steps during
generation. This selective sampling approach, though optimized for speed,
inadvertently misses out on vital information from the unsampled steps, leading
to potential compromises in sample quality. To address this issue, we present
the S$^{2}$-DMs, a new training method built around an innovative
$L_{skip}$ loss, meticulously designed to reintegrate the information omitted during
the selective sampling phase. The benefits of this approach are manifold: it
notably enhances sample quality, is exceptionally simple to implement, requires
minimal code modifications, and is flexible enough to be compatible with
various sampling algorithms. On the CIFAR10 dataset, models trained using our
algorithm showed an improvement of 3.27% to 14.06% over models trained with
traditional methods across various sampling algorithms (DDIMs, PNDMs, DEIS) and
different numbers of sampling steps (10, 20, ..., 1000). On the CELEBA dataset,
the improvement ranged from 8.97% to 27.08%. Access to the code and additional
resources is provided on GitHub.
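The abstract describes $L_{skip}$ only at a high level and does not give its exact form, so the snippet below is a minimal, hypothetical sketch (PyTorch) of how a skip-step auxiliary term could be added to a standard DDPM-style training step: alongside the usual noise-prediction loss at timestep $t$, the model is also supervised at the skipped-ahead timestep $t - k$ that a subsequence sampler would actually visit. The names (`eps_model`, `skip_gap`, `lambda_skip`) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a skip-step auxiliary loss for DDPM-style training.
# NOT the authors' implementation: the exact L_skip is defined in the paper.
import torch
import torch.nn.functional as F

def training_step(eps_model, x0, alphas_bar, T, skip_gap=10, lambda_skip=0.1):
    """One training step combining the usual noise-prediction loss with an
    auxiliary term at the timestep a skip-step sampler would jump to."""
    b = x0.shape[0]
    t = torch.randint(skip_gap, T, (b,), device=x0.device)  # current timestep
    t_skip = t - skip_gap                                    # skipped-to timestep
    noise = torch.randn_like(x0)

    # Forward diffusion q(x_t | x_0) at both timesteps, using shared noise.
    def q_sample(step):
        a = alphas_bar[step].view(-1, 1, 1, 1)  # assumes 4D image batches
        return a.sqrt() * x0 + (1 - a).sqrt() * noise

    x_t, x_t_skip = q_sample(t), q_sample(t_skip)

    # Standard epsilon-prediction loss at t.
    loss_simple = F.mse_loss(eps_model(x_t, t), noise)

    # Skip-step term: also supervise the prediction at the skipped-to timestep,
    # so information from otherwise unsampled steps is not lost at generation time.
    loss_skip = F.mse_loss(eps_model(x_t_skip, t_skip), noise)

    return loss_simple + lambda_skip * loss_skip
```

The abstract's claims that the method requires only minimal code modifications and remains compatible with different samplers (DDIMs, PNDMs, DEIS) are consistent with this kind of single extra loss term, but the precise definition of $L_{skip}$ should be taken from the paper and its repository.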
Related papers
- One Step Diffusion via Shortcut Models [109.72495454280627]
We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples.
Shortcut models condition the network on the current noise level and also on the desired step size, allowing the model to skip ahead in the generation process.
Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
arXiv Detail & Related papers (2024-10-16T13:34:40Z)
- Informed Correctors for Discrete Diffusion Models [32.87362154118195]
We propose a family of informed correctors that more reliably counteracts discretization error by leveraging information learned by the model.
We also propose $k$-Gillespie's, a sampling algorithm that better utilizes each model evaluation, while still enjoying the speed and flexibility of $\tau$-leaping.
Across several real and synthetic datasets, we show that $k$-Gillespie's with informed correctors reliably produces higher quality samples at lower computational cost.
arXiv Detail & Related papers (2023-12-12T13:19:40Z)
- A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models [21.305868355976394]
In this paper, we propose a unified sampling framework (USF) to study the optional strategies for solvers.
Under this framework, we reveal that taking different solving strategies at different timesteps may help further decrease the truncation error.
We demonstrate that $S^3$ can find outstanding solver schedules which outperform the state-of-the-art sampling methods.
arXiv Detail & Related papers (2023-05-19T17:49:25Z)
- Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs [60.58434523646137]
A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency.
We introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question.
Our experiments show that Adaptive-Consistency reduces the sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique for tackling imbalanced learning by generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-06-29T03:46:16Z)
- Forgetting Data from Pre-trained GANs [28.326418377665345]
We investigate how to post-edit a model after training so that it forgets certain kinds of samples.
We provide three different algorithms for GANs that differ in how the samples to be forgotten are described.
Our algorithms are capable of forgetting data while retaining high generation quality at a fraction of the cost of full re-training.
arXiv Detail & Related papers (2022-02-11T18:53:18Z)
- Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality [44.37533757879762]
We introduce Differentiable Diffusion Sampler Search (DDSS), a method that optimizes fast samplers for any pre-trained diffusion model.
We also present Generalized Gaussian Diffusion Models (GGDM), a family of flexible non-Markovian samplers for diffusion models.
Our method is compatible with any pre-trained diffusion model without fine-tuning or re-training required.
arXiv Detail & Related papers (2020-07-13T15:17:35Z)
- A Provably Efficient Sample Collection Strategy for Reinforcement Learning [123.69175280309226]
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.
We propose to tackle the exploration-exploitation problem following a decoupled approach composed of: 1) An "objective-specific" algorithm that prescribes how many samples to collect at which states, as if it has access to a generative model (i.e., sparse simulator of the environment); 2) An "objective-agnostic" sample collection responsible for generating the prescribed samples as fast as possible.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-13T16:39:07Z)
- Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling [30.406623987492726]
We present a new method for evaluating and training unnormalized density models.
We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.
This yields a novel goodness-of-fit test which outperforms existing methods on high dimensional data.
arXiv Detail & Related papers (2020-02-13T16:39:07Z)