Related papers: Dual-Solver: A Generalized ODE Solver for Diffusion Models with Dual Prediction

URL: http://arxiv.org/abs/2603.03973v1
Date: Wed, 04 Mar 2026 12:14:52 GMT
Title: Dual-Solver: A Generalized ODE Solver for Diffusion Models with Dual Prediction
Authors: Soochul Park, Yeon Ju Lee
Abstract summary: We introduce Dual-Solver, which generalizes multistep samplers through learnable parameters. It retains the standard predictor-corrector structure while preserving second-order local accuracy. It improves FID and CLIP scores in the low-NFE regime across backbones.
Score: 0.26856688022781555
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models achieve state-of-the-art image quality. However, sampling is costly at inference time because it requires a large number of function evaluations (NFEs). To reduce NFEs, classical ODE numerical methods have been adopted. Yet, the choice of prediction type and integration domain leads to different sampling behaviors. To address these issues, we introduce Dual-Solver, which generalizes multistep samplers through learnable parameters that continuously (i) interpolate among prediction types, (ii) select the integration domain, and (iii) adjust the residual terms. It retains the standard predictor-corrector structure while preserving second-order local accuracy. These parameters are learned via a classification-based objective using a frozen pretrained classifier (e.g., MobileNet or CLIP). For ImageNet class-conditional generation (DiT, GM-DiT) and text-to-image generation (SANA, PixArt-$α$), Dual-Solver improves FID and CLIP scores in the low-NFE regime ($3 \le$ NFE $\le 9$) across backbones.

Related papers

Generalized Zero-Shot Learning for Point Cloud Segmentation with Evidence-Based Dynamic Calibration [12.973924671425074]
Generalized zero-shot semantic segmentation of 3D point clouds aims to classify each point into both seen and unseen classes. A significant challenge with these models is their tendency to make biased predictions, often favoring the classes encountered during training. We propose E3DPC-GZSL, which reduces overconfident predictions towards seen classes without relying on separate classifiers for seen and unseen data.
arXiv Detail & Related papers (2025-09-10T04:37:00Z)
Adaptive Cubic Regularized Second-Order Latent Factor Analysis Model [14.755426957558868]
High-dimensional and incomplete (HDI) datasets have become ubiquitous across various real-world applications. We propose a two-fold approach to mitigate information instabilities. The ACRS HDI demonstrate that the ALF represents higher representation than the faster advancing (SACR) models.
arXiv Detail & Related papers (2025-07-03T03:15:54Z)
Rule-based Evolving Fuzzy System for Time Series Forecasting: New Perspectives Based on Type-2 Fuzzy Sets Measures Approach [0.0]
Real-world data contain uncertainty and variations that can be correlated to external variables, known as randomness. One of the existing methods to deal with this type of data is the use of evolving Fuzzy Systems (eFSs). We propose ePL-KRLS-FSM+, an enhanced class of evolving fuzzy modeling approach that combines participatory learning (PL) with fuzzy logic and data transformation into fuzzy sets (FSs). This improvement allows creating and measuring type-2 fuzzy sets to better handle uncertainties in the data, generating a model that can predict chaotic data with increased accuracy.
arXiv Detail & Related papers (2025-02-05T22:27:20Z)
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
DF2: Distribution-Free Decision-Focused Learning [30.288876294435294]
Decision-focused learning (DFL) has emerged as a powerful approach for predict-then-optimize problems. DFL faces three bottlenecks: model error, sample average approximation error, and approximation error. We present DF2, the first distribution-free decision-focused learning method designed to mitigate these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z)
Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples. We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost [53.746169882193456]
Recent works have proposed various sparse attention modules to overcome the quadratic cost of self-attention. We propose a model that resolves both problems by endowing each attention head with a mixed-membership Block Model. Our model outperforms previous efficient variants as well as the original Transformer with full attention.
arXiv Detail & Related papers (2022-10-27T15:30:52Z)
Understanding Diffusion Models: A Unified Perspective [0.0]
Diffusion models have shown incredible capabilities as generative models. We review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives.
arXiv Detail & Related papers (2022-08-25T09:55:25Z)
A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference. DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs. We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the graph with backprop-optimized neural function approximators. We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes. We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
arXiv Detail & Related papers (2022-02-07T17:47:46Z)
A Distributed Optimisation Framework Combining Natural Gradient with Hessian-Free for Discriminative Sequence Training [16.83036203524611]
This paper presents a novel natural gradient and Hessian-free (NGHF) optimisation framework for neural network training. It relies on the linear conjugate gradient (CG) algorithm to combine the natural gradient (NG) method with local curvature information from Hessian-free (HF) or other second-order methods. Experiments are reported on the multi-genre broadcast data set for a range of different acoustic model types.
arXiv Detail & Related papers (2021-03-12T22:18:34Z)
Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores). For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training. We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
Set Based Stochastic Subsampling [85.5331107565578]
We propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network. We show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification.
arXiv Detail & Related papers (2020-06-25T07:36:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.