Revisiting Diffusion Model Predictions Through Dimensionality
- URL: http://arxiv.org/abs/2601.21419v1
- Date: Thu, 29 Jan 2026 08:56:55 GMT
- Title: Revisiting Diffusion Model Predictions Through Dimensionality
- Authors: Qing Jin, Chaoyang Wang
- Abstract summary: Recent advances in diffusion and flow matching models have highlighted a shift in the preferred prediction target. We provide a theoretical framework based on a generalized prediction formulation that accommodates arbitrary output targets. We propose k-Diff, a framework that employs a data-driven approach to learn the optimal prediction parameter k directly from data.
- Score: 6.277362418411825
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in diffusion and flow matching models have highlighted a shift in the preferred prediction target -- moving from noise ($\varepsilon$) and velocity (v) to direct data (x) prediction -- particularly in high-dimensional settings. However, a formal explanation of why the optimal target depends on the specific properties of the data remains elusive. In this work, we provide a theoretical framework based on a generalized prediction formulation that accommodates arbitrary output targets, of which $\varepsilon$-, v-, and x-prediction are special cases. We derive the analytical relationship between data's geometry and the optimal prediction target, offering a rigorous justification for why x-prediction becomes superior when the ambient dimension significantly exceeds the data's intrinsic dimension. Furthermore, while our theory identifies dimensionality as the governing factor for the optimal prediction target, the intrinsic dimension of manifold-bound data is typically intractable to estimate in practice. To bridge this gap, we propose k-Diff, a framework that employs a data-driven approach to learn the optimal prediction parameter k directly from data, bypassing the need for explicit dimension estimation. Extensive experiments in both latent-space and pixel-space image generation demonstrate that k-Diff consistently outperforms fixed-target baselines across varying architectures and data scales, providing a principled and automated approach to enhancing generative performance.
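As a rough illustration of the abstract's claim that $\varepsilon$-, v-, and x-prediction are special cases of one linear family of targets: any target of the form y = a·x0 + b·ε can be inverted back to the data x0 given the noisy sample x_t = α·x0 + σ·ε. The sketch below is an assumption-laden reconstruction, not the paper's k-Diff parameterization (the abstract does not specify how k maps to coefficients); the function `recover_x0` and the (a, b) pairs are illustrative.

```python
import numpy as np

def recover_x0(y, x_t, a, b, alpha, sigma):
    """Invert a generic linear prediction target y = a*x0 + b*eps.

    Using x_t = alpha*x0 + sigma*eps, eliminate eps:
        x0 = (sigma*y - b*x_t) / (a*sigma - b*alpha)
    (valid whenever a*sigma - b*alpha != 0).
    """
    return (sigma * y - b * x_t) / (a * sigma - b * alpha)

rng = np.random.default_rng(0)
x0, eps = rng.normal(size=4), rng.normal(size=4)
alpha, sigma = 0.8, 0.6                 # alpha^2 + sigma^2 = 1
x_t = alpha * x0 + sigma * eps

# The three standard targets as (a, b) special cases of y = a*x0 + b*eps:
targets = {
    "eps": (0.0, 1.0),                  # y = eps          (noise prediction)
    "x":   (1.0, 0.0),                  # y = x0           (data prediction)
    "v":   (-sigma, alpha),             # y = alpha*eps - sigma*x0 (velocity)
}
for name, (a, b) in targets.items():
    y = a * x0 + b * eps
    assert np.allclose(recover_x0(y, x_t, a, b, alpha, sigma), x0)
print("all targets recover x0")
```

Because every member of the family is recoverable to x0, the choice among them only changes the loss weighting the network sees; the paper's argument is that which weighting is best depends on the data's intrinsic versus ambient dimension.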
Related papers
- Supervised Dynamic Dimension Reduction with Deep Neural Network [3.0040661953201475]
We propose a novel Supervised Deep Dynamic Principal component analysis framework. We construct target-aware predictors by scaling the original predictors in a supervised manner. A principal component analysis is then performed on the target-aware predictors to extract the estimated SDDP factors.
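The two-step recipe in this summary (supervised rescaling, then PCA) can be sketched as follows. This is a minimal illustration, not the SDDP method itself: the choice of absolute correlation as the supervised weight and the function name `sddp_factors` are assumptions for demonstration only.

```python
import numpy as np

def sddp_factors(X, y, n_factors=2):
    """Illustrative sketch: (1) scale each predictor by a supervised
    weight (here, |corr(x_j, y)|), (2) run PCA on the rescaled
    predictors and return the top factors."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Supervised weights: absolute correlation with the target.
    w = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    X_aware = Xc * w                      # target-aware predictors
    # PCA via SVD of the target-aware matrix.
    _, _, Vt = np.linalg.svd(X_aware, full_matrices=False)
    return X_aware @ Vt[:n_factors].T     # estimated factor scores

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)
F = sddp_factors(X, y, n_factors=2)
print(F.shape)   # (200, 2)
```

The supervised scaling downweights predictors uncorrelated with the target, so the leading principal components concentrate on target-relevant directions rather than the directions of largest raw variance.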
arXiv Detail & Related papers (2025-08-05T15:15:30Z)
- Data-Driven Forecasting of High-Dimensional Transient and Stationary Processes via Space-Time Projection [0.0]
Space-Time Projection (STP) is introduced as a data-driven forecasting approach for high-dimensional and time-resolved data. The method computes extended space-time proper modes from training data spanning a prediction horizon comprising both hindcast and forecast intervals.
arXiv Detail & Related papers (2025-03-31T03:36:59Z)
- Deep Partially Linear Transformation Model for Right-Censored Survival Data [6.315323176162257]
This paper introduces a deep partially linear transformation model (DPLTM) as a general and flexible regression framework. The proposed method is capable of avoiding the curse of dimensionality while still retaining the interpretability of some covariates of interest. Comprehensive simulation studies demonstrate the impressive performance of the proposed procedure in terms of both accuracy and predictive power.
arXiv Detail & Related papers (2024-12-10T15:50:43Z)
- Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
- Quasi-Bayes meets Vines [2.3124143670964448]
We propose a different way to extend Quasi-Bayesian prediction to high dimensions through the use of Sklar's theorem.
We show that our proposed Quasi-Bayesian Vine (QB-Vine) is a fully non-parametric density estimator with an analytical form.
arXiv Detail & Related papers (2024-06-18T16:31:02Z)
- Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
- Variational Inference with Coverage Guarantees in Simulation-Based Inference [18.818573945984873]
We propose Conformalized Amortized Neural Variational Inference (CANVI)
CANVI constructs conformalized predictors based on each candidate, compares the predictors using a metric known as predictive efficiency, and returns the most efficient predictor.
We prove lower bounds on the predictive efficiency of the regions produced by CANVI and explore how the quality of a posterior approximation relates to the predictive efficiency of prediction regions based on that approximation.
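The CANVI loop described above (build conformal predictors, score them by predictive efficiency, keep the most efficient) can be sketched with split conformal prediction. This is a simplified illustration under stated assumptions: the residual score, the `predictive_efficiency` metric (inverse mean interval width), and the candidate predictors are all hypothetical choices for demonstration, not CANVI's actual components.

```python
import numpy as np

def conformal_interval(pred_fn, X_cal, y_cal, X_test, alpha=0.1):
    """Split-conformal intervals around a point predictor (sketch)."""
    scores = np.abs(y_cal - pred_fn(X_cal))        # calibration residuals
    n = len(scores)
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    mu = pred_fn(X_test)
    return mu - q, mu + q

def predictive_efficiency(lo, hi):
    """Smaller average width -> more efficient (illustrative metric)."""
    return 1.0 / np.mean(hi - lo)

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=500)
y = 2 * X + 0.2 * rng.normal(size=500)
X_cal, y_cal, X_test = X[:250], y[:250], X[250:]

# Two candidate posterior approximations: well-calibrated vs. biased.
candidates = {"good": lambda x: 2 * x, "biased": lambda x: 1.5 * x}
effs = {}
for name, f in candidates.items():
    lo, hi = conformal_interval(f, X_cal, y_cal, X_test)
    effs[name] = predictive_efficiency(lo, hi)
best = max(effs, key=effs.get)
print(best)
```

Both candidates yield intervals with the same marginal coverage guarantee; the better approximation is distinguished only by tighter intervals, which is exactly what the efficiency comparison selects for.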
arXiv Detail & Related papers (2023-05-23T17:24:04Z)
- Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- CovarianceNet: Conditional Generative Model for Correct Covariance Prediction in Human Motion Prediction [71.31516599226606]
We present a new method to correctly predict the uncertainty associated with the predicted distribution of future trajectories.
Our approach, CovarianceNet, is based on a Conditional Generative Model with Gaussian latent variables.
arXiv Detail & Related papers (2021-09-07T09:38:24Z)
- Which Invariance Should We Transfer? A Causal Minimax Learning Approach [18.71316951734806]
We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
arXiv Detail & Related papers (2021-07-05T09:07:29Z)
- DeepKriging: Spatially Dependent Deep Neural Networks for Spatial Prediction [2.219504240642369]
In spatial statistics, a common objective is to predict values of a spatial process at unobserved locations by exploiting spatial dependence.
DeepKriging method has a direct link to Kriging in the Gaussian case, and it has multiple advantages over Kriging for non-Gaussian and non-stationary data.
We apply the method to predicting PM2.5 concentrations across the continental United States.
arXiv Detail & Related papers (2020-07-23T12:38:53Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.