Estimating Probability Densities with Transformer and Denoising Diffusion
- URL: http://arxiv.org/abs/2407.15703v1
- Date: Mon, 22 Jul 2024 15:10:41 GMT
- Title: Estimating Probability Densities with Transformer and Denoising Diffusion
- Authors: Henry W. Leung, Jo Bovy, Joshua S. Speagle
- Abstract summary: We show that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation.
We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformers are often the go-to architecture to build foundation models that ingest a large amount of training data. But these models do not estimate the probability density distribution when trained on regression problems, yet obtaining full probabilistic outputs is crucial to many fields of science, where the probability distribution of the answer can be non-Gaussian and multimodal. In this work, we demonstrate that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation even for high-dimensional inputs. The combined Transformer+Denoising Diffusion model allows conditioning the output probability density on arbitrary combinations of inputs and it is thus a highly flexible density function emulator of all possible input/output combinations. We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy and we apply it to a variety of inference tasks to show that the model can infer labels accurately with reasonable distributions.
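The pattern the abstract describes, a sequence model over the observed inputs feeding a small diffusion model over the labels, can be sketched in a few dozen lines. The PyTorch sketch below is purely illustrative and not the authors' implementation; the token-per-feature encoding, mean pooling, linear beta schedule, and MLP noise-prediction head are all assumed choices.

```python
import torch
import torch.nn as nn

class TransformerDiffusionHead(nn.Module):
    """Transformer encoder summarizes the inputs; an MLP head predicts the
    noise added to the label vector at diffusion step t (DDPM-style)."""

    def __init__(self, n_features, label_dim, d_model=64, n_steps=1000):
        super().__init__()
        self.embed = nn.Linear(1, d_model)               # one token per scalar input feature
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.t_embed = nn.Embedding(n_steps, d_model)
        self.head = nn.Sequential(
            nn.Linear(2 * d_model + label_dim, 128), nn.SiLU(),
            nn.Linear(128, label_dim),
        )
        self.n_steps = n_steps
        betas = torch.linspace(1e-4, 0.02, n_steps)      # simple linear noise schedule
        self.register_buffer("alpha_bar", torch.cumprod(1.0 - betas, dim=0))

    def forward(self, x, noisy_label, t):
        tokens = self.embed(x.unsqueeze(-1))             # (B, n_features, d_model)
        cond = self.encoder(tokens).mean(dim=1)          # (B, d_model) conditioning summary
        h = torch.cat([cond, noisy_label, self.t_embed(t)], dim=-1)
        return self.head(h)                              # predicted noise, (B, label_dim)

    def loss(self, x, label):
        """Standard denoising objective: corrupt the label, predict the noise."""
        t = torch.randint(0, self.n_steps, (x.shape[0],), device=x.device)
        eps = torch.randn_like(label)
        ab = self.alpha_bar[t].unsqueeze(-1)
        noisy = ab.sqrt() * label + (1 - ab).sqrt() * eps
        return nn.functional.mse_loss(self(x, noisy, t), eps)
```

Sampling would then run the standard reverse diffusion process with the head's noise predictions while holding the conditioning embedding fixed, which is what yields a full conditional density rather than a point estimate.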
Related papers
- Constrained Diffusion Models via Dual Training [80.03953599062365]
We develop constrained diffusion models based on desired distributions informed by requirements.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off between the training objective and the imposed constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
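Because the score of a Gaussian mixture is available in closed form, the effect of a guidance weight can be illustrated directly. The toy example below is not the paper's analysis; the guidance weight w, the target component, and the Langevin step size are arbitrary illustrative choices.

```python
import numpy as np

mus, sigma, pis = np.array([-2.0, 2.0]), 1.0, np.array([0.5, 0.5])

def responsibilities(x):
    logw = np.log(pis) - 0.5 * ((x[:, None] - mus) / sigma) ** 2
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    return w / w.sum(axis=1, keepdims=True)

def guided_score(x, target=1, w=2.0):
    r = responsibilities(x)
    score = (r @ mus - x) / sigma**2              # unconditional score of the mixture
    cond_score = (mus[target] - x) / sigma**2     # score of the target component
    return score + w * (cond_score - score)       # classifier-guidance-style mixing

# unadjusted Langevin sampling driven by the guided score
rng = np.random.default_rng(0)
x, step = rng.normal(size=5000), 0.05
for _ in range(2000):
    x = x + step * guided_score(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
print("mean of guided samples:", x.mean())        # pushed toward the target component at +2
```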
- Diffusion Random Feature Model [0.0]
We present a diffusion model-inspired deep random feature model that is interpretable.
We derive generalization bounds between the distribution of sampled data and the true distribution using properties of score matching.
We validate our findings by generating samples on the fashion MNIST dataset and instrumental audio data.
arXiv Detail & Related papers (2023-10-06T17:59:05Z)
- Score-based Diffusion Models in Function Space [140.792362459734]
Diffusion models have recently emerged as a powerful framework for generative modeling.
We introduce a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.
We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
- Machine-Learned Exclusion Limits without Binning [0.0]
We extend the Machine-Learned Likelihoods (MLL) method by including Kernel Density Estimators (KDE) to extract one-dimensional signal and background probability density functions.
We apply the method to two cases of interest at the LHC: a search for exotic Higgs bosons, and a $Z'$ boson decaying into lepton pairs.
arXiv Detail & Related papers (2022-11-09T11:04:50Z)
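The mechanism summarized above replaces binned histograms with KDEs of a one-dimensional discriminant. In the sketch below, the beta-distributed scores are synthetic stand-ins for real classifier outputs, and the likelihood is a generic extended unbinned form rather than the paper's exact estimator.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sig_scores = rng.beta(5, 2, size=5000)        # synthetic classifier outputs, signal-like
bkg_scores = rng.beta(2, 5, size=5000)        # synthetic classifier outputs, background-like

# KDE-based 1-D probability density functions for each hypothesis
f_s, f_b = gaussian_kde(sig_scores), gaussian_kde(bkg_scores)

def neg_log_likelihood(s, b, observed):
    """Extended unbinned likelihood for s signal and b background events
    (the constant log N! term is dropped)."""
    return (s + b) - np.sum(np.log(s * f_s(observed) + b * f_b(observed)))

# toy "observed" sample: 50 signal-like plus 500 background-like events
observed = np.concatenate([rng.beta(5, 2, size=50), rng.beta(2, 5, size=500)])
print(neg_log_likelihood(50, 500, observed), neg_log_likelihood(0, 550, observed))
```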
- DensePure: Understanding Diffusion Models towards Adversarial Robustness [110.84015494617528]
We analyze the properties of diffusion models and establish the conditions under which they can enhance certified robustness.
We propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., a classifier).
We show that this robust region is a union of multiple convex sets, and is potentially much larger than the robust regions identified in previous works.
arXiv Detail & Related papers (2022-11-01T08:18:07Z)
- Learning Multivariate CDFs and Copulas using Tensor Factorization [39.24470798045442]
Learning the multivariate distribution of data is a core challenge in statistics and machine learning.
In this work, we aim to learn multivariate cumulative distribution functions (CDFs), as they can handle mixed random variables.
We show that any grid-sampled version of a joint CDF of mixed random variables admits a universal representation as a naive Bayes model.
We demonstrate the superior performance of the proposed model in several synthetic and real datasets and applications including regression, sampling and data imputation.
arXiv Detail & Related papers (2022-10-13T16:18:46Z)
- Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks, variational autoencoders, and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows overcome the lack of exact likelihood evaluation in such models by leveraging the change-of-variables formula for probability density functions.
The present work uses normalizing flows as components in a mixture model and devises an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z)
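The mixture-of-flows construction is easiest to see with per-component affine flows, a deliberately simple stand-in for the expressive flows used in the paper: the mixture log-density is a logsumexp over component log-densities plus learnable mixture weights, and everything trains end to end by maximum likelihood.

```python
import math
import torch
import torch.nn as nn

class MixtureOfAffineFlows(nn.Module):
    def __init__(self, n_components, dim):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_components))          # mixture weights
        self.shift = nn.Parameter(torch.randn(n_components, dim))      # per-component flow params
        self.log_scale = nn.Parameter(torch.zeros(n_components, dim))

    def log_prob(self, x):
        # map x through each component's affine flow to the standard-normal base
        z = (x[:, None, :] - self.shift) * torch.exp(-self.log_scale)  # (B, K, dim)
        log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * z.shape[-1] * math.log(2 * math.pi)
        log_det = -self.log_scale.sum(-1)                              # log|dz/dx| per component
        log_w = torch.log_softmax(self.logits, dim=0)
        return torch.logsumexp(log_w + log_base + log_det, dim=-1)     # mixture log-density

# end-to-end maximum-likelihood training on toy 2-D data
model = MixtureOfAffineFlows(n_components=3, dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
data = torch.cat([torch.randn(512, 2) - 2.0, torch.randn(512, 2) + 2.0])
for _ in range(200):
    loss = -model.log_prob(data).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```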
- Denoising Diffusion Probabilistic Models [91.94962645056896]
We present high quality image synthesis results using diffusion probabilistic models.
Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics.
arXiv Detail & Related papers (2020-06-19T17:24:44Z)
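For reference, the generative direction can be written in a few lines. This is a generic sketch of DDPM ancestral sampling, not a reproduction of the paper's code; the linear beta schedule and the sigma_t^2 = beta_t choice are common defaults, and eps_model stands for any trained noise-prediction network.

```python
import torch

def ddpm_sample(eps_model, shape, n_steps=1000, device="cpu"):
    betas = torch.linspace(1e-4, 0.02, n_steps, device=device)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)                 # start from pure Gaussian noise
    for t in reversed(range(n_steps)):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = eps_model(x, t_batch)                       # predicted noise at step t
        # posterior mean: (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        mean = (x - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x
```

The connection to denoising score matching noted in the summary is that the predicted noise is a rescaled score: s(x_t, t) is approximately -eps_theta(x_t, t) / sqrt(1 - alpha_bar_t).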
- Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs in which the mapping from the base density to the output space is conditioned on an input x, in order to model conditional densities p(y|x).
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
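A stripped-down illustration of this conditioning pattern: a single input-dependent affine transform of a Gaussian base gives an exact, tractable p(y|x). Real CNFs stack invertible coupling layers to capture multimodal conditionals, so the class below is only a hedged sketch of the mechanism.

```python
import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    """p(y|x): a network maps x to the shift/scale of an invertible affine map,
    and the change-of-variables formula gives the exact conditional log-density."""

    def __init__(self, x_dim, y_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * y_dim))

    def log_prob(self, y, x):
        shift, log_scale = self.net(x).chunk(2, dim=-1)
        z = (y - shift) * torch.exp(-log_scale)                 # map y to the base space
        log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * y.shape[-1] * math.log(2 * math.pi)
        return log_base - log_scale.sum(-1)                     # + log|det dz/dy|

    def sample(self, x):
        shift, log_scale = self.net(x).chunk(2, dim=-1)
        return shift + torch.exp(log_scale) * torch.randn_like(shift)
```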