Estimating Probability Densities with Transformer and Denoising Diffusion
- URL: http://arxiv.org/abs/2407.15703v1
- Date: Mon, 22 Jul 2024 15:10:41 GMT
- Title: Estimating Probability Densities with Transformer and Denoising Diffusion
- Authors: Henry W. Leung, Jo Bovy, Joshua S. Speagle
- Abstract summary: We show that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation.
We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformers are often the go-to architecture to build foundation models that ingest a large amount of training data. But these models do not estimate the probability density distribution when trained on regression problems, yet obtaining full probabilistic outputs is crucial to many fields of science, where the probability distribution of the answer can be non-Gaussian and multimodal. In this work, we demonstrate that training a probabilistic model using a denoising diffusion head on top of the Transformer provides reasonable probability density estimation even for high-dimensional inputs. The combined Transformer+Denoising Diffusion model allows conditioning the output probability density on arbitrary combinations of inputs and it is thus a highly flexible density function emulator of all possible input/output combinations. We illustrate our Transformer+Denoising Diffusion model by training it on a large dataset of astronomical observations and measured labels of stars within our Galaxy and we apply it to a variety of inference tasks to show that the model can infer labels accurately with reasonable distributions.
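The pattern the abstract describes, a sequence model over the observed inputs feeding a small diffusion model over the labels, can be sketched in a few dozen lines. The PyTorch sketch below is purely illustrative and not the authors' implementation; the token-per-feature encoding, mean pooling, linear beta schedule, and MLP noise-prediction head are all assumed choices.

```python
import torch
import torch.nn as nn

class TransformerDiffusionHead(nn.Module):
    """Transformer encoder summarizes the inputs; an MLP head predicts the
    noise added to the label vector at diffusion step t (DDPM-style)."""

    def __init__(self, n_features, label_dim, d_model=64, n_steps=1000):
        super().__init__()
        self.embed = nn.Linear(1, d_model)               # one token per scalar input feature
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.t_embed = nn.Embedding(n_steps, d_model)
        self.head = nn.Sequential(
            nn.Linear(2 * d_model + label_dim, 128), nn.SiLU(),
            nn.Linear(128, label_dim),
        )
        self.n_steps = n_steps
        betas = torch.linspace(1e-4, 0.02, n_steps)      # simple linear noise schedule
        self.register_buffer("alpha_bar", torch.cumprod(1.0 - betas, dim=0))

    def forward(self, x, noisy_label, t):
        tokens = self.embed(x.unsqueeze(-1))             # (B, n_features, d_model)
        cond = self.encoder(tokens).mean(dim=1)          # (B, d_model) conditioning summary
        h = torch.cat([cond, noisy_label, self.t_embed(t)], dim=-1)
        return self.head(h)                              # predicted noise, (B, label_dim)

    def loss(self, x, label):
        """Standard denoising objective: corrupt the label, predict the noise."""
        t = torch.randint(0, self.n_steps, (x.shape[0],), device=x.device)
        eps = torch.randn_like(label)
        ab = self.alpha_bar[t].unsqueeze(-1)
        noisy = ab.sqrt() * label + (1 - ab).sqrt() * eps
        return nn.functional.mse_loss(self(x, noisy, t), eps)
```

Sampling would then run the standard reverse diffusion process with the head's noise predictions while holding the conditioning embedding fixed, which is what yields a full conditional density rather than a point estimate.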
Related papers
- Constrained Diffusion Models via Dual Training [80.03953599062365]
We develop constrained diffusion models based on desired distributions informed by requirements.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off between the training objective and the imposed constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
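Because the score of a Gaussian mixture is available in closed form, the effect of a guidance weight can be illustrated directly. The toy example below is not the paper's analysis; the guidance weight w, the target component, and the Langevin step size are arbitrary illustrative choices.

```python
import numpy as np

mus, sigma, pis = np.array([-2.0, 2.0]), 1.0, np.array([0.5, 0.5])

def responsibilities(x):
    logw = np.log(pis) - 0.5 * ((x[:, None] - mus) / sigma) ** 2
    logw -= logw.max(axis=1, keepdims=True)
    w = np.exp(logw)
    return w / w.sum(axis=1, keepdims=True)

def guided_score(x, target=1, w=2.0):
    r = responsibilities(x)
    score = (r @ mus - x) / sigma**2              # unconditional score of the mixture
    cond_score = (mus[target] - x) / sigma**2     # score of the target component
    return score + w * (cond_score - score)       # classifier-guidance-style mixing

# unadjusted Langevin sampling driven by the guided score
rng = np.random.default_rng(0)
x, step = rng.normal(size=5000), 0.05
for _ in range(2000):
    x = x + step * guided_score(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
print("mean of guided samples:", x.mean())        # pushed toward the target component at +2
```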
- Diffusion Random Feature Model [0.0]
We present a diffusion model-inspired deep random feature model that is interpretable.
We derive generalization bounds between the distribution of sampled data and the true distribution using properties of score matching.
We validate our findings by generating samples on the fashion MNIST dataset and instrumental audio data.
arXiv Detail & Related papers (2023-10-06T17:59:05Z)
- Score-based Diffusion Models in Function Space [140.792362459734]
Diffusion models have recently emerged as a powerful framework for generative modeling.
We introduce a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.
We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
- Machine-Learned Exclusion Limits without Binning [0.0]
We extend the Machine-Learned Likelihoods (MLL) method by including Kernel Density Estimators (KDE) to extract one-dimensional signal and background probability density functions.
We apply the method to two cases of interest at the LHC: a search for exotic Higgs bosons, and a $Z'$ boson decaying into lepton pairs.
arXiv Detail & Related papers (2022-11-09T11:04:50Z)
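The mechanism summarized above replaces binned histograms with KDEs of a one-dimensional discriminant. In the sketch below, the beta-distributed scores are synthetic stand-ins for real classifier outputs, and the likelihood is a generic extended unbinned form rather than the paper's exact estimator.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sig_scores = rng.beta(5, 2, size=5000)        # synthetic classifier outputs, signal-like
bkg_scores = rng.beta(2, 5, size=5000)        # synthetic classifier outputs, background-like

# KDE-based 1-D probability density functions for each hypothesis
f_s, f_b = gaussian_kde(sig_scores), gaussian_kde(bkg_scores)

def neg_log_likelihood(s, b, observed):
    """Extended unbinned likelihood for s signal and b background events
    (the constant log N! term is dropped)."""
    return (s + b) - np.sum(np.log(s * f_s(observed) + b * f_b(observed)))

# toy "observed" sample: 50 signal-like plus 500 background-like events
observed = np.concatenate([rng.beta(5, 2, size=50), rng.beta(2, 5, size=500)])
print(neg_log_likelihood(50, 500, observed), neg_log_likelihood(0, 550, observed))
```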
- DensePure: Understanding Diffusion Models towards Adversarial Robustness [110.84015494617528]
We analyze the properties of diffusion models and establish the conditions under which they can enhance certified robustness.
We propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., a classifier).
We show that this robust region is a union of multiple convex sets, and is potentially much larger than the robust regions identified in previous works.
arXiv Detail & Related papers (2022-11-01T08:18:07Z)
- Learning Multivariate CDFs and Copulas using Tensor Factorization [39.24470798045442]
Learning the multivariate distribution of data is a core challenge in statistics and machine learning.
In this work, we aim to learn multivariate cumulative distribution functions (CDFs), as they can handle mixed random variables.
We show that any grid-sampled version of a joint CDF of mixed random variables admits a universal representation as a naive Bayes model.
We demonstrate the superior performance of the proposed model in several synthetic and real datasets and applications including regression, sampling and data imputation.
arXiv Detail & Related papers (2022-10-13T16:18:46Z)
- Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks, variational autoencoders, and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows overcome the lack of exact likelihood evaluation in such models by leveraging the change-of-variables formula for probability density functions.
The present work uses normalizing flows as components in a mixture model and devises an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z)
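The mixture-of-flows construction is easiest to see with per-component affine flows, a deliberately simple stand-in for the expressive flows used in the paper: the mixture log-density is a logsumexp over component log-densities plus learnable mixture weights, and everything trains end to end by maximum likelihood.

```python
import math
import torch
import torch.nn as nn

class MixtureOfAffineFlows(nn.Module):
    def __init__(self, n_components, dim):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_components))          # mixture weights
        self.shift = nn.Parameter(torch.randn(n_components, dim))      # per-component flow params
        self.log_scale = nn.Parameter(torch.zeros(n_components, dim))

    def log_prob(self, x):
        # map x through each component's affine flow to the standard-normal base
        z = (x[:, None, :] - self.shift) * torch.exp(-self.log_scale)  # (B, K, dim)
        log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * z.shape[-1] * math.log(2 * math.pi)
        log_det = -self.log_scale.sum(-1)                              # log|dz/dx| per component
        log_w = torch.log_softmax(self.logits, dim=0)
        return torch.logsumexp(log_w + log_base + log_det, dim=-1)     # mixture log-density

# end-to-end maximum-likelihood training on toy 2-D data
model = MixtureOfAffineFlows(n_components=3, dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
data = torch.cat([torch.randn(512, 2) - 2.0, torch.randn(512, 2) + 2.0])
for _ in range(200):
    loss = -model.log_prob(data).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```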
- Denoising Diffusion Probabilistic Models [91.94962645056896]
We present high quality image synthesis results using diffusion probabilistic models.
Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics.
arXiv Detail & Related papers (2020-06-19T17:24:44Z)
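For reference, the generative direction can be written in a few lines. This is a generic sketch of DDPM ancestral sampling, not a reproduction of the paper's code; the linear beta schedule and the sigma_t^2 = beta_t choice are common defaults, and eps_model stands for any trained noise-prediction network.

```python
import torch

def ddpm_sample(eps_model, shape, n_steps=1000, device="cpu"):
    betas = torch.linspace(1e-4, 0.02, n_steps, device=device)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)                 # start from pure Gaussian noise
    for t in reversed(range(n_steps)):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = eps_model(x, t_batch)                       # predicted noise at step t
        # posterior mean: (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        mean = (x - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x
```

The connection to denoising score matching noted in the summary is that the predicted noise is a rescaled score: s(x_t, t) is approximately -eps_theta(x_t, t) / sqrt(1 - alpha_bar_t).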
- Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs in which the mapping from the base density to the output space is conditioned on an input x, in order to model conditional densities p(y|x).
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
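A stripped-down illustration of this conditioning pattern: a single input-dependent affine transform of a Gaussian base gives an exact, tractable p(y|x). Real CNFs stack invertible coupling layers to capture multimodal conditionals, so the class below is only a hedged sketch of the mechanism.

```python
import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    """p(y|x): a network maps x to the shift/scale of an invertible affine map,
    and the change-of-variables formula gives the exact conditional log-density."""

    def __init__(self, x_dim, y_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * y_dim))

    def log_prob(self, y, x):
        shift, log_scale = self.net(x).chunk(2, dim=-1)
        z = (y - shift) * torch.exp(-log_scale)                 # map y to the base space
        log_base = -0.5 * (z ** 2).sum(-1) - 0.5 * y.shape[-1] * math.log(2 * math.pi)
        return log_base - log_scale.sum(-1)                     # + log|det dz/dy|

    def sample(self, x):
        shift, log_scale = self.net(x).chunk(2, dim=-1)
        return shift + torch.exp(log_scale) * torch.randn_like(shift)
```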