Sampling from Arbitrary Functions via PSD Models
- URL: http://arxiv.org/abs/2110.10527v1
- Date: Wed, 20 Oct 2021 12:25:22 GMT
- Title: Sampling from Arbitrary Functions via PSD Models
- Authors: Ulysse Marteau-Ferey (SIERRA, PSL), Alessandro Rudi (PSL, SIERRA),
Francis Bach (PSL, SIERRA)
- Abstract summary: We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
- Score: 55.41644538483948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many areas of applied statistics and machine learning, generating an
arbitrary number of independent and identically distributed (i.i.d.) samples
from a given distribution is a key task. When the distribution is known only
through evaluations of the density, current methods either scale badly with the
dimension or require very involved implementations. Instead, we take a two-step
approach by first modeling the probability distribution and then sampling from
that model. We use the recently introduced class of positive semi-definite
(PSD) models, which have been shown to be efficient for approximating
probability densities. We show that these models can approximate a large class
of densities concisely using few evaluations, and present a simple algorithm to
effectively sample from these models. We also present preliminary empirical
results to illustrate our assertions.
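To make the two-step approach concrete, below is a minimal one-dimensional sketch: it fits a toy PSD model f(x) = Phi(x)^T A Phi(x), where Phi(x) = (k(x, x_1), ..., k(x, x_n)) for a Gaussian kernel k and A ⪰ 0, to pointwise evaluations of an unnormalized density via projected gradient descent, then samples from the fitted model. Plain rejection sampling stands in for the paper's actual sampling algorithm, and all settings (anchor grid, bandwidth, learning rate, toy density) are illustrative assumptions, not the authors' choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, centers, bw=0.3):
    # Phi(x) = (k(x, x_1), ..., k(x, x_n)) for a Gaussian kernel k.
    return np.exp(-(np.atleast_1d(x)[:, None] - centers[None, :]) ** 2 / (2 * bw**2))

def target(x):
    # Toy unnormalized density to approximate (bimodal; an assumption).
    return 0.6 * np.exp(-(x - 1.0) ** 2 / 0.08) + 0.4 * np.exp(-(x + 1.0) ** 2 / 0.18)

centers = np.linspace(-2.5, 2.5, 25)   # anchor points x_1 .. x_n
xs = np.linspace(-2.5, 2.5, 200)       # points where the density is evaluated
Phi, y = features(xs, centers), target(xs)

# Step 1: fit A >= 0 by projected gradient descent on the squared error
# sum_j (Phi_j^T A Phi_j - y_j)^2, projecting onto the PSD cone each step.
A = 0.01 * np.eye(len(centers))
for _ in range(5000):
    pred = np.einsum("ji,ik,jk->j", Phi, A, Phi)          # f(x_j)
    grad = np.einsum("j,ji,jk->ik", 2 * (pred - y), Phi, Phi)
    A -= 2e-4 * grad
    w, V = np.linalg.eigh(A)                              # PSD projection:
    A = (V * np.clip(w, 0.0, None)) @ V.T                 # clip negative eigenvalues

def model(x):
    P = features(x, centers)
    return np.einsum("ji,ik,jk->j", P, A, P)              # >= 0 by construction

# Step 2: sample from the fitted model. Simple rejection sampling is used
# here as a stand-in; it is NOT the paper's hierarchical algorithm.
M = 1.1 * model(xs).max()                                 # envelope constant
samples = []
while len(samples) < 1000:
    cand = rng.uniform(-2.5, 2.5, 256)
    keep = rng.uniform(0.0, M, 256) < model(cand)
    samples.extend(cand[keep])
samples = np.asarray(samples[:1000])
```

Note that non-negativity of the model is automatic from the PSD constraint, which is what makes the second (sampling) step well posed.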
Related papers
- Minimax Optimality of the Probability Flow ODE for Diffusion Models [8.15094483029656]
This work develops the first end-to-end theoretical framework for deterministic ODE-based samplers.
We propose a smooth regularized score estimator that simultaneously controls both the $L^2$ score error and the associated mean Jacobian error.
We demonstrate that the resulting sampler achieves the minimax rate in total variation distance, modulo logarithmic factors.
arXiv Detail & Related papers (2025-03-12T17:51:29Z)
- Rethinking Diffusion Model in High Dimension [0.0]
Diffusion models assume that they can learn the statistical properties of the underlying probability distribution.
This paper conducts a detailed analysis of the objective function and inference methods of diffusion models.
arXiv Detail & Related papers (2025-03-11T17:36:11Z)
- Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models.
We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
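The entry above combines pretrained score models with SMC; as a much simpler illustration of the underlying reweight/resample/move pattern, here is a generic tempered SMC sampler that anneals from a broad Gaussian to a bimodal target. The densities, temperature schedule, and Metropolis move are illustrative assumptions and are not the paper's score-based correctors.

```python
import numpy as np

rng = np.random.default_rng(3)

# Anneal from a broad Gaussian p0 to a bimodal target p1 through tempered
# intermediate densities p_b ∝ p0^(1-b) * p1^b (toy choices for illustration).
log_p0 = lambda x: -0.5 * x**2 / 4.0
log_p1 = lambda x: np.logaddexp(-0.5 * (x - 2) ** 2 / 0.1,
                                -0.5 * (x + 2) ** 2 / 0.1)

betas = np.linspace(0.0, 1.0, 30)
N = 2000
x = 2.0 * rng.standard_normal(N)          # particles drawn from p0
logw = np.zeros(N)

for b_prev, b in zip(betas[:-1], betas[1:]):
    # Reweight: incremental importance weight p_b(x) / p_{b_prev}(x).
    logw += (b - b_prev) * (log_p1(x) - log_p0(x))
    # Resample (multinomial) and reset the weights.
    w = np.exp(logw - logw.max()); w /= w.sum()
    x = x[rng.choice(N, N, p=w)]
    logw = np.zeros(N)
    # Move: a few random-walk Metropolis steps targeting the current p_b.
    log_pb = lambda z: (1 - b) * log_p0(z) + b * log_p1(z)
    for _ in range(5):
        prop = x + 0.3 * rng.standard_normal(N)
        acc = np.log(rng.uniform(size=N)) < log_pb(prop) - log_pb(x)
        x = np.where(acc, prop, x)
```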
- Mixture models for data with unknown distributions [0.6345523830122168]
We describe and analyze a broad class of mixture models for real-valued multivariate data.
We return both a division of the data and an estimate of the distributions, effectively performing clustering and density estimation within each cluster at the same time.
We demonstrate our methods with a selection of illustrative applications and give code implementing both algorithms.
arXiv Detail & Related papers (2025-02-26T22:42:40Z)
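The mixture-model entry above returns a clustering together with per-cluster density estimates; as a baseline point of reference, a bare-bones EM algorithm for a two-component one-dimensional Gaussian mixture does the same on toy data. This is textbook EM under a Gaussian assumption, not the paper's more general method for unknown component distributions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two overlapping Gaussian clusters (an assumption for illustration).
x = np.concatenate([rng.normal(-2.0, 0.7, 300), rng.normal(1.5, 1.0, 200)])

# Initialize mixture weights, means, and variances.
weights, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(200):
    # E-step: responsibilities r[n, k] = P(cluster k | x_n).
    logp = (-0.5 * (x[:, None] - mu) ** 2 / var
            - 0.5 * np.log(2 * np.pi * var) + np.log(weights))
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances from the responsibilities.
    nk = r.sum(axis=0)
    weights = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk

labels = r.argmax(axis=1)   # hard clustering; (weights, mu, var) give the densities
```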
- Accelerated Diffusion Models via Speculative Sampling [89.43940130493233]
Speculative sampling is a popular technique for accelerating inference in Large Language Models.
We extend speculative sampling to diffusion models, which generate samples via continuous, vector-valued Markov chains.
We propose various drafting strategies, including a simple and effective approach that does not require training a draft model.
arXiv Detail & Related papers (2025-01-09T16:50:16Z)
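For intuition about the entry above, here is the token-level accept/reject rule from LLM speculative sampling, which the paper adapts to continuous diffusion chains: accept a draft sample with probability min(1, p(x)/q(x)), otherwise draw from the normalized residual (p − q)₊. The distributions below are made up for illustration; the output is distributed exactly according to the target p.

```python
import numpy as np

rng = np.random.default_rng(2)

p = np.array([0.1, 0.5, 0.2, 0.2])       # target distribution (illustrative)
q = np.array([0.25, 0.25, 0.25, 0.25])   # cheap draft distribution (illustrative)

def speculative_draw():
    x = rng.choice(4, p=q)                    # draft proposes x ~ q
    if rng.uniform() < min(1.0, p[x] / q[x]):
        return x                              # accept the draft sample
    resid = np.clip(p - q, 0.0, None)         # otherwise sample the residual
    return rng.choice(4, p=resid / resid.sum())

draws = np.array([speculative_draw() for _ in range(100_000)])
print(np.bincount(draws) / len(draws))        # empirically ≈ p
```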
- Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers [49.1574468325115]
We introduce a unified convergence analysis framework for deterministic samplers.
Our framework achieves an iteration complexity of $\tilde{O}(d^2/\epsilon)$.
We also provide a detailed analysis of Denoising Diffusion Implicit Models (DDIM)-type samplers.
arXiv Detail & Related papers (2024-10-18T07:37:36Z)
- Modelling Sampling Distributions of Test Statistics with Autograd [0.0]
We explore whether this autograd-based approach to modeling conditional one-dimensional sampling distributions is a viable alternative to the probability density-ratio method.
Relatively simple, yet effective, neural network models are used whose predictive uncertainty is quantified through a variety of methods.
arXiv Detail & Related papers (2024-05-03T21:34:12Z)
- Training Implicit Generative Models via an Invariant Statistical Loss [3.139474253994318]
Implicit generative models have the capability to learn arbitrary complex data distributions.
On the downside, training requires distinguishing real data from artificially generated data using adversarial discriminators.
We develop a discriminator-free method for training one-dimensional (1D) generative implicit models.
arXiv Detail & Related papers (2024-02-26T09:32:28Z)
- PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation [8.527898482146103]
We propose a comprehensive sample-based method for assessing the quality of generative models.
The proposed approach enables the estimation of the probability that two sets of samples are drawn from the same distribution.
arXiv Detail & Related papers (2024-02-06T19:39:26Z)
- Learning Multivariate CDFs and Copulas using Tensor Factorization [39.24470798045442]
Learning the multivariate distribution of data is a core challenge in statistics and machine learning.
In this work, we aim to learn multivariate cumulative distribution functions (CDFs), as they can handle mixed random variables.
We show that any grid sampled version of a joint CDF of mixed random variables admits a universal representation as a naive Bayes model.
We demonstrate the superior performance of the proposed model in several synthetic and real datasets and applications including regression, sampling and data imputation.
arXiv Detail & Related papers (2022-10-13T16:18:46Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
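As a reference point for the entry above, here is a plain bootstrap particle filter on the classic univariate nonlinear growth model; the paper learns the sampling (proposal) distribution by unrolling, whereas this sketch simply propagates through the known dynamics. The model and noise levels are the standard textbook choices, assumed here for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Classic nonlinear growth model:
#   x_t = 0.5 x_{t-1} + 25 x_{t-1}/(1 + x_{t-1}^2) + 8 cos(1.2 t) + v_t
#   y_t = x_t^2 / 20 + e_t
T, N = 50, 1000
sig_v, sig_e = np.sqrt(10.0), 1.0

def f(x, t):
    return 0.5 * x + 25 * x / (1 + x**2) + 8 * np.cos(1.2 * t)

# Simulate a ground-truth trajectory and its observations.
x_true, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x_true[t] = f(x_true[t - 1], t) + sig_v * rng.standard_normal()
    y[t] = x_true[t] ** 2 / 20 + sig_e * rng.standard_normal()

# Bootstrap particle filter: propagate, weight by likelihood, resample.
particles = rng.standard_normal(N)
est = np.zeros(T)
for t in range(1, T):
    particles = f(particles, t) + sig_v * rng.standard_normal(N)    # propagate
    logw = -0.5 * (y[t] - particles**2 / 20) ** 2 / sig_e**2         # weight
    w = np.exp(logw - logw.max()); w /= w.sum()
    est[t] = w @ particles                                           # posterior mean
    particles = particles[rng.choice(N, N, p=w)]                     # resample
```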
- PSD Representations for Effective Probability Models [117.35298398434628]
We show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly well suited to building effective probability models.
We characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees.
Our results open the way to applications of PSD models to density estimation, decision theory and inference.
arXiv Detail & Related papers (2021-06-30T15:13:39Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
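The last entry's decoupled sampling builds on pathwise conditioning (Matheron's rule): draw a prior function, then add a data-dependent correction. The sketch below follows that recipe with a random-Fourier-feature prior and an exact kernel correction; the lengthscale, noise level, and data are illustrative assumptions, and the paper's full method includes further refinements.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data and an RBF kernel (lengthscale and noise are illustrative choices).
ls, noise = 0.5, 0.1
X = rng.uniform(-3, 3, (20, 1))
y = np.sin(2 * X[:, 0]) + noise * rng.standard_normal(20)
Xs = np.linspace(-3, 3, 200)[:, None]

def k(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls**2))

# Random Fourier features approximating the RBF prior.
F = 500
W = rng.standard_normal((F, 1)) / ls
b = rng.uniform(0, 2 * np.pi, F)
phi = lambda Z: np.sqrt(2.0 / F) * np.cos(Z @ W.T + b)

w = rng.standard_normal(F)                    # one prior sample f(.) = phi(.) @ w
f_prior_X, f_prior_Xs = phi(X) @ w, phi(Xs) @ w

# Pathwise (Matheron) update: condition the prior sample on the data,
#   (f | y)(.) = f(.) + k(., X) (K + s^2 I)^{-1} (y - f(X) - eps).
Kxx = k(X, X) + noise**2 * np.eye(len(X))
eps = noise * rng.standard_normal(len(X))
v = np.linalg.solve(Kxx, y - f_prior_X - eps)
f_post_Xs = f_prior_Xs + k(Xs, X) @ v         # one posterior sample path at Xs
```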
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.