Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling
- URL: http://arxiv.org/abs/2303.00570v1
- Date: Wed, 1 Mar 2023 15:16:03 GMT
- Title: Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling
- Authors: Ye He, Tyler Farghly, Krishnakumar Balasubramanian, Murat A. Erdogdu
- Abstract summary: We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
- Score: 17.415391025051434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze the complexity of sampling from a class of heavy-tailed
distributions by discretizing a natural class of Itô diffusions associated
with weighted Poincaré inequalities. Based on a mean-square analysis, we
establish the iteration complexity for obtaining a sample whose distribution is
$\epsilon$-close to the target distribution in the Wasserstein-2 metric. In
this paper, our results take the mean-square analysis to its limits, i.e., we
invariably only require that the target density has finite variance, the
minimal requirement for a mean-square analysis. To obtain explicit estimates,
we compute upper bounds on certain moments associated with heavy-tailed targets
under various assumptions. We also provide similar iteration complexity results
for the case where only function evaluations of the unnormalized target density
are available by estimating the gradients using a Gaussian smoothing technique.
We provide illustrative examples based on the multivariate $t$-distribution.
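The paper's exact diffusions rely on weighted Poincaré structure and are given in the full text; as a simplified, hypothetical sketch (not the authors' scheme), the two ingredients of the abstract can be illustrated with a plain Euler-Maruyama discretization of a Langevin-type Itô diffusion targeting a standard multivariate $t$-distribution, plus a Gaussian-smoothing (zeroth-order) gradient estimate for the case where only density evaluations are available. All function names and parameters below are illustrative assumptions.

```python
import numpy as np

def grad_log_t(x, nu):
    """Score (gradient of the log-density) of a standard multivariate
    t-distribution with nu degrees of freedom."""
    d = x.shape[0]
    return -(nu + d) * x / (nu + x @ x)

def smoothed_grad(log_density, x, mu=1e-2, n_dirs=20, rng=None):
    """Gaussian-smoothing (zeroth-order) gradient estimate, usable when
    only evaluations of the unnormalized log-density are available."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (log_density(x + mu * u) - log_density(x)) / mu * u
    return g / n_dirs

def euler_maruyama(score, x0, step, n_steps, rng=None):
    """Euler-Maruyama discretization of dX_t = score(X_t) dt + sqrt(2) dW_t."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

nu, d = 5.0, 2
rng = np.random.default_rng(0)
score = lambda x: grad_log_t(x, nu)
samples = np.stack([euler_maruyama(score, np.zeros(d), 0.05, 500, rng)
                    for _ in range(200)])
print(samples.shape)  # (200, 2)
```

Note that this constant-coefficient Langevin scheme is known to mix slowly on heavy tails; the paper's point is precisely that weighted diffusions adapted to the target avoid this, so the sketch only fixes notation, not performance.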
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
- Classification of Heavy-tailed Features in High Dimensions: a Superstatistical Approach [1.4469725791865984]
We characterise the learning of a mixture of two clouds of data points with generic centroids.
We study the generalisation performance of the obtained estimator, we analyse the role of regularisation, and we analytically characterise the separability transition.
arXiv Detail & Related papers (2023-04-06T07:53:05Z)
- Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions [42.6763105645717]
Given a small number of corrupted samples, the goal is to efficiently compute a hypothesis that accurately approximates $\mu$ with high probability.
Our algorithm achieves the optimal error using a number of samples scaling logarithmically with the ambient dimension.
Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.
arXiv Detail & Related papers (2022-11-29T16:13:50Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Optimal 1-Wasserstein Distance for WGANs [2.1174215880331775]
We provide a thorough analysis of Wasserstein GANs (WGANs) in both the finite-sample and asymptotic regimes.
We derive in passing new results on optimal transport theory in the semi-discrete setting.
arXiv Detail & Related papers (2022-01-08T13:04:03Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Relative Entropy Gradient Sampler for Unnormalized Distributions [14.060615420986796]
We propose the relative entropy gradient sampler (REGS) for sampling from unnormalized distributions.
REGS is a particle method that seeks a sequence of simple nonlinear transforms iteratively pushing the initial samples from a reference distribution into the samples from an unnormalized target distribution.
arXiv Detail & Related papers (2021-10-06T14:10:38Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor distances between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms [43.134933182911766]
We develop an analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem.
We identify optimal sampling probabilities based on the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE).
Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.
arXiv Detail & Related papers (2020-02-24T20:34:50Z)
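The "Minimax Optimal Estimation of KL Divergence" entry above refers to k-nearest-neighbor estimators. As a minimal brute-force sketch in the style of classical k-NN divergence estimators (not that paper's exact estimator; names and parameters are illustrative):

```python
import numpy as np

def knn_kl_divergence(x, y, k=1):
    """k-NN estimate of KL(p || q) from samples x ~ p of shape (n, d)
    and y ~ q of shape (m, d), using brute-force pairwise distances."""
    n, d = x.shape
    m = y.shape[0]
    # k-th nearest-neighbor distance within x (excluding the point itself)
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dxx, np.inf)
    rho = np.sort(dxx, axis=1)[:, k - 1]
    # k-th nearest-neighbor distance from each x_i to the y sample
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    nu = np.sort(dxy, axis=1)[:, k - 1]
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(0)
p = rng.standard_normal((500, 2))        # samples from N(0, I)
q = rng.standard_normal((500, 2)) + 1.0  # samples from N((1,1), I)
# true KL here is 0.5 * ||mean shift||^2 = 1.0
est = knn_kl_divergence(p, q, k=5)
print(float(est))
```

Larger k trades variance for bias; a k-d tree (e.g. `scipy.spatial.cKDTree`) replaces the O(n^2) distance matrices for larger samples.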
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.