Related papers: A likelihood approach to nonparametric estimation of a singular distribution using deep generative models

URL: http://arxiv.org/abs/2105.04046v3
Date: Tue, 28 Mar 2023 10:19:58 GMT
Title: A likelihood approach to nonparametric estimation of a singular distribution using deep generative models
Authors: Minwoo Chae, Dongha Kim, Yongdai Kim, Lizhen Lin
Abstract summary: We investigate a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. We prove that a novel and effective solution exists by perturbing the data with an instance noise. We also characterize the class of distributions that can be efficiently estimated via deep generative models.
Score: 4.329951775163721
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We investigate statistical properties of a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. More specifically, a deep generative model is used to model high-dimensional data that are assumed to concentrate around some low-dimensional structure. Estimating the distribution supported on this low-dimensional structure, such as a low-dimensional manifold, is challenging due to its singularity with respect to the Lebesgue measure in the ambient space. In the considered model, a usual likelihood approach can fail to estimate the target distribution consistently due to the singularity. We prove that a novel and effective solution exists by perturbing the data with an instance noise, which leads to consistent estimation of the underlying distribution with desirable convergence rates. We also characterize the class of distributions that can be efficiently estimated via deep generative models. This class is sufficiently general to contain various structured distributions such as product distributions, classically smooth distributions and distributions supported on a low-dimensional manifold. Our analysis provides some insights on how deep generative models can avoid the curse of dimensionality for nonparametric distribution estimation. We conduct a thorough simulation study and real data analysis to empirically demonstrate that the proposed data perturbation technique improves the estimation performance significantly.
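The data perturbation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the unit circle as the low-dimensional structure, the Gaussian form of the noise, the noise level `sigma=0.05`, and the helper names `sample_circle` and `perturb` are all assumptions chosen for illustration. The sketch draws points from a distribution that is singular in the plane, then adds independent noise to each observation so that the perturbed data have a density with respect to the Lebesgue measure.

```python
import math
import random


def sample_circle(n, rng):
    """Draw n points uniformly from the unit circle in R^2.

    The circle is a one-dimensional manifold, so this distribution is
    singular with respect to the Lebesgue measure on the plane.
    """
    points = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((math.cos(theta), math.sin(theta)))
    return points


def perturb(points, sigma, rng):
    """Add independent Gaussian noise (std sigma) to every coordinate.

    Assumption: Gaussian noise with a fixed, hand-picked sigma. How the
    noise level should be chosen is not stated in the abstract.
    """
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]


rng = random.Random(0)
data = sample_circle(1000, rng)              # lies exactly on the circle
noisy = perturb(data, sigma=0.05, rng=rng)   # spread around the circle
```

A likelihood-based generative model would then be fitted to `noisy` rather than to `data`. The fitting step is omitted here because the abstract does not specify it.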

Related papers

Model averaging in the space of probability distributions [0.0]
We study aggregation schemes in the space of probability distributions metrized in terms of the Wasserstein distance. We employ regularization schemes motivated by the standard elastic net penalization, which is shown to consistently yield models enjoying sparsity properties. The proposed approach is applied to a real-world dataset of insurance losses to estimate the claim size distribution and the associated tail risk.
arXiv Detail & Related papers (2025-07-15T20:41:57Z)
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models [6.647819824559201]
We study the large-sample properties of a likelihood-based approach for estimating conditional deep generative models. Our results lead to the convergence rate of a sieve maximum likelihood estimator for estimating the conditional distribution.
arXiv Detail & Related papers (2024-10-02T20:46:21Z)
Inflationary Flows: Calibrated Bayesian Inference with Diffusion-Based Models [0.0]
We show how diffusion-based models can be repurposed for performing principled, identifiable Bayesian inference. We show how such maps can be learned via standard DBM training using a novel noise schedule. The result is a class of highly expressive generative models, uniquely defined on a low-dimensional latent space.
arXiv Detail & Related papers (2024-07-11T19:58:19Z)
Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop. We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models. We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
Empirical Density Estimation based on Spline Quasi-Interpolation with applications to Copulas clustering modeling [0.0]
Density estimation is a fundamental technique employed in various fields to model and to understand the underlying distribution of data. In this paper we propose the mono-variate approximation of the density using quasi-interpolation. The presented algorithm is validated on artificial and real datasets.
arXiv Detail & Related papers (2024-02-18T11:49:38Z)
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation [8.527898482146103]
We propose a comprehensive sample-based method for assessing the quality of generative models. The proposed approach enables the estimation of the probability that two sets of samples are drawn from the same distribution.
arXiv Detail & Related papers (2024-02-06T19:39:26Z)
Diffusion Models are Minimax Optimal Distribution Estimators [49.47503258639454]
We provide the first rigorous analysis on approximation and generalization abilities of diffusion modeling. We show that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates.
arXiv Detail & Related papers (2023-03-03T11:31:55Z)
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region. Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model. We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers [1.6114012813668934]
The paper presents a novel Importance Sampling (IS) scheme for estimating the distribution of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc.
arXiv Detail & Related papers (2021-02-14T03:37:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.