A quasi-Bayesian sequential approach to deconvolution density estimation
- URL: http://arxiv.org/abs/2408.14402v1
- Date: Mon, 26 Aug 2024 16:40:04 GMT
- Title: A quasi-Bayesian sequential approach to deconvolution density estimation
- Authors: Stefano Favaro, Sandra Fortini
- Abstract summary: Density deconvolution addresses the estimation of the unknown density function $f$ of a random signal from data.
We consider the problem of density deconvolution in a streaming or online setting where noisy data arrive progressively.
By relying on a quasi-Bayesian sequential approach, we obtain estimates of $f$ that are of easy evaluation.
- Score: 7.10052009802944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Density deconvolution addresses the estimation of the unknown (probability) density function $f$ of a random signal from data that are observed with an independent additive random noise. This is a classical problem in statistics, for which frequentist and Bayesian nonparametric approaches are available to deal with static or batch data. In this paper, we consider the problem of density deconvolution in a streaming or online setting where noisy data arrive progressively, with no predetermined sample size, and we develop a sequential nonparametric approach to estimate $f$. By relying on a quasi-Bayesian sequential approach, often referred to as Newton's algorithm, we obtain estimates of $f$ that are of easy evaluation, computationally efficient, and with a computational cost that remains constant as the amount of data increases, which is critical in the streaming setting. Large sample asymptotic properties of the proposed estimates are studied, yielding provable guarantees with respect to the estimation of $f$ at a point (local) and on an interval (uniform). In particular, we establish local and uniform central limit theorems, providing corresponding asymptotic credible intervals and bands. We validate empirically our methods on synthetic and real data, by considering the common setting of Laplace and Gaussian noise distributions, and make a comparison with respect to the kernel-based approach and a Bayesian nonparametric approach with a Dirichlet process mixture prior.
Related papers
- Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal.
Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks.
Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z) - Linear cost and exponentially convergent approximation of Gaussian Matérn processes on intervals [43.341057405337295]
The computational cost of inference and prediction for statistical models based on Gaussian processes scales cubically with the number of observations.
We develop a method with linear cost and with a covariance error that decreases exponentially fast in the order $m$ of the proposed approximation.
The method is based on an optimal rational approximation of the spectral density and results in an approximation that can be represented as a sum of $m$ independent Markov processes.
arXiv Detail & Related papers (2024-10-16T19:57:15Z) - O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions [6.76974373198208]
We establish a fast convergence theory for the denoising diffusion probabilistic model (DDPM) under minimal assumptions.
We show that the convergence rate improves to $O(k/T)$, where $k$ is the intrinsic dimension of the target data distribution.
This highlights the ability of DDPM to automatically adapt to unknown low-dimensional structures.
arXiv Detail & Related papers (2024-09-27T17:59:10Z) - Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z) - On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under strongly log-concave data.
Our class of functions used for score estimation consists of Lipschitz continuous functions, avoiding any Lipschitzness assumption on the score function itself.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z) - Sobolev Space Regularised Pre Density Models [51.558848491038916]
We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density.
This method is statistically consistent, and makes the inductive validation model clear and consistent.
arXiv Detail & Related papers (2023-07-25T18:47:53Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Statistical Inference with Stochastic Gradient Methods under $\phi$-mixing Data [9.77185962310918]
We propose a mini-batch SGD estimator for statistical inference when the data is $\phi$-mixing.
The confidence intervals are constructed using an associated mini-batch SGD procedure.
The proposed method is memory-efficient and easy to implement in practice.
arXiv Detail & Related papers (2023-02-24T16:16:43Z) - Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results in both the asymptotic and finite-sample regimes.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling [8.271859911016719]
We develop tools for robust inference under high-dimensional noise.
We show that our approach is robust to variability in technical noise levels across cell types.
arXiv Detail & Related papers (2022-09-16T15:39:11Z) - Convergence for score-based generative modeling with polynomial complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector gives better convergence than using either portion alone.
arXiv Detail & Related papers (2022-06-13T14:57:35Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - A Non-Classical Parameterization for Density Estimation Using Sample Moments [0.0]
We propose a non-classical parametrization for density estimation using sample moments.
The proposed estimator is the first one in the literature for which the power moments up to an arbitrary even order exactly match the sample moments.
arXiv Detail & Related papers (2022-01-13T04:28:52Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
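As a hedged illustration of the clipping idea (not the paper's exact algorithm), the sketch below runs clipped SGD for streaming mean estimation under heavy-tailed noise; the clipping threshold, step-size schedule, and function name are all illustrative assumptions.

```python
import numpy as np

def clipped_sgd_mean(stream, clip=2.0, step0=1.0):
    """Estimate a mean from streaming samples, clipping each gradient so
    a single heavy-tailed observation cannot derail the iterate."""
    theta = 0.0
    for t, x in enumerate(stream, start=1):
        grad = theta - x                    # gradient of 0.5 * (theta - x)^2
        grad = np.clip(grad, -clip, clip)   # bound the per-step influence
        theta -= (step0 / t) * grad         # decaying step size
    return theta

rng = np.random.default_rng(1)
# Heavy-tailed stream: Student-t with 2.5 degrees of freedom, shifted to mean 3.
samples = 3.0 + rng.standard_t(df=2.5, size=50_000)
est = clipped_sgd_mean(samples)
```

Because the clipped gradient is bounded, each observation moves the iterate by at most `step0 * clip / t`, which is what makes the estimator stable despite infinite higher moments in the noise.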
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Denoising Score Matching with Random Fourier Features [11.60130641443281]
We derive an analytical expression for denoising score matching using the kernel exponential family as the model distribution.
The obtained expression explicitly depends on the noise variance, so the validation loss can be straightforwardly used to tune the noise level.
arXiv Detail & Related papers (2021-01-13T18:02:39Z) - Nearest Neighbor Dirichlet Mixtures [3.3194866396158]
We propose a class of nearest-neighbor Dirichlet mixtures to maintain most of the strengths of Bayesian approaches without the computational disadvantages.
A simple and embarrassingly parallel Monte Carlo algorithm is proposed to sample from the resulting pseudo-posterior for the unknown density.
arXiv Detail & Related papers (2020-03-17T21:39:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.