Nearest Neighbor Dirichlet Mixtures
- URL: http://arxiv.org/abs/2003.07953v3
- Date: Thu, 17 Feb 2022 00:11:15 GMT
- Title: Nearest Neighbor Dirichlet Mixtures
- Authors: Shounak Chattopadhyay, Antik Chakraborty, David B. Dunson
- Abstract summary: We propose a class of nearest neighbor-Dirichlet mixtures to maintain most of the strengths of Bayesian approaches without the computational disadvantages.
A simple and embarrassingly parallel Monte Carlo algorithm is proposed to sample from the resulting pseudo-posterior for the unknown density.
- Score: 3.3194866396158
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: There is a rich literature on Bayesian methods for density estimation, which characterize the unknown density as a mixture of kernels. Such methods have advantages in terms of providing uncertainty quantification in estimation, while being adaptive to a rich variety of densities. However, relative to frequentist locally adaptive kernel methods, Bayesian approaches can be slow and unstable to implement, as they rely on Markov chain Monte Carlo algorithms. To maintain most of the strengths of Bayesian approaches without the computational disadvantages, we propose a class of nearest neighbor-Dirichlet mixtures. The approach starts by grouping the data into neighborhoods based on standard algorithms. Within each neighborhood, the density is characterized via a Bayesian parametric model, such as a Gaussian with unknown parameters. Assigning a Dirichlet prior to the weights on these local kernels, we obtain a pseudo-posterior for the weights and kernel parameters. A simple and embarrassingly parallel Monte Carlo algorithm is proposed to sample from the resulting pseudo-posterior for the unknown density. Desirable asymptotic properties are shown, and the methods are evaluated in simulation studies and applied to a motivating data set in the context of classification.
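The construction in the abstract admits a compact implementation. The sketch below is a minimal one-dimensional version, assuming Gaussian kernels with conjugate Normal-Inverse-Gamma updates within each neighborhood and a simplified Dirichlet weight update; the function and hyperparameter names (`gamma`, `m0`, `kappa0`, `a0`, `b0`) and their defaults are illustrative, not taken from the authors' code.

```python
import numpy as np

def nn_dirichlet_mixture_draws(x, k=10, n_draws=200, gamma=1.0,
                               m0=0.0, kappa0=0.01, a0=1.0, b0=1.0,
                               rng=None):
    """Monte Carlo draws from a simplified NN-Dirichlet pseudo-posterior.

    Each observation anchors a neighborhood of its k nearest neighbors
    (itself included); a Normal model with a conjugate
    Normal-Inverse-Gamma prior is fit within each neighborhood, and the
    mixture weights receive a Dirichlet prior with concentration gamma.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Neighborhoods: indices of the k nearest neighbors of each point.
    nbrs = np.argsort(np.abs(x[:, None] - x[None, :]), axis=1)[:, :k]

    # Closed-form Normal-Inverse-Gamma posterior per neighborhood.
    post = []
    for j in range(n):
        y = x[nbrs[j]]
        ybar, ss = y.mean(), ((y - y.mean()) ** 2).sum()
        kap_n = kappa0 + k
        m_n = (kappa0 * m0 + k * ybar) / kap_n
        a_n = a0 + k / 2.0
        b_n = b0 + 0.5 * ss + kappa0 * k * (ybar - m0) ** 2 / (2.0 * kap_n)
        post.append((m_n, kap_n, a_n, b_n))

    draws = []  # each draw: (weights, means, variances)
    for _ in range(n_draws):  # independent draws: embarrassingly parallel
        w = rng.dirichlet(np.full(n, gamma + k))  # simplified weight update
        mu, s2 = np.empty(n), np.empty(n)
        for j, (m_n, kap_n, a_n, b_n) in enumerate(post):
            s2[j] = b_n / rng.gamma(a_n)  # Inverse-Gamma(a_n, b_n) draw
            mu[j] = rng.normal(m_n, np.sqrt(s2[j] / kap_n))
        draws.append((w, mu, s2))
    return draws

def density_draws(draws, grid):
    """Evaluate each sampled mixture density on a grid of points."""
    g = np.asarray(grid)[:, None]
    return np.array([
        (w * np.exp(-0.5 * (g - mu) ** 2 / s2)
           / np.sqrt(2 * np.pi * s2)).sum(axis=1)
        for w, mu, s2 in draws
    ])
```

Each Monte Carlo draw is independent of the others, so the sampling loop parallelizes trivially, and pointwise quantiles of the density draws on a grid give uncertainty bands around the estimate.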
Related papers
- A quasi-Bayesian sequential approach to deconvolution density estimation [7.10052009802944]
Density deconvolution addresses the estimation of the unknown density function $f$ of a random signal from data observed with additive noise.
We consider the problem of density deconvolution in a streaming or online setting where noisy data arrive progressively.
By relying on a quasi-Bayesian sequential approach, we obtain estimates of $f$ that are of easy evaluation.
arXiv Detail & Related papers (2024-08-26T16:40:04Z) - Sequential transport maps using SoS density estimation and $α$-divergences [0.5999777817331317]
Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density.
We provide a new convergence analysis of the sequential transport maps based on information-geometric properties of $α$-divergences.
We numerically demonstrate our methods on Bayesian inference problems and unsupervised learning tasks.
arXiv Detail & Related papers (2024-02-27T23:52:58Z) - Sobolev Space Regularised Pre Density Models [51.558848491038916]
We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density.
This method is statistically consistent and makes the model's inductive bias clear and interpretable.
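Concretely, one illustrative way to write such an objective (an assumed generic form for exposition, not necessarily the paper's exact formulation) is

$$\hat{p} = \arg\min_{p \geq 0,\ \int p = 1} \; -\frac{1}{n}\sum_{i=1}^{n} \log p(x_i) + \lambda\, \lVert p \rVert_{H^s}^2,$$

where the Sobolev norm $\lVert p \rVert_{H^s}$ penalizes derivatives of $p$ up to order $s$ and $\lambda > 0$ trades data fit against smoothness.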
arXiv Detail & Related papers (2023-07-25T18:47:53Z) - Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling [8.271859911016719]
We develop tools for robust inference under high-dimensional noise.
We show that our approach is robust to variability in technical noise levels across cell types.
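Doubly stochastic scaling of a kernel matrix can be computed with symmetric Sinkhorn-Knopp iterations. The sketch below is a generic version for a fixed-bandwidth Gaussian affinity, a stand-in for the paper's setting; the function and parameter names are assumptions.

```python
import numpy as np

def doubly_stochastic(X, bandwidth=1.0, n_iter=1000, tol=1e-8):
    """Symmetric Sinkhorn-Knopp scaling of a Gaussian affinity matrix.

    Finds positive factors d so that W = diag(d) K diag(d) has unit
    row and column sums (doubly stochastic, up to the tolerance).
    """
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * bandwidth ** 2))
    d = np.ones(len(X))
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))  # damped symmetric Sinkhorn update
        if np.max(np.abs(d * (K @ d) - 1.0)) < tol:
            break
    return d[:, None] * K * d[None, :]
```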
arXiv Detail & Related papers (2022-09-16T15:39:11Z) - A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
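The divide-and-conquer idea rests on a telescoping identity: the log density ratio between two distant distributions is the sum of log ratios between adjacent bridge distributions. A toy check with known Gaussian bridges (endpoints and bridge path chosen here purely for illustration):

```python
import numpy as np
from scipy.stats import norm

# Endpoints p0 = N(0, 1) and p1 = N(6, 1) barely overlap, so a single
# direct density-ratio estimate from samples would be unstable; the
# bridges N(6t, 1) for t in [0, 1] interpolate between them.
ts = np.linspace(0.0, 1.0, 11)           # 10 pairs of adjacent bridges
bridges = [norm(6.0 * t, 1.0) for t in ts]
x = 1.5

# Telescoping sum of easy adjacent log ratios ...
telescoped = sum(bridges[i + 1].logpdf(x) - bridges[i].logpdf(x)
                 for i in range(len(bridges) - 1))
# ... recovers the hard direct log ratio log p1(x) - log p0(x).
direct = bridges[-1].logpdf(x) - bridges[0].logpdf(x)
assert np.isclose(telescoped, direct)
```

In DRE-infty the adjacent ratios are learned from samples rather than evaluated from known densities; the telescoping identity is what makes the decomposition exact.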
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Density-Based Clustering with Kernel Diffusion [59.4179549482505]
A naive density corresponding to the indicator function of a unit $d$-dimensional Euclidean ball is commonly used in density-based clustering algorithms.
We propose a new kernel diffusion density function, which is adaptive to data of varying local distributional characteristics and smoothness.
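The "naive density" mentioned above simply counts neighbors inside a fixed-radius ball. A minimal sketch contrasting it with a smooth fixed-bandwidth Gaussian alternative (a stand-in to show the contrast, not the paper's adaptive diffusion kernel):

```python
import numpy as np
from math import gamma, pi

def naive_ball_density(X, x0, r=1.0):
    """Indicator-kernel estimate: the fraction of points inside the
    radius-r Euclidean ball at x0, divided by the ball's volume."""
    d = X.shape[1]
    vol = pi ** (d / 2) / gamma(d / 2 + 1) * r ** d
    inside = (((X - x0) ** 2).sum(axis=1) <= r ** 2).mean()
    return inside / vol

def gaussian_kernel_density(X, x0, h=1.0):
    """Smooth fixed-bandwidth Gaussian-kernel estimate at x0."""
    d = X.shape[1]
    sq = ((X - x0) ** 2).sum(axis=1)
    return np.exp(-sq / (2 * h ** 2)).mean() / (2 * pi * h ** 2) ** (d / 2)
```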
arXiv Detail & Related papers (2021-10-11T09:00:33Z) - Generative Learning With Euler Particle Transport [14.557451744544592]
We propose an Euler particle transport (EPT) approach for generative learning.
The proposed approach is motivated by the problem of finding an optimal transport map from a reference distribution to a target distribution.
We show that the proposed density-ratio (difference) estimators do not suffer from the "curse of dimensionality" if data is supported on a lower-dimensional manifold.
arXiv Detail & Related papers (2020-12-11T03:10:53Z) - Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
arXiv Detail & Related papers (2020-11-18T16:40:45Z) - Pathwise Conditioning of Gaussian Processes [72.61885354624604]
Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations.
This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector.
We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors.
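The pathwise view rests on Matheron's rule: a posterior sample is a joint prior draw at training and test inputs plus a data-dependent correction. A minimal dense-kernel sketch follows; the paper's point is that the prior draw and the update can each be approximated cheaply (e.g. with random features and inducing points), which this exact-matrix version does not attempt. Function and parameter names here are assumptions.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel matrix between row-vector inputs."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / ls ** 2)

def pathwise_posterior_sample(X, y, Xs, noise=0.1, ls=1.0, rng=None):
    """One GP posterior sample at test inputs Xs via Matheron's rule:
    f*|y = f* + K(Xs,X) (K(X,X) + s^2 I)^{-1} (y - f(X) - eps)."""
    rng = np.random.default_rng() if rng is None else rng
    n, m = len(X), len(Xs)
    Z = np.vstack([X, Xs])
    Kzz = rbf(Z, Z, ls) + 1e-9 * np.eye(n + m)   # jitter for stability
    f = rng.multivariate_normal(np.zeros(n + m), Kzz)  # joint prior draw
    f_train, f_test = f[:n], f[n:]
    eps = rng.normal(0.0, noise, size=n)  # simulated observation noise
    A = rbf(X, X, ls) + noise ** 2 * np.eye(n)
    update = rbf(Xs, X, ls) @ np.linalg.solve(A, y - f_train - eps)
    return f_test + update
```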
arXiv Detail & Related papers (2020-11-08T17:09:37Z) - A fast and efficient Modal EM algorithm for Gaussian mixtures [0.0]
In the modal approach to clustering, clusters are defined as the local maxima of the underlying probability density function.
The Modal EM algorithm is an iterative procedure that can identify the local maxima of any density function.
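For a Gaussian mixture, each Modal EM iteration has a closed form: an E-step computes responsibilities at the current point, and an M-step moves the point to a precision-weighted mean, which never decreases the density. A minimal sketch under this standard formulation (names and defaults are illustrative):

```python
import numpy as np

def modal_em_step(x, weights, means, covs):
    """One Modal EM ascent step on a Gaussian mixture density.

    E-step: responsibilities q_k proportional to pi_k * N(x; mu_k, Sigma_k).
    M-step: maximize sum_k q_k * log N(x; mu_k, Sigma_k) over x, whose
    solution for Gaussian components is a precision-weighted mean.
    """
    d = len(x)
    precs = [np.linalg.inv(C) for C in covs]
    logq = np.array([
        np.log(pk) - 0.5 * (np.log(np.linalg.det(C))
                            + (x - mu) @ P @ (x - mu)
                            + d * np.log(2 * np.pi))
        for pk, mu, C, P in zip(weights, means, covs, precs)
    ])
    q = np.exp(logq - logq.max())
    q /= q.sum()
    A = sum(qk * P for qk, P in zip(q, precs))
    b = sum(qk * (P @ mu) for qk, P, mu in zip(q, precs, means))
    return np.linalg.solve(A, b)

def find_mode(x0, weights, means, covs, tol=1e-10, max_iter=500):
    """Iterate Modal EM steps from x0 until convergence to a local mode."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = modal_em_step(x, weights, means, covs)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```

Running `find_mode` from each data point and merging points that converge to the same maximum yields the modal clustering the entry describes.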
arXiv Detail & Related papers (2020-02-10T08:34:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.