A Bayesian approach to learning mixtures of nonparametric components
- URL: http://arxiv.org/abs/2512.12988v1
- Date: Mon, 15 Dec 2025 05:27:01 GMT
- Title: A Bayesian approach to learning mixtures of nonparametric components
- Authors: Yilei Zhang, Yun Wei, Aritra Guha, XuanLong Nguyen
- Abstract summary: In many applications, making parametric assumptions on the latent subpopulation distributions may be unrealistic. We study finite mixtures with nonparametric mixture components, using a Bayesian nonparametric modeling approach.
- Score: 12.42953971157138
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Mixture models are widely used in modeling heterogeneous data populations. A standard approach to mixture modeling is to assume that each mixture component takes a parametric kernel form, while model flexibility is obtained by using a large or possibly unbounded number of such parametric kernels. In many applications, making parametric assumptions on the latent subpopulation distributions may be unrealistic, which motivates nonparametric modeling of the mixture components themselves. In this paper we study finite mixtures with nonparametric mixture components, using a Bayesian nonparametric modeling approach. In particular, it is assumed that the data population is generated according to a finite mixture of latent component distributions, where each component is endowed with a Bayesian nonparametric prior such as the Dirichlet process mixture. We present conditions under which the individual mixture components' distributions can be identified, and establish posterior contraction behavior for the data population's density as well as the densities of the latent mixture components. We develop an efficient MCMC algorithm for posterior inference and demonstrate via simulation studies and real-world data illustrations that it is possible to efficiently learn complex distributions for the latent subpopulations. In theory, the posterior contraction rate of the component densities is nearly polynomial, a significant improvement over the logarithmic convergence rate of estimating mixing measures via deconvolution.
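To make the assumed generative process concrete, here is a minimal simulation sketch: a finite mixture whose latent components are themselves Dirichlet process mixtures of Gaussians, drawn by truncated stick-breaking. The truncation level, base measure, kernel bandwidth, and mixing proportions are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, truncation):
    """Truncated stick-breaking weights for a Dirichlet process."""
    betas = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

def sample_dp_mixture_component(alpha=1.0, truncation=50):
    """Draw one latent component: a DP mixture of 1-d Gaussians.

    Returns atom weights, locations, and a common scale, which together
    define a random density f(x) = sum_h w_h N(x; mu_h, s^2).
    """
    weights = stick_breaking(alpha, truncation)
    locs = rng.normal(0.0, 3.0, size=truncation)  # draws from the base measure
    scale = 0.5                                    # illustrative kernel bandwidth
    return weights, locs, scale

def sample_data(n=500, mix_props=(0.4, 0.6)):
    """Generate data from a finite mixture of K nonparametric components."""
    components = [sample_dp_mixture_component() for _ in mix_props]
    z = rng.choice(len(mix_props), size=n, p=mix_props)  # latent subpopulation label
    x = np.empty(n)
    for i, k in enumerate(z):
        w, locs, s = components[k]
        atom = rng.choice(len(w), p=w / w.sum())  # renormalize truncated weights
        x[i] = rng.normal(locs[atom], s)
    return x, z

x, z = sample_data()
print(x[:5], z[:5])
```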
Related papers
- Mixture models for data with unknown distributions [0.6345523830122168]
We describe and analyze a broad class of mixture models for real-valued multivariate data. We return both a division of the data and an estimate of the distributions, effectively performing clustering and density estimation within each cluster at the same time. We demonstrate our methods with a selection of illustrative applications and give code implementing both algorithms.
arXiv Detail & Related papers (2025-02-26T22:42:40Z)
- Amortized Bayesian Mixture Models [1.3976439685325095]
This paper introduces a novel extension of Amortized Bayesian Inference (ABI) tailored to mixture models. We factorize the posterior into a distribution of the parameters and a distribution of (categorical) mixture indicators, which allows us to use a combination of generative neural networks. The proposed framework accommodates both independent and dependent mixture models, enabling filtering and smoothing.
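The factorization described above can be written schematically as follows; the notation (data y, global parameters θ, indicators z) is ours, not necessarily the paper's.

```latex
% Schematic posterior factorization (our notation):
p(\theta, z \mid y) \;=\; p(\theta \mid y)\, p(z \mid \theta, y)
% For an independent mixture model, the indicator posterior
% factorizes further across observations:
p(z \mid \theta, y) \;=\; \prod_{i=1}^{n} p(z_i \mid \theta, y_i)
```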
arXiv Detail & Related papers (2025-01-17T14:51:03Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over meta-analysis-based methods as heterogeneity increases.
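For context, the classical single-source inverse propensity score estimator of the average treatment effect, the building block that the collaborative estimator pools across heterogeneous data sources, is:

```latex
% IPW estimator with estimated propensity score e(x) = P(T = 1 | X = x)
\hat{\tau}_{\mathrm{IPW}}
  \;=\; \frac{1}{n} \sum_{i=1}^{n}
  \left[ \frac{T_i\, Y_i}{\hat{e}(X_i)}
       \;-\; \frac{(1 - T_i)\, Y_i}{1 - \hat{e}(X_i)} \right]
```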
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions that can handle potentially missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
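As a rough illustration of how an EM algorithm can accommodate missing entries, here is a sketch of the E-step in the Gaussian special case: each point's component likelihood is computed from the marginal density of its observed coordinates only. The paper's algorithm covers general elliptical distributions and a robust M-step, which this sketch omits.

```python
import numpy as np

def e_step_with_missing(X, weights, means, covs):
    """Responsibilities for a Gaussian mixture when entries of X are NaN.

    Each point's likelihood under component k uses only its observed
    coordinates, i.e. the marginal Gaussian N(x_obs; mu_obs, S_obs,obs).
    Assumes every row has at least one observed coordinate.
    """
    n, K = X.shape[0], len(weights)
    logr = np.zeros((n, K))
    for i in range(n):
        obs = ~np.isnan(X[i])
        xo = X[i, obs]
        for k in range(K):
            mu = means[k][obs]
            S = covs[k][np.ix_(obs, obs)]            # observed sub-covariance
            d = xo - mu
            _, logdet = np.linalg.slogdet(S)
            quad = d @ np.linalg.solve(S, d)
            logr[i, k] = (np.log(weights[k])
                          - 0.5 * (len(xo) * np.log(2 * np.pi) + logdet + quad))
    logr -= logr.max(axis=1, keepdims=True)          # numerical stabilization
    r = np.exp(logr)
    return r / r.sum(axis=1, keepdims=True)
```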
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Fitting large mixture models using stochastic component selection [0.0]
We propose a combination of the expectation-maximization and Metropolis-Hastings algorithms to evaluate only a small number of components.
The Markov chain of component assignments is sequentially generated across the algorithm's iterations.
We put emphasis on generality of our method, equipping it with the ability to train both shallow and deep mixture models.
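A minimal sketch of the idea of evaluating only a few components per iteration, using a Metropolis-Hastings sweep over hard assignments in a one-dimensional Gaussian mixture; the paper's exact proposal and acceptance scheme may differ.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def mh_assignment_sweep(x, z, weights, means, sds):
    """One Metropolis-Hastings sweep over component assignments.

    For each data point, propose a uniformly random component and accept
    with the ratio of weighted likelihoods, so only two components are
    evaluated per point instead of all K.
    """
    K = len(weights)
    for i in range(len(x)):
        prop, cur = rng.integers(K), z[i]
        ratio = (weights[prop] * norm.pdf(x[i], means[prop], sds[prop])) / \
                (weights[cur] * norm.pdf(x[i], means[cur], sds[cur]) + 1e-300)
        if rng.random() < ratio:   # symmetric proposal, standard MH acceptance
            z[i] = prop
    return z
```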
arXiv Detail & Related papers (2021-10-10T12:39:53Z)
- A similarity-based Bayesian mixture-of-experts model [0.5156484100374058]
We present a new non-parametric mixture-of-experts model for multivariate regression problems.
Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point.
Posterior inference is performed on the parameters of the mixture as well as the distance metric.
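To convey the flavor of similarity-based prediction, here is a minimal kernel-weighted sketch; the actual model places a posterior over the mixture parameters and the distance metric rather than fixing a length scale.

```python
import numpy as np

def similarity_weighted_prediction(x_new, X_train, y_train, length_scale=1.0):
    """Predict at x_new as a similarity-weighted average over observed points.

    A minimal sketch of 'predictions based on similarities to each observed
    data point', using a Gaussian kernel with a fixed length scale.
    """
    d2 = np.sum((X_train - x_new) ** 2, axis=1)  # squared distances
    w = np.exp(-0.5 * d2 / length_scale ** 2)    # Gaussian similarities
    w /= w.sum()
    return w @ y_train
```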
arXiv Detail & Related papers (2020-12-03T18:08:30Z)
- Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations [84.81435917024983]
This work proposes an algorithm that consistently estimates any identifiable mixture model from grouped observations.
A practical implementation is provided for paired observations, and the approach is shown to outperform existing methods.
arXiv Detail & Related papers (2020-06-12T20:44:22Z)
- Uniform Convergence Rates for Maximum Likelihood Estimation under Two-Component Gaussian Mixture Models [13.769786711365104]
We derive uniform convergence rates for the maximum likelihood estimator and minimax lower bounds for parameter estimation.
We assume the mixing proportions of the mixture are known and fixed, but make no separation assumption on the underlying mixture components.
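In our notation, the setting is:

```latex
% Two-component Gaussian mixture with known, fixed mixing proportion
% \pi \in (0,1); the MLE maximizes the log-likelihood over the
% component parameters with no separation assumption.
p_{\theta}(x) \;=\; \pi\, \phi(x; \mu_1, \Sigma_1)
              \;+\; (1 - \pi)\, \phi(x; \mu_2, \Sigma_2),
\qquad
\hat{\theta}_n \;=\; \arg\max_{\theta} \sum_{i=1}^{n} \log p_{\theta}(x_i)
```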
arXiv Detail & Related papers (2020-06-01T04:13:48Z)
- Repulsive Mixture Models of Exponential Family PCA for Clustering [127.90219303669006]
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than the traditional EPCA.
The traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering.
In this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework.
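One common template for a repulsiveness-encouraging prior on component parameters takes the following form; whether DEPCAM uses exactly this construction is detailed in the paper.

```latex
% Repulsive prior template: independent base prior p_0 times a pairwise
% penalty g, non-decreasing with g(0) = 0, so overlapping components
% receive vanishing prior mass.
p(\mu_1, \dots, \mu_K) \;\propto\;
  \Big[ \prod_{k=1}^{K} p_0(\mu_k) \Big]
  \prod_{1 \le i < j \le K} g\big( \lVert \mu_i - \mu_j \rVert \big)
```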
arXiv Detail & Related papers (2020-04-07T04:07:29Z)
- Algebraic and Analytic Approaches for Parameter Learning in Mixture Models [66.96778152993858]
We present two different approaches for parameter learning in several mixture models in one dimension.
For some of these distributions, our results represent the first guarantees for parameter estimation.
arXiv Detail & Related papers (2020-01-19T05:10:56Z)
- Distributed, partially collapsed MCMC for Bayesian Nonparametrics [68.5279360794418]
We exploit the fact that completely random measures, in terms of which commonly used models such as the Dirichlet process and the beta-Bernoulli process can be expressed, are decomposable into independent sub-measures.
We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components.
The resulting hybrid algorithm can be applied to allow scalable inference without sacrificing convergence guarantees.
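Schematically, in our notation, the decomposition rests on the restriction property of completely random measures:

```latex
% A completely random measure \mu restricted to disjoint sets yields
% independent sub-measures: A indexes the instantiated atoms, A^c the
% infinite remainder, which can be updated by separate processes.
\mu \;=\; \mu\big|_{A} + \mu\big|_{A^{c}},
\qquad
\mu\big|_{A} \;\perp\!\!\!\perp\; \mu\big|_{A^{c}}
```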
arXiv Detail & Related papers (2020-01-15T23:10:13Z)