HeMPPCAT: Mixtures of Probabilistic Principal Component Analysers for Data with Heteroscedastic Noise
- URL: http://arxiv.org/abs/2301.08852v1
- Date: Sat, 21 Jan 2023 02:00:55 GMT
- Title: HeMPPCAT: Mixtures of Probabilistic Principal Component Analysers for Data with Heteroscedastic Noise
- Authors: Alec S. Xu, Laura Balzano, Jeffrey A. Fessler
- Abstract summary: MPPCA assumes the data samples in each mixture contain homoscedastic noise.
The performance of MPPCA is suboptimal for data with heteroscedastic noise across samples.
This paper proposes a heteroscedastic mixtures of probabilistic PCA technique (HeMPPCAT) that uses a generalized expectation-maximization (GEM) algorithm.
- Score: 28.24679019484073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixtures of probabilistic principal component analysis (MPPCA) is a
well-known mixture model extension of principal component analysis (PCA).
Similar to PCA, MPPCA assumes the data samples in each mixture contain
homoscedastic noise. However, datasets with heterogeneous noise across samples
are becoming increasingly common, as larger datasets are generated by
collecting samples from several sources with varying noise profiles. The
performance of MPPCA is suboptimal for data with heteroscedastic noise across
samples. This paper proposes a heteroscedastic mixtures of probabilistic PCA
technique (HeMPPCAT) that uses a generalized expectation-maximization (GEM)
algorithm to jointly estimate the unknown underlying factors, means, and noise
variances under a heteroscedastic noise setting. Simulation results illustrate
the improved factor estimates and clustering accuracies of HeMPPCAT compared to
MPPCA.
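Since the abstract's central object is a mixture of probabilistic PCA models whose noise variance differs across samples, a small numerical sketch may make the setting concrete. The following is a minimal numpy illustration of the generative model and the E-step responsibilities under known per-group noise variances; the dimensions, group structure, and variable names (D, d, K, v, grp) are illustrative assumptions, and the GEM M-step updates that constitute the paper's actual contribution are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy heteroscedastic MPPCA setup (illustrative, not from the paper):
# K mixture components, 2 noise groups with different variances.
D, d, K = 10, 2, 2            # ambient dim, latent dim, mixture components
v = np.array([0.05, 1.0])     # per-group noise variances (heteroscedastic)
n_per = 200                   # samples per (component, group) pair

F = [rng.normal(size=(D, d)) for _ in range(K)]    # factor matrices
mu = [5.0 * rng.normal(size=D) for _ in range(K)]  # component means

X, comp, grp = [], [], []
for k in range(K):
    for g, vg in enumerate(v):
        Z = rng.normal(size=(n_per, d))            # latent coordinates
        E = np.sqrt(vg) * rng.normal(size=(n_per, D))
        X.append(Z @ F[k].T + mu[k] + E)           # x = F z + mu + noise
        comp += [k] * n_per
        grp += [g] * n_per
X, grp = np.vstack(X), np.array(grp)

def responsibilities(X, F, mu, pi, v, grp):
    """E-step: posterior probability that sample i belongs to component k,
    with each Gaussian using the noise variance v[grp[i]] of sample i's group."""
    n, D = X.shape
    logp = np.zeros((n, len(F)))
    for k in range(len(F)):
        for g, vg in enumerate(v):
            idx = grp == g
            C = F[k] @ F[k].T + vg * np.eye(D)     # component covariance
            diff = X[idx] - mu[k]
            _, logdet = np.linalg.slogdet(C)
            quad = np.einsum('ij,ij->i', diff @ np.linalg.inv(C), diff)
            logp[idx, k] = np.log(pi[k]) - 0.5 * (logdet + quad)
    logp -= logp.max(axis=1, keepdims=True)        # stabilize the softmax
    R = np.exp(logp)
    return R / R.sum(axis=1, keepdims=True)

R = responsibilities(X, F, mu, np.full(K, 1.0 / K), v, grp)
print("mean responsibility of the true component:",
      R[np.arange(len(comp)), comp].mean())
```

A full GEM implementation would alternate this E-step with partial M-step updates of the factors, means, and group noise variances, accepting any update that increases the expected log-likelihood.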
Related papers
- On the Estimation Performance of Generalized Power Method for Heteroscedastic Probabilistic PCA [21.9585534723895]
We show that, given a suitable initialization, the estimation error of the iterates generated by the generalized power method (GPM) decreases at least geometrically to a threshold associated with the residual part of a certain decomposition.
We also demonstrate the superior performance of the heteroscedastic probabilistic PCA technique under sub-Gaussian noise settings.
arXiv Detail & Related papers (2023-12-06T11:41:17Z)
- ALPCAH: Sample-wise Heteroscedastic PCA with Tail Singular Value Regularization [17.771454131646312]
Principal component analysis is a key tool in the field of data dimensionality reduction.
This paper develops a PCA method that can estimate the sample-wise noise variances.
It is done without distributional assumptions of the low-rank component and without assuming the noise variances are known.
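To make "estimating sample-wise noise variances" concrete, here is a naive residual-based baseline, not ALPCAH's algorithm (which adds tail singular value regularization): fit a low-rank subspace, then read each sample's noise variance off its residual energy outside that subspace. All names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: rank-3 signal plus per-sample noise of varying scale.
n, D, r = 300, 50, 3
low_rank = rng.normal(size=(n, r)) @ rng.normal(size=(r, D))
sigma = rng.uniform(0.1, 2.0, size=n)          # true per-sample noise std
X = low_rank + sigma[:, None] * rng.normal(size=(n, D))

# Fit a rank-r subspace by truncated SVD, then estimate each sample's
# noise variance from its residual energy off the subspace.  The naive
# estimate is biased by subspace estimation error, which is part of what
# a method like ALPCAH has to contend with.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:r].T                                   # estimated basis (D x r)
resid = X - (X @ V) @ V.T                      # out-of-subspace residuals
var_hat = (resid ** 2).sum(axis=1) / (D - r)   # per-sample variance estimate

print(np.corrcoef(var_hat, sigma ** 2)[0, 1])  # should correlate strongly
```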
arXiv Detail & Related papers (2023-07-06T03:11:11Z)
- Probabilistic Conformal Prediction Using Conditional Random Samples [73.26753677005331]
PCP is a predictive inference algorithm that estimates a target variable by a discontinuous predictive set.
It is efficient and compatible with either explicit or implicit conditional generative models.
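The "discontinuous predictive set" can be pictured directly: PCP draws samples from a conditional generative model and returns a union of small balls around them, with the radius calibrated conformally. Below is a minimal 1-D sketch; sample_model is a hypothetical stand-in for the generative model.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_model(x, k):
    """Hypothetical stand-in for a conditional generative model: y|x is bimodal."""
    modes = np.where(rng.random(k) < 0.5, np.sin(x), np.sin(x) + 2.0)
    return modes + 0.1 * rng.normal(size=k)

alpha, K = 0.1, 50                      # miscoverage level, samples per input
x_cal = rng.uniform(0, 6, size=500)     # calibration inputs
y_cal = np.array([sample_model(x, 1)[0] for x in x_cal])  # calibration labels

# Conformal score: distance from the true y to its nearest generated sample.
scores = np.array([np.abs(y - sample_model(x, K)).min()
                   for x, y in zip(x_cal, y_cal)])
q = np.quantile(scores, np.ceil((1 - alpha) * (len(scores) + 1)) / len(scores))

# Predictive set for a new input: a union of K intervals, possibly disconnected.
centers = sample_model(1.5, K)
intervals = [(c - q, c + q) for c in centers]
print(f"radius {q:.3f}; the set is a union of {K} intervals around the samples")
```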
arXiv Detail & Related papers (2022-06-14T03:58:03Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Shared Independent Component Analysis for Multi-Subject Neuroimaging [107.29179765643042]
We introduce Shared Independent Component Analysis (ShICA) that models each view as a linear transform of shared independent components contaminated by additive Gaussian noise.
We show that this model is identifiable if the components are either non-Gaussian or have enough diversity in noise variances.
We provide empirical evidence on fMRI and MEG datasets that ShICA yields more accurate estimation of the components than alternatives.
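The ShICA model as stated in the summary is compact enough to simulate directly; the dimensions and noise scales below are arbitrary choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# ShICA generative model: each view is a linear transform of shared
# independent components plus additive Gaussian noise.
p, n, n_views = 4, 1000, 3
S = rng.laplace(size=(p, n))          # shared sources: non-Gaussian (Laplace)
views = []
for v in range(n_views):
    A_v = rng.normal(size=(p, p))     # view-specific mixing matrix
    sigma_v = 0.1 * (v + 1)           # noise variances differ across views
    views.append(A_v @ S + sigma_v * rng.normal(size=(p, n)))
```

Per the identifiability result quoted above, either ingredient alone would suffice here: the sources are non-Gaussian, and the noise variances are diverse across views.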
arXiv Detail & Related papers (2021-10-26T08:54:41Z)
- Noise-Resistant Deep Metric Learning with Probabilistic Instance Filtering [59.286567680389766]
Noisy labels are commonly found in real-world data, which cause performance degradation of deep neural networks.
We propose Probabilistic Ranking-based Instance Selection with Memory (PRISM) approach for DML.
PRISM calculates the probability of a label being clean, and filters out potentially noisy samples.
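To make the filtering idea concrete, here is a heavily simplified sketch in the spirit of PRISM, not the paper's memory-bank implementation: score each embedding by its softmax similarity to class centers, read the score for its own label as a clean-label probability, and drop low-probability samples. The function name, temperature, and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def clean_probability(feats, labels, temp=0.1):
    """Simplified PRISM-style score (assumed form): softmax cosine similarity
    of each embedding to its own class center, read as P(label is clean)."""
    classes = np.unique(labels)
    centers = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = feats @ centers.T                       # cosine similarities
    probs = np.exp(sims / temp)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[np.arange(len(labels)), np.searchsorted(classes, labels)]

# Toy check: mislabel 10% of points, then filter by the clean probability.
feats = np.vstack([rng.normal(m, 0.3, size=(100, 8)) for m in (0.0, 2.0)])
labels = np.repeat([0, 1], 100)
flip = rng.random(200) < 0.1
labels[flip] ^= 1                                  # inject label noise
p_clean = clean_probability(feats, labels)
keep = p_clean > 0.5                               # filter noisy samples
print("flipped samples kept:", (keep & flip).sum(), "of", flip.sum())
```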
arXiv Detail & Related papers (2021-08-03T12:15:25Z)
- PCA Initialization for Approximate Message Passing in Rotationally Invariant Models [29.039655256171088]
Principal Component Analysis provides a natural estimator, and sharp results on its performance have been obtained in the high-dimensional regime.
Recently, an Approximate Message Passing (AMP) algorithm has been proposed as an alternative estimator with the potential to improve the accuracy of PCA.
In this work, we combine the two methods, initialize AMP with PCA, and propose a rigorous characterization of the performance of this estimator.
arXiv Detail & Related papers (2021-06-04T09:13:51Z)
- Adaptive Multi-View ICA: Estimation of noise levels for optimal inference [65.94843987207445]
Adaptive Multi-View ICA (AVICA) is a noisy ICA model where each view is a linear mixture of shared independent sources with additive noise on the sources.
On synthetic data, AVICA yields better source estimates than other group ICA methods thanks to its explicit MMSE estimator.
On real magnetoencephalography (MEG) data, we provide evidence that the decomposition is less sensitive to sampling noise and that the noise variance estimates are biologically plausible.
arXiv Detail & Related papers (2021-02-22T13:10:12Z)
- Empirical Bayes PCA in high dimensions [11.806200054814772]
Principal Components Analysis is known to exhibit problematic phenomena in the presence of high-dimensional noise.
We propose an Empirical Bayes PCA method that reduces this noise by estimating a structural prior for the joint distributions of the principal components.
arXiv Detail & Related papers (2020-12-21T20:43:44Z)
- Probabilistic Contrastive Principal Component Analysis [0.5286651840245514]
We propose a model-based alternative to contrastive principal component analysis (CPCA).
We show PCPCA's advantages over CPCA, including greater interpretability, uncertainty quantification and principled inference.
We demonstrate PCPCA's performance through a series of simulations and case-control experiments with datasets of gene expression, protein expression, and images.
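For context on what PCPCA generalizes: classical contrastive PCA (not PCPCA itself) finds directions that are high-variance in a foreground dataset but low-variance in a background dataset, via the eigenvectors of a contrasted covariance. A minimal sketch follows, with an arbitrary contrast parameter gamma and toy data chosen so the answer is easy to check.

```python
import numpy as np

rng = np.random.default_rng(5)

def cpca_directions(X_fg, X_bg, gamma, d=2):
    """Classical contrastive PCA: top eigenvectors of the foreground
    covariance minus gamma times the background covariance.  PCPCA
    replaces this with a probabilistic model; this is only the baseline."""
    C_fg = np.cov(X_fg, rowvar=False)
    C_bg = np.cov(X_bg, rowvar=False)
    w, V = np.linalg.eigh(C_fg - gamma * C_bg)
    return V[:, np.argsort(w)[::-1][:d]]        # eigenvectors, largest first

# Toy data: the variation of interest lives only in the foreground's
# first two coordinates; the last two vary equally in both datasets.
X_bg = rng.normal(scale=[1, 1, 3, 3], size=(500, 4))
X_fg = rng.normal(scale=[2, 2, 3, 3], size=(500, 4))
V = cpca_directions(X_fg, X_bg, gamma=1.0)
print(np.round(V, 2))   # should concentrate on the first two coordinates
```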
arXiv Detail & Related papers (2020-12-14T22:21:50Z)
- Repulsive Mixture Models of Exponential Family PCA for Clustering [127.90219303669006]
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than the traditional EPCA.
The traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering.
In this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework.
arXiv Detail & Related papers (2020-04-07T04:07:29Z)