Exponential Convergence of CAVI for Bayesian PCA
- URL: http://arxiv.org/abs/2505.16145v1
- Date: Thu, 22 May 2025 02:44:00 GMT
- Title: Exponential Convergence of CAVI for Bayesian PCA
- Authors: Arghya Datta, Philippe Gagnon, Florian Maire
- Abstract summary: Probabilistic principal component analysis (PCA) and its Bayesian variant (BPCA) are widely used for dimension reduction in machine learning and statistics. In our paper, we prove a precise exponential convergence result in the case where the model uses a single principal component (PC). We also leverage recent tools to prove exponential convergence of CAVI for the model with any number of PCs.
- Score: 0.7929564340244416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probabilistic principal component analysis (PCA) and its Bayesian variant (BPCA) are widely used for dimension reduction in machine learning and statistics. The main advantage of probabilistic PCA over the traditional formulation is allowing uncertainty quantification. The parameters of BPCA are typically learned using mean-field variational inference, and in particular, the coordinate ascent variational inference (CAVI) algorithm. So far, the convergence speed of CAVI for BPCA has not been characterized. In our paper, we fill this gap in the literature. Firstly, we prove a precise exponential convergence result in the case where the model uses a single principal component (PC). Interestingly, this result is established through a connection with the classical $\textit{power iteration algorithm}$ and it indicates that traditional PCA is retrieved as point estimates of the BPCA parameters. Secondly, we leverage recent tools to prove exponential convergence of CAVI for the model with any number of PCs, thus leading to a more general result, but one that is of a slightly different flavor. To prove the latter result, we additionally needed to introduce a novel lower bound for the symmetric Kullback--Leibler divergence between two multivariate normal distributions, which, we believe, is of independent interest in information theory.
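To make the power-iteration connection concrete, below is a minimal sketch of the classical recursion the abstract refers to. Applied to the sample covariance matrix, it converges exponentially fast (at a rate governed by the eigengap) to the first principal component, which the paper identifies with the point estimate of the BPCA loading under CAVI. The code is illustrative only; the paper's exact CAVI updates and constants are not reproduced here.

```python
import numpy as np

def power_iteration(S, num_iters=1000, tol=1e-12, seed=0):
    """Classical power iteration: repeatedly apply S and renormalize.

    For a symmetric PSD matrix S, the iterates converge to the leading
    eigenvector at a geometric rate governed by lambda_2 / lambda_1.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v_next = S @ v
        v_next /= np.linalg.norm(v_next)
        if np.linalg.norm(v_next - v) < tol:
            return v_next
        v = v_next
    return v

# The leading eigenvector of the sample covariance is the first PC,
# i.e., the quantity recovered as a point estimate of the BPCA loading.
X = np.random.default_rng(1).standard_normal((500, 20))
Xc = X - X.mean(axis=0)
w_hat = power_iteration(Xc.T @ Xc / len(Xc))
```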
Related papers
- Bayesian Hierarchical Invariant Prediction [4.182645056052712]
We propose Bayesian Hierarchical Invariant Prediction (BHIP), reframing Invariant Causal Prediction (ICP) through the lens of hierarchical Bayes. We leverage the hierarchical structure to explicitly test the invariance of causal mechanisms under heterogeneous data, resulting in improved computational scalability for larger numbers of predictors compared to ICP. In this paper, we test two sparsity-inducing priors, horseshoe and spike-and-slab, both of which allow more reliable identification of causal features.
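For reference, the two sparsity-inducing priors named above are commonly written as follows; these are the standard textbook forms, and the exact parameterization used by BHIP may differ.

```latex
% Spike-and-slab: a point mass at zero mixed with a Gaussian slab
\beta_j \mid \gamma_j \sim \gamma_j \,\mathcal{N}(0, \tau^2) + (1 - \gamma_j)\,\delta_0,
\qquad \gamma_j \sim \mathrm{Bernoulli}(\pi)

% Horseshoe: a global-local scale mixture with half-Cauchy local scales
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}(0, \lambda_j^2 \tau^2),
\qquad \lambda_j \sim \mathrm{C}^{+}(0, 1)
```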
arXiv Detail & Related papers (2025-05-16T13:06:25Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier over restricted ranges of the ROC curve.
Most existing methods can only optimize PAUC approximately, leading to inevitable biases that cannot be controlled.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization.
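For orientation, the standard pairwise formulation of OPAUC restricts attention to the highest-scoring negatives, i.e., those inside the false-positive-rate budget. Below is a minimal numpy sketch under that formulation (function and parameter names are our own; ties are ignored and normalization conventions vary):

```python
import numpy as np

def one_way_partial_auc(y_true, scores, fpr_max=0.1):
    """OPAUC: fraction of correctly ranked (positive, hard-negative) pairs,
    where 'hard' negatives are those within the FPR budget [0, fpr_max]."""
    pos = scores[y_true == 1]
    neg = np.sort(scores[y_true == 0])[::-1]           # negatives, descending
    n_hard = max(int(np.floor(fpr_max * len(neg))), 1) # negatives inside budget
    hard_neg = neg[:n_hard]                            # highest-scoring negatives
    return (pos[:, None] > hard_neg[None, :]).mean()
```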
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - A New Central Limit Theorem for the Augmented IPW Estimator: Variance Inflation, Cross-Fit Covariance and Beyond [0.9172870611255595]
Augmented inverse probability weighting (AIPW) with cross-fitting is a popular choice in practice.
We study this cross-fit AIPW estimator under well-specified outcome regression and propensity score models in a high-dimensional regime.
Our work utilizes a novel interplay between three distinct tools: approximate message passing theory, the theory of deterministic equivalents, and the leave-one-out approach.
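For context, the cross-fit AIPW estimator of an average treatment effect is standardly written as below, where the outcome regressions $\hat{m}_1, \hat{m}_0$ and the propensity score $\hat{e}$ are fit on folds that exclude observation $i$. This is the textbook form, not anything specific to the paper's high-dimensional analysis.

```latex
\hat{\tau}_{\mathrm{AIPW}}
= \frac{1}{n} \sum_{i=1}^{n} \left[
    \hat{m}_1(X_i) - \hat{m}_0(X_i)
    + \frac{A_i \big(Y_i - \hat{m}_1(X_i)\big)}{\hat{e}(X_i)}
    - \frac{(1 - A_i) \big(Y_i - \hat{m}_0(X_i)\big)}{1 - \hat{e}(X_i)}
  \right]
```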
arXiv Detail & Related papers (2022-05-20T14:17:53Z) - Capturing the Denoising Effect of PCA via Compression Ratio [3.967854215226183]
Principal component analysis (PCA) is one of the most fundamental tools in machine learning.
In this paper, we propose a novel metric called the compression ratio to capture the denoising effect of PCA on high-dimensional noisy data.
Building on this new metric, we design a straightforward algorithm that can be used to detect outliers.
arXiv Detail & Related papers (2022-04-22T18:43:47Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
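Stein variational gradient descent (SVGD) is a canonical PVI instance and makes the repulsion force concrete: each particle moves along a kernel-weighted average of the target's score, plus a kernel-gradient term that pushes particles apart. A minimal sketch, assuming the score of the target is available (illustrative code, not the paper's algorithm):

```python
import numpy as np

def svgd_step(x, score, step=0.1, h=1.0):
    """One SVGD update on particles x (n, d): kernel-weighted attraction
    along the score plus a repulsive kernel-gradient term."""
    diff = x[:, None, :] - x[None, :, :]        # pairwise differences
    k = np.exp(-(diff ** 2).sum(-1) / h)        # RBF kernel matrix
    grad_k = -2.0 / h * diff * k[..., None]     # d k(x_j, x_i) / d x_j
    phi = (k @ score(x) + grad_k.sum(axis=0)) / len(x)
    return x + step * phi

# Example: 50 particles approximating a standard normal (score = -x).
x = np.random.default_rng(0).standard_normal((50, 2))
for _ in range(200):
    x = svgd_step(x, score=lambda z: -z)
```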
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for the privacy protection it offers when machine learning models are trained collaboratively among distributed clients.
Recent studies have pointed out that naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
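The usual DP countermeasure clips each client's update and adds calibrated Gaussian noise before aggregation, so no individual gradient can be reconstructed exactly. Below is a minimal sketch of this standard clip-and-noise recipe with hypothetical parameter names; it is not the iteration-tuning scheme studied in the paper.

```python
import numpy as np

def dp_aggregate(client_grads, clip_norm=1.0, noise_mult=1.0, seed=None):
    """Clip each client gradient to a bounded norm, average, add Gaussian noise."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in client_grads]
    agg = np.mean(clipped, axis=0)
    # Gaussian mechanism: noise scaled to the sensitivity clip_norm / n
    return agg + rng.normal(0.0, noise_mult * clip_norm / len(client_grads),
                            size=agg.shape)
```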
arXiv Detail & Related papers (2021-01-11T19:43:12Z) - Probabilistic Contrastive Principal Component Analysis [0.5286651840245514]
We propose probabilistic contrastive principal component analysis (PCPCA), a model-based alternative to contrastive principal component analysis (CPCA).
We show PCPCA's advantages over CPCA, including greater interpretability, uncertainty quantification, and principled inference.
We demonstrate PCPCA's performance through a series of simulations and case-control experiments with datasets of gene expression, protein expression, and images.
arXiv Detail & Related papers (2020-12-14T22:21:50Z) - Improved Dimensionality Reduction of various Datasets using Novel Multiplicative Factoring Principal Component Analysis (MPCA) [0.0]
We present an improvement to the traditional PCA approach called Multiplicative Factoring Principal Component Analysis (MPCA).
The advantage of MPCA over traditional PCA is that a penalty is imposed on the occurrence space through a multiplier, rendering the effect of outliers negligible when seeking out projections.
arXiv Detail & Related papers (2020-09-25T12:30:15Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtaining sparse principal direction loadings have been proposed; these are collectively termed sparse principal component analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
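A generic thresholding scheme for SPCA keeps only the largest-magnitude coordinates of the leading eigenvector and renormalizes; a minimal sketch follows (our illustrative reading of "thresholding", not necessarily the paper's exact algorithm or guarantees):

```python
import numpy as np

def threshold_spca(S, k):
    """Sparse PC via thresholding: zero out all but the k largest-magnitude
    entries of the leading eigenvector of S, then renormalize."""
    _, vecs = np.linalg.eigh(S)           # eigh returns eigenvalues ascending
    v = vecs[:, -1]                       # leading eigenvector
    keep = np.argsort(np.abs(v))[-k:]     # indices of the k largest |entries|
    w = np.zeros_like(v)
    w[keep] = v[keep]
    return w / np.linalg.norm(w)
```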
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - Repulsive Mixture Models of Exponential Family PCA for Clustering [127.90219303669006]
The mixture extension of exponential family principal component analysis (EPCA) was designed to encode much more structural information about the data distribution than the traditional EPCA.
The traditional mixture of local EPCAs has the problem of model redundancy, i.e., overlaps among mixing components, which may cause ambiguity for data clustering.
In this paper, a repulsiveness-encouraging prior is introduced among mixing components and a diversified EPCA mixture (DEPCAM) model is developed in the Bayesian framework.
arXiv Detail & Related papers (2020-04-07T04:07:29Z)