Hidden Convexity of Fair PCA and Fast Solver via Eigenvalue Optimization
- URL: http://arxiv.org/abs/2503.00299v1
- Date: Sat, 01 Mar 2025 02:13:20 GMT
- Title: Hidden Convexity of Fair PCA and Fast Solver via Eigenvalue Optimization
- Authors: Junhui Shen, Aaron J. Davis, Ding Lu, Zhaojun Bai,
- Abstract summary: Principal Component Analysis (PCA) is a technique in machine learning for dimensionality reduction of high-dimensional datasets.<n>The Fair (FPCA) model was introduced by Samadi et al. to equalize the reconstruction loss between subgroups.<n>The semidefinite relaxation (SDR) based approach proposed by Samadi et al. is computationally expensive even for suboptimal solutions.<n>In this paper, we identify a hidden convexity in the FPCA model and introduce an algorithm for convex optimization via eigenvalue optimization.
- Score: 1.4999444543328293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Principal Component Analysis (PCA) is a foundational technique in machine learning for dimensionality reduction of high-dimensional datasets. However, PCA could lead to biased outcomes that disadvantage certain subgroups of the underlying datasets. To address the bias issue, a Fair PCA (FPCA) model was introduced by Samadi et al. (2018) for equalizing the reconstruction loss between subgroups. The semidefinite relaxation (SDR) based approach proposed by Samadi et al. (2018) is computationally expensive even for suboptimal solutions. To improve efficiency, several alternative variants of the FPCA model have been developed. These variants often shift the focus away from equalizing the reconstruction loss. In this paper, we identify a hidden convexity in the FPCA model and introduce an algorithm for convex optimization via eigenvalue optimization. Our approach achieves the desired fairness in reconstruction loss without sacrificing performance. As demonstrated in real-world datasets, the proposed FPCA algorithm runs $8\times$ faster than the SDR-based algorithm, and only at most 85% slower than the standard PCA.
Related papers
- Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization [78.82586283794886]
$chi2$-Preference Optimization ($chi$PO) is an efficient offline alignment algorithm provably robust to overoptimization.
$chi$PO implements the principle of pessimism in the face of uncertainty via regularization.
$chi$PO's simplicity and strong guarantees make it the first practical and general-purpose offline alignment algorithm provably robust to overoptimization.
arXiv Detail & Related papers (2024-07-18T11:08:40Z) - Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.<n>To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.<n>Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z) - Improved Privacy-Preserving PCA Using Optimized Homomorphic Matrix
Multiplication [0.0]
Principal Component Analysis (PCA) is a pivotal technique widely utilized in the realms of machine learning and data analysis.
In recent years, there have been endeavors to utilize homomorphic encryption in privacy-preserving PCA algorithms for the secure cloud computing scenario.
We propose a novel approach to privacy-preserving PCA that addresses these limitations, resulting in superior efficiency, accuracy, and scalability compared to previous approaches.
arXiv Detail & Related papers (2023-05-27T02:51:20Z) - Fair principal component analysis (PCA): minorization-maximization
algorithms for Fair PCA, Fair Robust PCA and Fair Sparse PCA [6.974999794070285]
We propose a new iterative algorithm to solve the fair PCA (FPCA) problem.
The proposed algorithm relies on the relaxation of a semi-orthogonality constraint which is proved to be tight at every iteration of the algorithm.
We numerically compare the performance of the proposed methods with two of the state-of-the-art approaches on synthetic data sets and a real-life data set.
arXiv Detail & Related papers (2023-05-10T08:14:32Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - A novel approach for Fair Principal Component Analysis based on
eigendecomposition [10.203602318836444]
We propose a novel PCA algorithm which tackles fairness issues by means of a simple strategy comprising a one-dimensional search.
Our findings are consistent in several real situations as well as in scenarios with both unbalanced and balanced datasets.
arXiv Detail & Related papers (2022-08-24T08:20:16Z) - Large-scale Optimization of Partial AUC in a Range of False Positive
Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on recent practical envelope smoothing technique.
Our proposed algorithm can also be used to minimize the sum of some ranked range loss, which also lacks efficient solvers.
arXiv Detail & Related papers (2022-03-03T03:46:18Z) - Stochastic Optimization of Areas Under Precision-Recall Curves with
Provable Convergence [66.83161885378192]
Area under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems.
We propose a technical method to optimize AUPRC for deep learning.
arXiv Detail & Related papers (2021-04-18T06:22:21Z) - Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [131.45028999325797]
We develop a doubly robust off-policy AC (DR-Off-PAC) for discounted MDP.
DR-Off-PAC adopts a single timescale structure, in which both actor and critics are updated simultaneously with constant stepsize.
We study the finite-time convergence rate and characterize the sample complexity for DR-Off-PAC to attain an $epsilon$-accurate optimal policy.
arXiv Detail & Related papers (2021-02-23T18:56:13Z) - Principal Ellipsoid Analysis (PEA): Efficient non-linear dimension
reduction & clustering [9.042239247913642]
This article focuses on improving upon PCA and k-means, by allowing nonlinear relations in the data and more flexible cluster shapes.
The key contribution is a new framework for Principal Analysis (PEA), defining a simple and computationally efficient alternative to PCA.
In a rich variety of real data clustering applications, PEA is shown to do as well as k-means for simple datasets, while dramatically improving performance in more complex settings.
arXiv Detail & Related papers (2020-08-17T06:25:50Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, time, approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - Fair Principal Component Analysis and Filter Design [2.66512000865131]
We search for a low dimensional subspace that spans multiple target vectors in a fair manner.
We analyze the landscape of the underlying optimization in the case of targets.
We prove that the landscape is benign and that all local minima are globally optimal.
arXiv Detail & Related papers (2020-02-16T11:42:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.