Unlabeled Principal Component Analysis and Matrix Completion
- URL: http://arxiv.org/abs/2101.09446v2
- Date: Mon, 9 Oct 2023 07:23:59 GMT
- Title: Unlabeled Principal Component Analysis and Matrix Completion
- Authors: Yunzhen Yao, Liangzu Peng and Manolis C. Tsakiris
- Abstract summary: We introduce robust principal component analysis from a data matrix whose column entries have been corrupted by permutations.
We derive theory and algorithms of similar flavor for UPCA.
Experiments on synthetic data, face images, educational and medical records reveal the potential of our algorithms for applications such as data privatization and record linkage.
- Score: 25.663593359761336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce robust principal component analysis from a data matrix
whose column entries have been corrupted by permutations, termed
Unlabeled Principal Component Analysis (UPCA). Using algebraic geometry, we
establish that UPCA is a well-defined algebraic problem in the sense that the
only matrices of minimal rank that agree with the given data are
row-permutations of the ground-truth matrix, arising as the unique solutions of
a polynomial system of equations. Further, we propose an efficient two-stage
algorithmic pipeline for UPCA suitable for the practically relevant case where
only a fraction of the data have been permuted. Stage-I employs outlier-robust
PCA methods to estimate the ground-truth column-space. Equipped with the
column-space, Stage-II applies recent methods for unlabeled sensing to restore
the permuted data. Allowing for missing entries on top of permutations in UPCA
leads to the problem of unlabeled matrix completion, for which we derive theory
and algorithms of similar flavor. Experiments on synthetic data, face images,
educational and medical records reveal the potential of our algorithms for
applications such as data privatization and record linkage.
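The two-stage pipeline can be sketched on a toy instance. Everything below is a simplifying assumption for illustration: the dimensions are tiny, plain SVD stands in for the outlier-robust PCA of Stage-I, and a brute-force permutation search stands in for the paper's unlabeled-sensing solvers in Stage-II.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Toy UPCA instance: rank-2 data in R^5, 8 columns, one column row-permuted.
m, n, r = 5, 8, 2
U_true = np.linalg.qr(rng.normal(size=(m, r)))[0]
X = U_true @ rng.normal(size=(r, n))

perm = rng.permutation(m)
Y = X.copy()
Y[:, 0] = X[perm, 0]                     # corrupt one column by a permutation

# Stage-I (simplified): estimate the column-space from the unpermuted
# majority of columns; the paper uses outlier-robust PCA methods here.
U_hat = np.linalg.svd(Y[:, 1:], full_matrices=False)[0][:, :r]

# Stage-II (brute force, feasible for tiny m only): unpermute each column by
# choosing the row order that lies closest to the estimated column-space.
def restore(y, U):
    cands = (y[list(p)] for p in itertools.permutations(range(len(y))))
    return min(cands, key=lambda z: np.linalg.norm(z - U @ (U.T @ z)))

X_hat = np.column_stack([restore(Y[:, j], U_hat) for j in range(n)])
```

For generic data, the only row arrangement of a corrupted column lying in the column-space is the ground-truth one, so `X_hat` matches `X`; real instances require the paper's efficient unlabeled-sensing methods instead of enumeration.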
Related papers
- Synergistic eigenanalysis of covariance and Hessian matrices for
enhanced binary classification [75.90957645766676]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our approach is substantiated by formal proofs that establish its capability to maximize between-class mean distance and minimize within-class variances.
arXiv Detail & Related papers (2024-02-14T16:10:42Z) - Weakly supervised covariance matrices alignment through Stiefel matrices
estimation for MEG applications [64.20396555814513]
This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA).
We exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondence with equivalent signal variances between domains.
MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.
arXiv Detail & Related papers (2024-01-24T19:04:49Z) - Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement
Learning [53.445068584013896]
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure.
In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for example characterize the transition kernel of the MDP.
We show that simple spectral-based matrix estimation approaches efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error.
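A minimal sketch of the generic spectral approach, on a hypothetical toy instance (the paper's precise estimators and entry-wise guarantees are more refined): a rank-r matrix observed with noise is estimated by truncated SVD.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy instance: a rank-r matrix (e.g. expected arm rewards)
# observed with entry-wise noise, estimated by its best rank-r SVD fit.
d, r = 50, 3
M = rng.normal(size=(d, r)) @ rng.normal(size=(r, d)) / np.sqrt(r)
E = 0.01 * rng.normal(size=(d, d))       # entry-wise observation noise

U, s, Vt = np.linalg.svd(M + E)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]      # truncated SVD of the noisy matrix
```

By Eckart–Young, the truncated SVD is the closest rank-r matrix to the observation, so its error against the ground truth is at most twice the noise level in Frobenius norm.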
arXiv Detail & Related papers (2023-10-10T17:06:41Z) - A Perceptron-based Fine Approximation Technique for Linear Separation [0.0]
This paper presents a novel online learning method that aims at finding a separator hyperplane between data points labelled as either positive or negative.
Weights and biases of artificial neurons can directly be related to hyperplanes in high-dimensional spaces.
The presented method is proven to converge; empirical results show that it can be more efficient than the Perceptron algorithm.
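For reference, the classic perceptron baseline the paper compares against can be sketched as follows; the data setup (an enforced margin so convergence is guaranteed) is hypothetical, and the paper's fine-approximation refinement is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical separable toy data; keeping only points with |x.w_true| > 0.5
# enforces a margin, so the perceptron is guaranteed to converge.
X = rng.normal(size=(200, 2))
w_true = np.array([1.0, -2.0])
keep = np.abs(X @ w_true) > 0.5
X, y = X[keep], np.sign(X[keep] @ w_true)
Xb = np.hstack([X, np.ones((len(X), 1))])   # absorb the bias as an extra weight

# Classic online perceptron: weights and bias define the separating
# hyperplane; update only on misclassified points.
w = np.zeros(3)
for _ in range(500):
    mistakes = 0
    for xi, yi in zip(Xb, y):
        if yi * (w @ xi) <= 0:
            w += yi * xi                    # perceptron mistake-driven update
            mistakes += 1
    if mistakes == 0:                       # a clean pass: data are separated
        break
```

The weight vector `w` directly encodes a hyperplane in the lifted space, which is the correspondence between neurons and hyperplanes the summary refers to.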
arXiv Detail & Related papers (2023-09-12T08:35:24Z) - Recovering Simultaneously Structured Data via Non-Convex Iteratively
Reweighted Least Squares [0.8702432681310401]
We propose a new algorithm for recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations.
We show that the IRLS method performs favorably in recovering such simultaneously structured data from few measurements.
arXiv Detail & Related papers (2023-06-08T06:35:47Z) - Covariance matrix preparation for quantum principal component analysis [0.8258451067861933]
Principal component analysis (PCA) is a dimensionality reduction method in data analysis.
Quantum algorithms have been formulated for PCA based on diagonalizing a density matrix.
We numerically implement our method for molecular ground-state datasets.
arXiv Detail & Related papers (2022-04-07T15:11:42Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
By "weakly" we mean that the labels are known for some training points, unknown for others, and uncertain for the rest due to random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
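A minimal sketch of the $l_{2,p}$ row-sparsity idea (the toy weight matrix and scoring are assumptions for illustration; the paper's alternating optimization is not reproduced): the regularizer sums $p$-th powers of row-wise $\ell_2$ norms, and rows with large norm mark the selected features.

```python
import numpy as np

rng = np.random.default_rng(3)

def l2p(W, p):
    # l_{2,p} regularizer: sum of p-th powers of row-wise l2 norms;
    # 0 < p <= 1 promotes row sparsity, i.e. few active features.
    return np.sum(np.linalg.norm(W, axis=1) ** p)

# Toy projection matrix over 10 features; hypothetically, the last 5
# features are irrelevant, so their rows are driven toward zero.
W = rng.normal(size=(10, 3))
W[5:] *= 1e-3

scores = np.linalg.norm(W, axis=1)       # row norms rank feature importance
selected = np.argsort(scores)[::-1][:5]  # keep the 5 strongest features
```

With `p = 1` the regularizer reduces to the familiar $l_{2,1}$ norm; smaller `p` penalizes small-but-nonzero rows more aggressively.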
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z) - Robust Matrix Completion with Mixed Data Types [0.0]
We consider the problem of recovering a structured low-rank matrix from partially observed entries with mixed data types.
Most approaches assume that there is only one underlying distribution and regularize the low-rank constraint by the matrix Schatten norm.
We propose a computationally feasible statistical approach with strong recovery guarantees, along with a parallelization-friendly algorithmic framework, that recovers a low-rank matrix with partially observed, mixed-type entries in one step.
arXiv Detail & Related papers (2020-05-25T21:35:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.