Multi-Subspace Matrix Recovery from Permuted Data
- URL: http://arxiv.org/abs/2412.12931v1
- Date: Tue, 17 Dec 2024 14:07:44 GMT
- Title: Multi-Subspace Matrix Recovery from Permuted Data
- Authors: Liangqi Xie, Jicong Fan,
- Abstract summary: We develop a novel four-stage algorithm pipeline including outlier identification, subspace reconstruction, outlier classification, and unsupervised sensing for permuted vector recovery.
Our pipeline is compared with state-of-the-art competitors on multiple benchmarks and shows superior performance.
- Score: 22.653890395053207
- License:
- Abstract: This paper aims to recover a multi-subspace matrix from permuted data: given a matrix, in which the columns are drawn from a union of low-dimensional subspaces and some columns are corrupted by permutations on their entries, recover the original matrix. The task has numerous practical applications such as data cleaning, integration, and de-anonymization, but it remains challenging and cannot be well addressed by existing techniques such as robust principal component analysis because of the presence of multiple subspaces and the permutations on the elements of vectors. To solve the challenge, we develop a novel four-stage algorithm pipeline including outlier identification, subspace reconstruction, outlier classification, and unsupervised sensing for permuted vector recovery. Particularly, we provide theoretical guarantees for the outlier classification step, ensuring reliable multi-subspace matrix recovery. Our pipeline is compared with state-of-the-art competitors on multiple benchmarks and shows superior performance.
Related papers
- Enhanced Latent Multi-view Subspace Clustering [25.343388834470247]
We propose an Enhanced Latent Multi-view Subspace Clustering (ELMSC) method for recovering latent space representation.
Our proposed ELMSC is able to achieve higher clustering performance than some state-of-art multi-view clustering methods.
arXiv Detail & Related papers (2023-12-22T15:28:55Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Recovering Simultaneously Structured Data via Non-Convex Iteratively
Reweighted Least Squares [0.8702432681310401]
We propose a new algorithm for recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations.
We show that the IRLS method favorable in identifying low/comckuele state measurements.
arXiv Detail & Related papers (2023-06-08T06:35:47Z) - One-sided Matrix Completion from Two Observations Per Row [95.87811229292056]
We propose a natural algorithm that involves imputing the missing values of the matrix $XTX$.
We evaluate our algorithm on one-sided recovery of synthetic data and low-coverage genome sequencing.
arXiv Detail & Related papers (2023-06-06T22:35:16Z) - Learning idempotent representation for subspace clustering [7.6275971668447]
An ideal reconstruction coefficient matrix should have two properties: 1) it is block diagonal with each block indicating a subspace; 2) each block is fully connected.
We devise an idempotent representation (IDR) algorithm to pursue reconstruction coefficient matrices approximating normalized membership matrices.
Experiments conducted on both synthetic and real world datasets prove that IDR is an effective and efficient subspace clustering algorithm.
arXiv Detail & Related papers (2022-07-29T01:39:25Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Unlabeled Principal Component Analysis and Matrix Completion [25.663593359761336]
We introduce robust principal component analysis from a data matrix in which the entries of its columns have been corrupted by permutations.
We derive theory and algorithms of similar flavor for UPCA.
Experiments on synthetic data, face images, educational and medical records reveal the potential of our algorithms for applications such as data privatization and record linkage.
arXiv Detail & Related papers (2021-01-23T07:34:48Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Robust Low-rank Matrix Completion via an Alternating Manifold Proximal
Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity).
In this paper, motivated by some recent works on low-
arXiv Detail & Related papers (2020-08-18T04:46:22Z) - Robust Matrix Completion with Mixed Data Types [0.0]
We consider the problem of recovering a structured low rank matrix with partially observed entries with mixed data types.
Most approaches assume that there is only one underlying distribution and the low rank constraint is regularized by the matrix Schatten Norm.
We propose a computationally feasible statistical approach with strong recovery guarantees along with an algorithmic framework suited for parallelization to recover a low rank matrix with partially observed entries for mixed data types in one step.
arXiv Detail & Related papers (2020-05-25T21:35:10Z) - Granular Computing: An Augmented Scheme of Degranulation Through a
Modified Partition Matrix [86.89353217469754]
Information granules forming an abstract and efficient characterization of large volumes of numeric data have been considered as the fundamental constructs of Granular Computing.
Previous studies have shown that there is a relationship between the reconstruction error and the performance of the granulation process.
To enhance the quality of degranulation, in this study, we develop an augmented scheme through modifying the partition matrix.
arXiv Detail & Related papers (2020-04-03T03:20:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.