Empirical Bayes Linked Matrix Decomposition
- URL: http://arxiv.org/abs/2408.00237v1
- Date: Thu, 1 Aug 2024 02:13:11 GMT
- Title: Empirical Bayes Linked Matrix Decomposition
- Authors: Eric F. Lock,
- Abstract summary: We propose an empirical variational Bayesian approach to this problem.
We describe an associated iterative imputation approach that is novel for the single-matrix context.
We show that the method performs very well under different scenarios with respect to recovering underlying low-rank signal.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data for several applications in diverse fields can be represented as multiple matrices that are linked across rows or columns. This is particularly common in molecular biomedical research, in which multiple molecular "omics" technologies may capture different feature sets (e.g., corresponding to rows in a matrix) and/or different sample populations (corresponding to columns). This has motivated a large body of work on integrative matrix factorization approaches that identify and decompose low-dimensional signal that is shared across multiple matrices or specific to a given matrix. We propose an empirical variational Bayesian approach to this problem that has several advantages over existing techniques, including the flexibility to accommodate shared signal over any number of row or column sets (i.e., bidimensional integration), an intuitive model-based objective function that yields appropriate shrinkage for the inferred signals, and a relatively efficient estimation algorithm with no tuning parameters. A general result establishes conditions for the uniqueness of the underlying decomposition for a broad family of methods that includes the proposed approach. For scenarios with missing data, we describe an associated iterative imputation approach that is novel for the single-matrix context and a powerful approach for "blockwise" imputation (in which an entire row or column is missing) in various linked matrix contexts. Extensive simulations show that the method performs very well under different scenarios with respect to recovering underlying low-rank signal, accurately decomposing shared and specific signals, and accurately imputing missing data. The approach is applied to gene expression and miRNA data from breast cancer tissue and normal breast tissue, for which it gives an informative decomposition of variation and outperforms alternative strategies for missing data imputation.
Related papers
- Weakly supervised covariance matrices alignment through Stiefel matrices
estimation for MEG applications [64.20396555814513]
This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA)
We exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondence with equivalent signal variances between domains.
MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.
arXiv Detail & Related papers (2024-01-24T19:04:49Z) - Clustering Three-Way Data with Outliers [1.0435741631709405]
An approach for clustering matrix-variate normal data with outliers is discussed.
The approach, which uses the distribution of subset log-likelihoods, extends the OCLUST algorithm and uses an iterative approach to detect and trim outliers.
arXiv Detail & Related papers (2023-10-08T21:27:29Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Multiple Augmented Reduced Rank Regression for Pan-Cancer Analysis [0.0]
We propose multiple augmented reduced rank regression (maRRR), a flexible matrix regression and factorization method.
We consider a structured nuclear norm objective that is motivated by random matrix theory.
We apply maRRR to gene expression data from multiple cancer types (i.e., pan-cancer) from TCGA.
arXiv Detail & Related papers (2023-08-30T21:40:58Z) - Multi-modal Multi-view Clustering based on Non-negative Matrix
Factorization [0.0]
We propose a study on multi-modal clustering algorithms and present a novel method called multi-modal multi-view non-negative matrix factorization.
The experimental results show the value of the proposed approach, which was evaluated using a variety of data sets.
arXiv Detail & Related papers (2023-08-09T08:06:03Z) - Classification of BCI-EEG based on augmented covariance matrix [0.0]
We propose a new framework based on the augmented covariance extracted from an autoregressive model to improve motor imagery classification.
We will test our approach on several datasets and several subjects using the MOABB framework.
arXiv Detail & Related papers (2023-02-09T09:04:25Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z) - Bidimensional linked matrix factorization for pan-omics pan-cancer
analysis [0.802904964931021]
We propose a flexible approach to the simultaneous factorization and decomposition of variation across bidimensionally linked matrices, BIDIFAC+.
This decomposes variation into a series of low-rank components that may be shared across any number of row sets or column sets.
We apply BIDIFAC+ to pan-omics pan-cancer data from TCGA, identifying shared and specific modes of variability across 4 different omics platforms and 29 different cancer types.
arXiv Detail & Related papers (2020-02-07T03:11:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.