Robust Bayesian Nonnegative Matrix Factorization with Implicit
Regularizers
- URL: http://arxiv.org/abs/2208.10053v1
- Date: Mon, 22 Aug 2022 04:34:17 GMT
- Title: Robust Bayesian Nonnegative Matrix Factorization with Implicit
Regularizers
- Authors: Jun Lu, Christine P. Chai
- Abstract summary: We introduce a probabilistic model with implicit norm regularization for learning nonnegative matrix factorization (NMF).
We evaluate the model on several real-world datasets including Genomics of Drug Sensitivity in Cancer.
- Score: 4.913248451323163
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a probabilistic model with implicit norm regularization for
learning nonnegative matrix factorization (NMF) that is commonly used for
predicting missing values and finding hidden patterns in the data, in which the
matrix factors are latent variables associated with each data dimension. The
nonnegativity constraint on the latent factors is handled by choosing priors
with support on the nonnegative subspace, e.g., the exponential density or
distributions based on the exponential function. A Bayesian inference procedure
based on Gibbs sampling is employed. We evaluate the model on several real-world
datasets including Genomics of Drug Sensitivity in Cancer (GDSC $IC_{50}$) and
Gene body methylation with different sizes and dimensions, and show that the
proposed Bayesian NMF GL$_2^2$ and GL$_\infty$ models lead to robust
predictions for different data values and avoid overfitting compared with
competitive Bayesian NMF approaches.
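The abstract's recipe (exponential priors on the nonnegative factors, Gibbs sampling for inference) can be sketched in a few lines. This is a minimal illustration, not the paper's exact model: it assumes a Gaussian likelihood with precision `tau` and i.i.d. Exp(`lam`) priors on both factor matrices, under which each conditional posterior is a normal truncated to $[0, \infty)$.

```python
import numpy as np
from scipy.stats import truncnorm

def gibbs_nmf(X, K, n_iter=200, tau=1.0, lam=1.0, seed=0):
    """Toy Gibbs sampler for Bayesian NMF with exponential priors.

    Assumed model (an illustration, not the paper's specification):
      X_mn ~ N((W H)_mn, 1/tau),  W_mk ~ Exp(lam),  H_kn ~ Exp(lam).
    Each conditional posterior is a normal truncated at zero, which
    enforces the nonnegativity constraint on the latent factors.
    """
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.exponential(1.0 / lam, (M, K))
    H = rng.exponential(1.0 / lam, (K, N))
    for _ in range(n_iter):
        for k in range(K):
            # Residual with component k removed
            R = X - W @ H + np.outer(W[:, k], H[k, :])
            # Sample column W[:, k] from its truncated-normal conditional
            prec = tau * np.sum(H[k, :] ** 2) + 1e-12
            mu = (tau * R @ H[k, :] - lam) / prec
            sd = 1.0 / np.sqrt(prec)
            W[:, k] = truncnorm.rvs((0.0 - mu) / sd, np.inf,
                                    loc=mu, scale=sd, random_state=rng)
            # Sample row H[k, :] symmetrically
            R = X - W @ H + np.outer(W[:, k], H[k, :])
            prec = tau * np.sum(W[:, k] ** 2) + 1e-12
            mu = (tau * W[:, k] @ R - lam) / prec
            sd = 1.0 / np.sqrt(prec)
            H[k, :] = truncnorm.rvs((0.0 - mu) / sd, np.inf,
                                    loc=mu, scale=sd, random_state=rng)
    return W, H
```

Because the priors place all mass on the nonnegative orthant, every Gibbs draw stays nonnegative by construction; missing entries could be handled by masking them out of the residual terms.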
Related papers
- A Graphical Model for Fusing Diverse Microbiome Data [2.385985842958366]
We introduce a flexible multinomial-Gaussian generative model for jointly modeling such count data.
We present a computationally scalable variational Expectation-Maximization (EM) algorithm for inferring the latent variables and the parameters of the model.
arXiv Detail & Related papers (2022-08-21T17:54:39Z)
- Comparative Study of Inference Methods for Interpolative Decomposition [4.913248451323163]
We propose a probabilistic model with automatic relevance determination (ARD) for learning interpolative decomposition (ID).
We evaluate the model on a variety of real-world datasets including CCLE $EC50$, CCLE $IC50$, Gene Body Methylation, and Promoter Methylation datasets with different sizes and dimensions.
arXiv Detail & Related papers (2022-06-29T11:37:05Z)
- Bayesian Low-Rank Interpolative Decomposition for Complex Datasets [4.913248451323163]
We introduce a probabilistic model for learning interpolative decomposition (ID), which is commonly used for feature selection, low-rank approximation, and identifying hidden patterns in data.
We evaluate the model on a variety of real-world datasets including CCLE EC50, CCLE IC50, CTRP EC50, and MovieLens 100K datasets with different sizes and dimensions.
arXiv Detail & Related papers (2022-05-30T03:06:48Z)
- Flexible and Hierarchical Prior for Bayesian Nonnegative Matrix Factorization [4.913248451323163]
We introduce a probabilistic model for learning nonnegative matrix factorization (NMF).
We evaluate the model on several real-world datasets including MovieLens 100K and MovieLens 1M with different sizes and dimensions.
arXiv Detail & Related papers (2022-05-23T03:51:55Z)
- Log-based Sparse Nonnegative Matrix Factorization for Data Representation [55.72494900138061]
Nonnegative matrix factorization (NMF) has been widely studied in recent years due to its effectiveness in representing nonnegative data with parts-based representations.
We propose a new NMF method with a log-norm imposed on the factor matrices to enhance sparseness.
A novel column-wise sparse norm, named the $\ell_{2,\log}$-(pseudo) norm, is proposed to enhance the robustness of the proposed method.
arXiv Detail & Related papers (2022-04-22T11:38:10Z)
- Optimal regularizations for data generation with probabilistic graphical models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of $L_2$ and $L_1$ regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
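In MAP inference, an $L_2$ penalty corresponds to a Gaussian prior and an $L_1$ penalty to a Laplace prior. A toy one-parameter example (estimating a Gaussian mean, not the paper's graphical model) shows how the two regularizers change the estimator: $L_2$ shrinks it toward zero, while $L_1$ soft-thresholds it.

```python
import numpy as np

def map_mean(x, lam, penalty="l2"):
    """MAP estimate of a Gaussian mean under an L2 (Gaussian prior)
    or L1 (Laplace prior) penalty -- a toy illustration only.

    Objective: argmin_mu 0.5 * sum((x - mu)**2) + lam * pen(mu),
    with pen(mu) = mu**2 / 2 for L2 and |mu| for L1.
    """
    n, xbar = len(x), np.mean(x)
    if penalty == "l2":
        # Gaussian prior => closed-form shrinkage toward zero
        return n * xbar / (n + lam)
    # Laplace prior => soft thresholding of the sample mean
    return np.sign(xbar) * max(abs(xbar) - lam / n, 0.0)
```

The same qualitative split carries over to higher-dimensional models: $L_1$ can drive parameters exactly to zero, while $L_2$ only shrinks them.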
arXiv Detail & Related papers (2021-12-02T14:45:16Z)
- Entropy Minimizing Matrix Factorization [102.26446204624885]
Nonnegative Matrix Factorization (NMF) is a widely-used data analysis technique, and has yielded impressive results in many real-world tasks.
In this study, an Entropy Minimizing Matrix Factorization framework (EMMF) is developed to tackle the above problem.
Considering that the outliers are usually much less than the normal samples, a new entropy loss function is established for matrix factorization.
arXiv Detail & Related papers (2021-03-24T21:08:43Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.