Nonnegative matrix factorization and the principle of the common cause
- URL: http://arxiv.org/abs/2509.03652v1
- Date: Wed, 03 Sep 2025 19:02:39 GMT
- Title: Nonnegative matrix factorization and the principle of the common cause
- Authors: E. Khalafyan, A. E. Allahverdyan, A. Hovhannisyan
- Abstract summary: The principle of the common cause (PCC) is a basic methodological approach in probabilistic causality. It seeks an independent mixture model for the joint probability of two dependent random variables. We show that PCC provides a predictability tool that leads to a robust estimation of the effective rank of NMF. We also show how NMF can be employed for data denoising.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nonnegative matrix factorization (NMF) is a known unsupervised data-reduction method. The principle of the common cause (PCC) is a basic methodological approach in probabilistic causality, which seeks an independent mixture model for the joint probability of two dependent random variables. It turns out that these two concepts are closely related. This relationship is explored reciprocally for several datasets of gray-scale images, which are conveniently mapped into probability models. On one hand, PCC provides a predictability tool that leads to a robust estimation of the effective rank of NMF. Unlike other estimates (e.g., those based on the Bayesian Information Criteria), our estimate of the rank is stable against weak noise. We show that NMF implemented around this rank produces features (basis images) that are also stable against noise and against seeds of local optimization, thereby effectively resolving the NMF nonidentifiability problem. On the other hand, NMF provides an interesting possibility of implementing PCC in an approximate way, where larger and positively correlated joint probabilities tend to be explained better via the independent mixture model. We work out a clustering method, where data points with the same common cause are grouped into the same cluster. We also show how NMF can be employed for data denoising.
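The link between NMF and the independent mixture model described in the abstract can be illustrated with a minimal sketch. It assumes a small synthetic joint probability table (not the gray-scale image datasets of the paper) and uses the standard Lee-Seung multiplicative updates for the KL divergence, which may differ from the optimizer the authors used. A factorization P ≈ W H of a joint table P(x, y) is rescaled into p(c), p(x|c) and p(y|c), so that P(x, y) ≈ sum over c of p(c) p(x|c) p(y|c). All function and variable names here (`nmf_kl`, `to_mixture`, the toy distributions) are hypothetical and chosen for illustration only.

```python
import numpy as np

def nmf_kl(P, rank, n_iter=500, seed=0, eps=1e-12):
    """Lee-Seung multiplicative updates for P ~ W @ H under KL divergence."""
    rng = np.random.default_rng(seed)
    n, m = P.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        R = P / (W @ H + eps)                      # elementwise ratio
        W *= (R @ H.T) / (H.sum(axis=1) + eps)     # update W
        R = P / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)  # update H
    return W, H

def to_mixture(W, H):
    """Rescale NMF factors into p(c), p(x|c), p(y|c)."""
    w = W.sum(axis=0)              # column sums of W, one per cause c
    h = H.sum(axis=1)              # row sums of H, one per cause c
    p_c = w * h
    p_c = p_c / p_c.sum()          # normalize the weights of the causes
    p_x_given_c = W / w            # each column sums to 1
    p_y_given_c = H / h[:, None]   # each row sums to 1
    return p_c, p_x_given_c, p_y_given_c

# Toy joint distribution generated by two hypothetical common causes.
p_c_true = np.array([0.4, 0.6])
p_x_c_true = np.array([[0.5, 0.3, 0.2, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 0.2, 0.3, 0.5]]).T   # shape (6, 2)
p_y_c_true = np.array([[0.6, 0.4, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.3, 0.3, 0.4]])          # shape (2, 5)
P = (p_x_c_true * p_c_true) @ p_y_c_true                    # P(x, y), sums to 1

W, H = nmf_kl(P, rank=2)
p_c, p_x_given_c, p_y_given_c = to_mixture(W, H)

# Rebuild the joint from the mixture and compare it with the
# fully independent model P(x) P(y), which ignores the common cause.
P_hat = (p_x_given_c * p_c) @ p_y_given_c
P_indep = np.outer(P.sum(axis=1), P.sum(axis=0))
err_mix = np.abs(P_hat - P).max()
err_indep = np.abs(P_indep - P).max()
print("max error, mixture model:    ", err_mix)
print("max error, independent model:", err_indep)
```

In this toy setting the NMF rank plays the role of the number of values of the common cause; choosing that rank robustly is the problem the paper addresses, and this sketch simply fixes it by hand.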
Related papers
- A Provably-Correct and Robust Convex Model for Smooth Separable NMF [16.819543808413716]
Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for nonnegative data, with applications such as hyperspectral unmixing and topic modeling. In particular, separability assumes that the basis vectors in the NMF are equal to some columns of the input matrix. We propose a convex model for SSNMF and show that it provably recovers the sought-after factors, even in the presence of noise.
arXiv Detail & Related papers (2025-11-10T13:54:27Z)
- When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs [15.617378124319472]
Multimodal large language models (MLLMs) must resolve conflicts when different modalities provide contradictory information. We introduce a new framework that decomposes modality following into two fundamental factors: relative reasoning uncertainty and inherent modality preference.
arXiv Detail & Related papers (2025-11-04T04:11:31Z)
- Random Normed k-Means: A Paradigm-Shift in Clustering within Probabilistic Metric Spaces [0.7864304771129751]
We introduce the first k-means variant in the literature that operates within a probabilistic metric space. By adopting a probabilistic perspective, our method not only introduces a fresh paradigm but also establishes a rigorous theoretical framework. Our proposed random normed k-means (RNKM) algorithm exhibits a remarkable ability to identify nonlinearly separable structures.
arXiv Detail & Related papers (2025-04-04T20:48:43Z)
- Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems [89.35169042718739]
Collaborative inference enables end users to leverage powerful deep learning models without exposure of sensitive raw data to cloud servers. Recent studies have revealed that these intermediate features may not sufficiently preserve privacy, as information can be leaked and raw data can be reconstructed via model inversion attacks (MIAs). This work first theoretically proves that the conditional entropy of inputs given intermediate features provides a guaranteed lower bound on the reconstruction mean square error (MSE) under any MIA. Then, we derive a differentiable and solvable measure for bounding this conditional entropy based on the Gaussian mixture estimation and propose a conditional entropy algorithm to enhance the inversion robustness.
arXiv Detail & Related papers (2025-03-01T07:15:21Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Finding Rule-Interpretable Non-Negative Data Representation [1.2430809884830318]
We present a version of the NMF approach that merges rule-based descriptions with advantages of part-based representation. In addition to revealing important attributes for latent factors, their interaction and value ranges, this approach allows performing focused embedding.
arXiv Detail & Related papers (2022-06-03T10:20:46Z)
- BayesIMP: Uncertainty Quantification for Causal Data Fusion [52.184885680729224]
We study the causal data fusion problem, where datasets pertaining to multiple causal graphs are combined to estimate the average treatment effect of a target variable.
We introduce a framework which combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in the reproducing kernel Hilbert space.
arXiv Detail & Related papers (2021-06-07T10:14:18Z)
- Recovery of Joint Probability Distribution from one-way marginals: Low rank Tensors and Random Projections [2.9929093132587763]
Joint probability mass function (PMF) estimation is a fundamental machine learning problem.
In this work, we link random projections of data to the problem of PMF estimation using ideas from tomography.
We provide a novel algorithm for recovering factors of the tensor from one-way marginals, test it across a variety of synthetic and real-world datasets, and also perform MAP inference on the estimated model for classification.
arXiv Detail & Related papers (2021-03-22T14:00:57Z)
- Probabilistic Simplex Component Analysis [66.30587591100566]
PRISM is a probabilistic simplex component analysis approach to identifying the vertices of a data-circumscribing simplex from data.
The problem has a rich variety of applications, the most notable being hyperspectral unmixing in remote sensing and non-negative matrix factorization in machine learning.
arXiv Detail & Related papers (2021-03-18T05:39:00Z)
- Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z)
- Sparse Separable Nonnegative Matrix Factorization [22.679160149512377]
We propose a new variant of nonnegative matrix factorization (NMF).
Separability requires that the columns of the first NMF factor are equal to columns of the input matrix, while sparsity requires that the columns of the second NMF factor are sparse.
We prove that, in noiseless settings and under mild assumptions, our algorithm recovers the true underlying sources.
arXiv Detail & Related papers (2020-06-13T03:52:29Z)
- Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs where the base density to output space mapping is conditioned on an input x, to model conditional densities p(y|x).
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.