On the Connection Between Non-negative Matrix Factorization and Latent Dirichlet Allocation
- URL: http://arxiv.org/abs/2405.20542v1
- Date: Thu, 30 May 2024 23:54:17 GMT
- Title: On the Connection Between Non-negative Matrix Factorization and Latent Dirichlet Allocation
- Authors: Benedikt Geiger, Peter J. Park,
- Abstract summary: Non-negative matrix factorization with the generalized Kullback-Leibler divergence (NMF) and latent Dirichlet allocation (LDA) are two popular approaches for dimensionality reduction of non-negative data.
We show that NMF with normalization constraints on the columns of both matrices of the decomposition and a Dirichlet prior on the columns of one matrix is equivalent to LDA.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-negative matrix factorization with the generalized Kullback-Leibler divergence (NMF) and latent Dirichlet allocation (LDA) are two popular approaches for dimensionality reduction of non-negative data. Here, we show that NMF with $\ell_1$ normalization constraints on the columns of both matrices of the decomposition and a Dirichlet prior on the columns of one matrix is equivalent to LDA. To show this, we demonstrate that explicitly accounting for the scaling ambiguity of NMF by adding $\ell_1$ normalization constraints to the optimization problem allows a joint update of both matrices in the widely used multiplicative updates (MU) algorithm. When both of the matrices are normalized, the joint MU algorithm leads to probabilistic latent semantic analysis (PLSA), which is LDA without a Dirichlet prior. Our approach of deriving joint updates for NMF also reveals that a Lasso penalty on one matrix together with an $\ell_1$ normalization constraint on the other matrix is insufficient to induce any sparsity.
Related papers
- Semi-supervised Symmetric Non-negative Matrix Factorization with Low-Rank Tensor Representation [27.14442336413482]
Semi-supervised symmetric non-negative matrix factorization (SNMF)
We propose a novel SNMF model by seeking low-rank representation for the tensor synthesized by the pairwise constraint matrix.
We then propose an enhanced SNMF model, making the embedding matrix tailored to the above tensor low-rank representation.
arXiv Detail & Related papers (2024-05-04T14:58:47Z) - A Novel Maximum-Entropy-Driven Technique for Low-Rank Orthogonal
Nonnegative Matrix Factorization with $\ell_0$-Norm sparsity Constraint [0.0]
In data-driven control and machine learning, a common requirement involves breaking down large matrices into smaller, low-rank factors.
This paper introduces an innovative solution to the orthogonal nonnegative matrix factorization (ONMF) problem.
The proposed method achieves comparable or improved reconstruction errors in line with the literature.
arXiv Detail & Related papers (2022-10-06T04:30:59Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Log-based Sparse Nonnegative Matrix Factorization for Data
Representation [55.72494900138061]
Nonnegative matrix factorization (NMF) has been widely studied in recent years due to its effectiveness in representing nonnegative data with parts-based representations.
We propose a new NMF method with log-norm imposed on the factor matrices to enhance the sparseness.
A novel column-wisely sparse norm, named $ell_2,log$-(pseudo) norm, is proposed to enhance the robustness of the proposed method.
arXiv Detail & Related papers (2022-04-22T11:38:10Z) - Self-supervised Symmetric Nonnegative Matrix Factorization [82.59905231819685]
Symmetric nonnegative factor matrix (SNMF) has demonstrated to be a powerful method for data clustering.
Inspired by ensemble clustering that aims to seek better clustering results, we propose self-supervised SNMF (S$3$NMF)
We take advantage of the sensitivity to code characteristic of SNMF, without relying on any additional information.
arXiv Detail & Related papers (2021-03-02T12:47:40Z) - Column $\ell_{2,0}$-norm regularized factorization model of low-rank
matrix recovery and its computation [0.9281671380673306]
This paper is concerned with the column $ell_2,0$regularized factorization model of low-rank computation problems.
Numerical experiments are conducted with synthetic and real data examples.
arXiv Detail & Related papers (2020-08-24T14:15:36Z) - Robust Low-rank Matrix Completion via an Alternating Manifold Proximal
Gradient Continuation Method [47.80060761046752]
Robust low-rank matrix completion (RMC) has been studied extensively for computer vision, signal processing and machine learning applications.
This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix.
A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity).
In this paper, motivated by some recent works on low-
arXiv Detail & Related papers (2020-08-18T04:46:22Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Information-Theoretic Limits for the Matrix Tensor Product [8.206394018475708]
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z) - Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC)
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation.
arXiv Detail & Related papers (2020-03-30T08:40:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.