Randomized Algorithms for Symmetric Nonnegative Matrix Factorization
- URL: http://arxiv.org/abs/2402.08134v1
- Date: Tue, 13 Feb 2024 00:02:05 GMT
- Title: Randomized Algorithms for Symmetric Nonnegative Matrix Factorization
- Authors: Koby Hayashi, Sinan G. Aksoy, Grey Ballard, Haesun Park
- Abstract summary: Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data analysis and machine learning.
We develop two randomized algorithms for its computation.
We show that our methods approximately maintain solution quality and achieve significant speed ups for both large dense and large sparse problems.
- Score: 2.1753766244387402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data
analysis and machine learning that approximates a symmetric matrix with a
product of a nonnegative, low-rank matrix and its transpose. To design faster
and more scalable algorithms for SymNMF we develop two randomized algorithms
for its computation. The first algorithm uses randomized matrix sketching to
compute an initial low-rank input matrix and proceeds to use this input to
rapidly compute a SymNMF. The second algorithm uses randomized leverage score
sampling to approximately solve constrained least squares problems. Many
successful methods for SymNMF rely on (approximately) solving sequences of
constrained least squares problems. We prove theoretically that leverage score
sampling can approximately solve nonnegative least squares problems to a chosen
accuracy with high probability. Finally we demonstrate that both methods work
well in practice by applying them to graph clustering tasks on large real world
data sets. These experiments show that our methods approximately maintain
solution quality and achieve significant speed ups for both large dense and
large sparse problems.
Related papers
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z) - Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method to solve large-scale multimedia retrieval issue.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z) - Matrix Reordering for Noisy Disordered Matrices: Optimality and
Computationally Efficient Algorithms [9.245687221460654]
Motivated by applications in single-cell biology and metagenomics, we investigate the problem of matrixing based on a noisy monotone Toeplitz matrix model.
We establish fundamental statistical limit for this problem in a decision-theoretic framework and demonstrate that a constrained least squares rate.
To address this, we propose a novel-time adaptive sorting algorithm with guaranteed performance improvement.
arXiv Detail & Related papers (2022-01-17T14:53:52Z) - Fast Projected Newton-like Method for Precision Matrix Estimation under
Total Positivity [15.023842222803058]
Current algorithms are designed using the block coordinate descent method or the proximal point algorithm.
We propose a novel algorithm based on the two-metric projection method, incorporating a carefully designed search direction and variable partitioning scheme.
Experimental results on synthetic and real-world datasets demonstrate that our proposed algorithm provides a significant improvement in computational efficiency compared to the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-03T14:39:10Z) - Self-supervised Symmetric Nonnegative Matrix Factorization [82.59905231819685]
Symmetric nonnegative factor matrix (SNMF) has demonstrated to be a powerful method for data clustering.
Inspired by ensemble clustering that aims to seek better clustering results, we propose self-supervised SNMF (S$3$NMF)
We take advantage of the sensitivity to code characteristic of SNMF, without relying on any additional information.
arXiv Detail & Related papers (2021-03-02T12:47:40Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z) - Optimal Iterative Sketching with the Subsampled Randomized Hadamard
Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rate for Haar and randomized Hadamard matrices are identical, andally improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.