Computing rank-revealing factorizations of matrices stored out-of-core
- URL: http://arxiv.org/abs/2002.06960v2
- Date: Wed, 4 Mar 2020 12:18:40 GMT
- Title: Computing rank-revealing factorizations of matrices stored out-of-core
- Authors: Nathan Heavner, Per-Gunnar Martinsson, Gregorio Quintana-Ortí
- Abstract summary: The paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in RAM.
The new algorithms are almost as fast when processing data stored on a hard drive as traditional algorithms are for data stored in main memory.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes efficient algorithms for computing rank-revealing
factorizations of matrices that are too large to fit in RAM, and must instead
be stored on slow external memory devices such as solid-state or spinning disk
hard drives (out-of-core or out-of-memory). Traditional algorithms for
computing rank-revealing factorizations, such as the column-pivoted QR
factorization, or techniques for computing a full singular value decomposition
of a matrix, are very communication intensive. They are naturally expressed as
a sequence of matrix-vector operations, which become prohibitively expensive
when data is not available in main memory. Randomization allows these methods
to be reformulated so that large contiguous blocks of the matrix can be
processed in bulk. The paper describes two distinct methods. The first is a
blocked version of column-pivoted Householder QR, organized as a "left-looking"
method to minimize the number of write operations (which are more expensive
than read operations on a spinning disk drive). The second method results in a
so-called UTV factorization, which expresses a matrix $A$ as $A = U T V^*$ where
$U$ and $V$ are unitary, and $T$ is triangular. This method is organized as an
algorithm-by-blocks, in which floating point operations overlap read and write
operations. The second method incorporates power iterations, and is
exceptionally good at revealing the numerical rank; it can often be used as a
substitute for a full singular value decomposition. Numerical experiments
demonstrate that the new algorithms are almost as fast when processing data
stored on a hard drive as traditional algorithms are for data stored in main
memory. To be precise, the computational time for fully factorizing an $n\times
n$ matrix scales as $cn^{3}$, with a scaling constant $c$ that is only
marginally larger when the matrix is stored out of core.
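
The key enabler in both methods is that randomized sketching turns the rank-revealing step into dense block operations that can be read from and written to disk in bulk. The snippet below is a minimal, in-core sketch (in NumPy) of that building block: a Gaussian sketch with a couple of power iterations produces orthonormal transforms for one block, whose triangular factor tracks the leading singular values. Function and variable names are illustrative only and not the paper's API; the actual out-of-core algorithms additionally stream blocks from disk and overlap computation with read/write operations.

```python
# Minimal, in-core sketch of the randomized building block behind a
# blocked UTV-style factorization. Illustrative only; not the paper's code.

import numpy as np

def randomized_utv_step(A, b, q=2, rng=None):
    """Compute orthonormal block transforms for the leading b directions of A.

    A : (m, n) array, assumed to fit in RAM here (the paper's algorithm-by-blocks
        instead streams such blocks from disk).
    b : block size.
    q : number of power iterations; q = 1 or 2 already sharpens rank revelation.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    m, n = A.shape
    # Sketch the row space of A with a Gaussian random matrix.
    Y = A.T @ (A @ rng.standard_normal((n, b)))   # one power iteration built in
    for _ in range(q - 1):
        Y = A.T @ (A @ Y)                         # additional power iterations
    V_blk, _ = np.linalg.qr(Y)                    # orthonormal right transform, n x b
    # Rotate A from the right, then orthonormalize the resulting columns from the left.
    U_blk, R = np.linalg.qr(A @ V_blk)            # U_blk: m x b, R: b x b
    return U_blk, R, V_blk

if __name__ == "__main__":
    # Tiny usage example on a synthetic low-rank-plus-noise matrix.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))
    A += 1e-6 * rng.standard_normal(A.shape)
    U_blk, R, V_blk = randomized_utv_step(A, b=50, q=2, rng=rng)
    # The singular values of the small triangular factor R should closely
    # track the leading singular values of A.
    print(np.linalg.svd(R, compute_uv=False)[:5])
    print(np.linalg.svd(A, compute_uv=False)[:5])
```

Note that every operation in the sketch is a large matrix-matrix product or a small dense QR, which is why the approach remains fast when the blocks of $A$ live on disk rather than in main memory.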
Related papers
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
arXiv Detail & Related papers (2022-07-09T09:14:12Z) - A Structured Sparse Neural Network and Its Matrix Calculations Algorithm [0.0]
We introduce a nonsymmetric, tridiagonal matrix with sparse off-diagonal entries and offset sub- and super-diagonals.
For the cases where the matrix inverse does not exist, a least square type pseudoinverse is provided.
Results show significant improvement in computational costs, especially when the size of the matrix increases.
arXiv Detail & Related papers (2022-07-02T19:38:48Z) - Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root.
For the forward propagation, one method is to use Matrix Taylor Polynomial (MTP), and the other method is to use Matrix Padé Approximants (MPA).
A series of numerical tests show that both methods yield considerable speed-up compared with the SVD or the NS iteration.
arXiv Detail & Related papers (2022-01-29T10:00:35Z) - Fast Differentiable Matrix Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root.
For the forward propagation, one method is to use Matrix Taylor Polynomial (MTP).
The other method is to use Matrix Padé Approximants (MPA).
arXiv Detail & Related papers (2022-01-21T12:18:06Z) - Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
arXiv Detail & Related papers (2021-12-17T17:04:34Z) - Sparse Factorization of Large Square Matrices [10.94053598642913]
In this paper, we propose to approximate a large square matrix with a product of sparse full-rank matrices.
In the approximation, our method needs only $N(\log N)^2$ non-zero numbers for an $N \times N$ full matrix.
We show that our method gives a better approximation when the approximated matrix is sparse and high-rank.
arXiv Detail & Related papers (2021-09-16T18:42:21Z) - Multiplying Matrices Without Multiplying [0.0]
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning.
We introduce a learning-based algorithm for this task that greatly outperforms existing methods.
arXiv Detail & Related papers (2021-06-21T05:08:54Z) - Constant-Depth and Subcubic-Size Threshold Circuits for Matrix
Multiplication [1.9518237361775532]
Recent advances in large-scale neural computing hardware have made their practical implementation a near-term possibility.
We describe a theoretical approach for multiplying two $N$ by $N$ matrices that integrates threshold gate logic.
Dense matrix multiplication is a core operation in convolutional neural network training.
arXiv Detail & Related papers (2020-06-25T18:28:10Z) - Sketching Transformed Matrices with Applications to Natural Language
Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.