Fast Matrix Multiplication Without Tears: A Constraint Programming
Approach
- URL: http://arxiv.org/abs/2306.01097v2
- Date: Mon, 17 Jul 2023 15:14:53 GMT
- Title: Fast Matrix Multiplication Without Tears: A Constraint Programming
Approach
- Authors: Arnaud Deza, Chang Liu, Pashootan Vaezipoor, Elias B. Khalil
- Abstract summary: It is known that the multiplication of an $N \times M$ matrix with an $M \times P$ matrix can be performed using fewer multiplications than what the naive $NMP$ approach suggests.
This gives rise to the constraint satisfaction problem of fast matrix multiplication.
We propose a simple yet novel Constraint Programming approach to find non-commutative algorithms for fast matrix multiplication.
- Score: 8.52818380743467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is known that the multiplication of an $N \times M$ matrix with an $M
\times P$ matrix can be performed using fewer multiplications than what the
naive $NMP$ approach suggests. The most famous instance of this is Strassen's
algorithm for multiplying two $2\times 2$ matrices in 7 instead of 8
multiplications. This gives rise to the constraint satisfaction problem of fast
matrix multiplication, where a set of $R < NMP$ multiplication terms must be
chosen and combined such that they satisfy correctness constraints on the
output matrix. Despite its highly combinatorial nature, this problem has not
been exhaustively examined from that perspective, as evidenced for example by
the recent deep reinforcement learning approach of AlphaTensor. In this work,
we propose a simple yet novel Constraint Programming approach to find
non-commutative algorithms for fast matrix multiplication or provide proof of
infeasibility otherwise. We propose a set of symmetry-breaking constraints and
valid inequalities that are particularly helpful in proving infeasibility. On
the feasible side, we find that exploiting solver performance variability in
conjunction with a sparsity-based problem decomposition enables finding
solutions for larger (feasible) instances of fast matrix multiplication. Our
experimental results using CP Optimizer demonstrate that we can find fast
matrix multiplication algorithms for matrices up to $3\times 3$ in a short
amount of time.
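As a concrete illustration of the correctness constraints (a minimal Python sketch, not the paper's CP model), the following plugs Strassen's well-known coefficients into the $R = 7$ multiplication terms for the $2 \times 2$ case and checks that they reproduce the exact product:

```python
# Minimal sketch: check that Strassen's 7 products (R = 7 < NMP = 8)
# satisfy the correctness constraints on the 2x2 output matrix.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))

# Each product multiplies a linear combination of entries of A
# with a linear combination of entries of B.
m1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
m2 = (A[1, 0] + A[1, 1]) * B[0, 0]
m3 = A[0, 0] * (B[0, 1] - B[1, 1])
m4 = A[1, 1] * (B[1, 0] - B[0, 0])
m5 = (A[0, 0] + A[0, 1]) * B[1, 1]
m6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
m7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])

# Each output entry is a linear combination of the 7 products.
C = np.array([[m1 + m4 - m5 + m7, m3 + m5],
              [m2 + m4, m1 - m2 + m3 + m6]])

assert np.allclose(C, A @ B)
```

In the CP formulation, the coefficients of these linear combinations are the decision variables, and the solver searches for an assignment satisfying the analogous correctness constraints, or proves that none exists for a given $R$.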
Related papers
- Optimized Inference for 1.58-bit LLMs: A Time and Memory-Efficient Algorithm for Binary and Ternary Matrix Multiplication [8.779871128906787]
Large Language Models (LLMs) suffer from inference inefficiency despite relying on advanced computational infrastructure.
We propose algorithms to improve the inference time and memory efficiency of 1.58-bit LLMs with ternary weight matrices.
Our results confirm the approach's superiority in both time and memory: we observed reductions of up to 29x in inference time and up to 6x in memory usage.
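To see why ternary weights remove the need for scalar multiplications (a generic illustration, not the paper's optimized algorithm), note that with entries restricted to $\{-1, 0, +1\}$ a matrix-vector product reduces to additions and subtractions:

```python
# Generic sketch: a matrix-vector product with ternary weights in
# {-1, 0, +1} needs only additions and subtractions.
import numpy as np

def ternary_matvec(W, x):
    """Compute W @ x using no scalar multiplications."""
    y = np.zeros(W.shape[0], dtype=x.dtype)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            if W[i, j] == 1:
                y[i] += x[j]
            elif W[i, j] == -1:
                y[i] -= x[j]
            # a zero weight contributes nothing
    return y

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))  # ternary weight matrix
x = rng.standard_normal(8)
assert np.allclose(ternary_matvec(W, x), W @ x)
```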
arXiv Detail & Related papers (2024-11-10T04:56:14Z)
- Optimal Quantization for Matrix Multiplication [35.007966885532724]
We present a universal quantizer based on nested lattices with an explicit guarantee of approximation error for any (non-random) pair of matrices $A$, $B$ in terms of only the Frobenius norms $\|A\|_F$, $\|B\|_F$, and $\|A^\top B\|_F$.
arXiv Detail & Related papers (2024-10-17T17:19:48Z)
- Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems [9.30306458153248]
We consider the spectral tail condition number, $\kappa_\ell$, defined as the ratio between the $\ell$-th largest and the smallest singular value of the matrix representing the system.
Some of the implications of our result, and of the use of $\kappa_\ell$, include a direct improvement over a fine-grained analysis of the Conjugate Gradient method.
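The definition above translates directly into code; this minimal sketch (with a hypothetical helper name) computes $\kappa_\ell$ from the singular values:

```python
# Minimal sketch: the spectral tail condition number kappa_ell is the
# ratio of the ell-th largest to the smallest singular value.
import numpy as np

def spectral_tail_condition_number(A, ell):
    s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
    return s[ell - 1] / s[-1]

A = np.diag([100.0, 10.0, 2.0, 1.0])
print(spectral_tail_condition_number(A, ell=1))  # 100.0: classical condition number
print(spectral_tail_condition_number(A, ell=3))  # 2.0: ignores the top singular values
```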
arXiv Detail & Related papers (2024-05-09T14:56:49Z)
- Quantum Time-Space Tradeoffs for Matrix Problems [0.5524804393257919]
We consider the time and space required for quantum computers to solve a range of problems involving matrices.
For almost all matrices $A$, we prove that quantum circuits with at most $T$ input queries and $S$ qubits of memory require $T = \Omega(n^2/S)$.
Because many of our lower bounds match deterministic algorithms with the same time and space complexity, we show that quantum computers cannot provide any advantage for these problems with any space bound.
arXiv Detail & Related papers (2024-01-10T18:38:43Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
In several simulations, our method scales better than existing approaches in both computation time and memory.
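The diagonal-estimation idea can be illustrated with a generic stochastic estimator in the style of Bekas et al. (a sketch of the general technique, not the paper's exact scheme): it recovers $\mathrm{diag}(A)$ from matrix-vector products alone, the same access pattern the conjugate gradient algorithm needs, so $A$ never has to be formed explicitly.

```python
# Generic sketch: stochastic diagonal estimation from matvecs only.
import numpy as np

def estimate_diagonal(matvec, n, num_probes=500, seed=0):
    rng = np.random.default_rng(seed)
    acc = np.zeros(n)
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        acc += v * matvec(v)                 # elementwise v * (A v)
    return acc / num_probes                  # E[v * (A v)] = diag(A)

rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T                                  # SPD test matrix
d = estimate_diagonal(lambda v: A @ v, n=50)
rel_err = np.abs(d - np.diag(A)).max() / np.abs(np.diag(A)).max()
print(rel_err)                               # shrinks as num_probes grows
```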
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root.
For the forward propagation, one method uses a Matrix Taylor Polynomial (MTP), while the other uses Matrix Padé Approximants (MPA).
A series of numerical tests show that both methods yield considerable speed-up compared with the SVD or the NS iteration.
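As a rough sketch of the Matrix Taylor Polynomial idea (a generic truncated series for an SPD input, not the authors' exact implementation): normalize $A$ so the series converges, then expand $(I + E)^{1/2}$ with generalized binomial coefficients.

```python
# Generic sketch: truncated Taylor/binomial series for the principal
# square root of an SPD matrix A.  Writing A = ||A||_F (I + E) with
# E = A/||A||_F - I (spectral radius < 1 for SPD A), we have
# sqrt(A) = sqrt(||A||_F) (I + E)^{1/2}.
import numpy as np

def sqrtm_taylor(A, num_terms=40):
    n = A.shape[0]
    norm = np.linalg.norm(A)          # Frobenius norm
    E = A / norm - np.eye(n)
    S = np.eye(n)                     # partial sum of the series
    term = np.eye(n)                  # current power of E
    coeff = 1.0                       # generalized binomial C(1/2, k)
    for k in range(1, num_terms):
        coeff *= (0.5 - (k - 1)) / k  # recurrence from C(1/2, k-1)
        term = term @ E
        S += coeff * term
    return np.sqrt(norm) * S

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)           # well-conditioned SPD matrix
R = sqrtm_taylor(A)
print(np.linalg.norm(R @ R - A) / np.linalg.norm(A))  # small residual
```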
arXiv Detail & Related papers (2022-01-29T10:00:35Z)
- Fast Differentiable Matrix Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root.
For the forward propagation, one method uses a Matrix Taylor Polynomial (MTP); the other uses Matrix Padé Approximants (MPA).
arXiv Detail & Related papers (2022-01-21T12:18:06Z)
- Multiplying Matrices Without Multiplying [0.0]
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning.
We introduce a learning-based algorithm for this task that greatly outperforms existing methods.
arXiv Detail & Related papers (2021-06-21T05:08:54Z)
- Non-PSD Matrix Sketching with Applications to Regression and Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-root" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z)
- What if Neural Networks had SVDs? [66.91160214071088]
Various Neural Networks employ time-consuming matrix operations like matrix inversion.
We present an algorithm that is fast enough to speed up several matrix operations.
arXiv Detail & Related papers (2020-09-29T12:58:52Z)
- Sketching Transformed Matrices with Applications to Natural Language Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)