AMULET: Adaptive Matrix-Multiplication-Like Tasks
- URL: http://arxiv.org/abs/2305.08872v1
- Date: Fri, 12 May 2023 17:04:24 GMT
- Title: AMULET: Adaptive Matrix-Multiplication-Like Tasks
- Authors: Junyoung Kim, Kenneth Ross, Eric Sedlar, Lukas Stadler
- Abstract summary: We extend an open-source compiler to recognize and optimize matrix multiplication-like tasks.
Our framework, called Amulet, uses both database-style and compiler optimization techniques.
Amulet typically performs within 15% of hand-tuned matrix multiplication libraries, while handling a much broader class of computations.
- Score: 6.094431019524036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many useful tasks in data science and machine learning applications can be
written as simple variations of matrix multiplication. However, users have
difficulty performing such tasks because existing matrix/vector libraries
support only a limited class of computations hand-tuned for each unique
hardware platform. Users can alternatively write the task as a simple nested
loop, but
current compilers are not sophisticated enough to generate fast code for the
task written in this way. To address these issues, we extend an open-source
compiler to recognize and optimize these matrix multiplication-like tasks. Our
framework, called Amulet, uses both database-style and compiler optimization
techniques to generate fast code tailored to its execution environment. We show
through experiments that Amulet achieves speedups on a variety of matrix
multiplication-like tasks compared to existing compilers. For large matrices
Amulet typically performs within 15% of hand-tuned matrix multiplication
libraries, while handling a much broader class of computations.
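To make the setting concrete, below is a minimal sketch, in Java, of a matrix-multiplication-like task written as a simple nested loop: it keeps the triple-loop shape of C = A * B but guards the inner update with a database-style filter. The language, sizes, and predicate are illustrative assumptions, not taken from the paper.

public class MatMulLikeTask {
    public static void main(String[] args) {
        int n = 4;
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        double[][] c = new double[n][n];
        // Fill the inputs with simple deterministic values.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        // The matrix-multiplication-like loop: an ordinary compiler sees a
        // plain loop nest, while a matmul-aware compiler can tile and
        // vectorize it like a matrix product.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < n; k++)
                    if (a[i][k] > 0)                  // database-style filter
                        c[i][j] += a[i][k] * b[k][j];
        System.out.println(java.util.Arrays.deepToString(c));
    }
}

A hand-tuned BLAS routine cannot express the filter, and a generic compiler does not recognize the loop nest as a matrix product; a framework like Amulet targets exactly this gap.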
Related papers
- Masked Matrix Multiplication for Emergent Sparsity [1.4786952412297807]
Transformer models exhibit emergent sparsity, in which computations make selective, sparse accesses to dense data.
We build a vectorized and parallel matrix-multiplication system computing A × B = C that eliminates unnecessary computations.
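As a rough illustration of the idea (a scalar sketch only; the paper's system is vectorized and parallel, and its masking scheme is an assumption here), the entire inner product can be skipped wherever a mask rules an output out:

public class MaskedMatMul {
    // Compute C = A * B only where mask[i][j] is true, skipping the inner
    // product entirely for masked-out outputs.
    static double[][] multiply(double[][] a, double[][] b, boolean[][] mask) {
        int n = a.length, p = b[0].length, m = b.length;
        double[][] c = new double[n][p];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < p; j++) {
                if (!mask[i][j]) continue;  // eliminate unnecessary computation
                double s = 0;
                for (int t = 0; t < m; t++)
                    s += a[i][t] * b[t][j];
                c[i][j] = s;
            }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        boolean[][] mask = {{true, false}, {false, true}};
        System.out.println(java.util.Arrays.deepToString(multiply(a, b, mask)));
    }
}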
arXiv Detail & Related papers (2024-02-21T20:36:08Z)
- CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
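The dispatch idea can be sketched in a few lines (a toy in Java, not CoLA's actual API, which is a Python library): each operator knows how to solve against itself, and a composition builds its solve from the solves of its parts.

interface LinOp { double[] solve(double[] b); }

// A diagonal operator: its structure allows an O(n) solve by division.
class Diagonal implements LinOp {
    final double[] d;
    Diagonal(double[] d) { this.d = d; }
    public double[] solve(double[] b) {
        double[] x = new double[b.length];
        for (int i = 0; i < b.length; i++) x[i] = b[i] / d[i];
        return x;
    }
}

// A lazy product A * B: its solve is composed from its parts, since
// (A B)^{-1} r = B^{-1} (A^{-1} r), without ever materializing A * B.
class Composed implements LinOp {
    final LinOp a, b;
    Composed(LinOp a, LinOp b) { this.a = a; this.b = b; }
    public double[] solve(double[] r) { return b.solve(a.solve(r)); }
}

class Demo {
    public static void main(String[] args) {
        LinOp op = new Composed(new Diagonal(new double[]{2, 4}),
                                new Diagonal(new double[]{1, 0.5}));
        // Solves (A B) x = r; prints [1.0, 1.0].
        System.out.println(java.util.Arrays.toString(op.solve(new double[]{2, 2})));
    }
}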
arXiv Detail & Related papers (2023-09-06T14:59:38Z)
- Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
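For reference, the classical (unshifted, single-matrix) QR iteration that such methods build on looks as follows; the batching and GPU-oriented refinements of the paper are not reflected in this sketch.

public class QrIteration {
    // Repeatedly factor A = Q R (classical Gram-Schmidt) and set A <- R Q;
    // for a symmetric matrix the diagonal converges to the eigenvalues.
    static double[] eigenvalues(double[][] a, int iters) {
        int n = a.length;
        for (int it = 0; it < iters; it++) {
            double[][] q = new double[n][n], r = new double[n][n];
            for (int j = 0; j < n; j++) {
                double[] v = new double[n];
                for (int i = 0; i < n; i++) v[i] = a[i][j];
                for (int k = 0; k < j; k++) {
                    for (int i = 0; i < n; i++) r[k][j] += q[i][k] * a[i][j];
                    for (int i = 0; i < n; i++) v[i] -= r[k][j] * q[i][k];
                }
                for (int i = 0; i < n; i++) r[j][j] += v[i] * v[i];
                r[j][j] = Math.sqrt(r[j][j]);
                for (int i = 0; i < n; i++) q[i][j] = v[i] / r[j][j];
            }
            // A <- R Q is a similarity transform, preserving eigenvalues.
            double[][] next = new double[n][n];
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    for (int k = 0; k < n; k++)
                        next[i][j] += r[i][k] * q[k][j];
            a = next;
        }
        double[] eig = new double[n];
        for (int i = 0; i < n; i++) eig[i] = a[i][i];
        return eig;
    }

    public static void main(String[] args) {
        double[][] a = {{2, 1}, {1, 2}};        // eigenvalues 3 and 1
        System.out.println(java.util.Arrays.toString(eigenvalues(a, 50)));
    }
}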
arXiv Detail & Related papers (2022-07-09T09:14:12Z)
- Efficient GPU implementation of randomized SVD and its applications [17.71779625877989]
Matrix decompositions are ubiquitous in machine learning, with applications in dimensionality reduction, data compression and deep learning algorithms.
Typical solutions for matrix decompositions have polynomial complexity, which significantly increases their computational cost and time.
We leverage efficient processing operations that can be run in parallel on modern Graphical Processing Units (GPUs) to reduce the computational burden of computing matrix decompositions.
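The core of randomized SVD (in the style of Halko et al.) is a sketch-then-factor scheme whose heavy steps are exactly the large, parallel matrix products a GPU handles well. A scalar sketch of the first stage follows; the rank k, the Gaussian test matrix, and the plain Gram-Schmidt step are illustrative assumptions, and the small dense SVD of the result is left to any standard solver.

import java.util.Random;

public class RandSvdSketch {
    // Returns B = Q^T A, where Q orthonormalizes Y = A * Omega (m x k).
    // A randomized SVD then takes the small dense SVD of B and maps it back.
    static double[][] sketch(double[][] a, int k, long seed) {
        int m = a.length, n = a[0].length;
        Random rnd = new Random(seed);
        // Gaussian test matrix Omega (n x k).
        double[][] omega = new double[n][k];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++) omega[i][j] = rnd.nextGaussian();
        // Y = A * Omega: a random projection that captures A's range.
        double[][] y = new double[m][k];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < k; j++)
                for (int t = 0; t < n; t++) y[i][j] += a[i][t] * omega[t][j];
        // Orthonormalize the columns of Y in place (modified Gram-Schmidt).
        for (int j = 0; j < k; j++) {
            for (int p = 0; p < j; p++) {
                double dot = 0;
                for (int i = 0; i < m; i++) dot += y[i][p] * y[i][j];
                for (int i = 0; i < m; i++) y[i][j] -= dot * y[i][p];
            }
            double norm = 0;
            for (int i = 0; i < m; i++) norm += y[i][j] * y[i][j];
            norm = Math.sqrt(norm);
            for (int i = 0; i < m; i++) y[i][j] /= norm;
        }
        // B = Q^T A (k x n): a small matrix whose SVD approximates A's.
        double[][] b = new double[k][n];
        for (int i = 0; i < k; i++)
            for (int j = 0; j < n; j++)
                for (int t = 0; t < m; t++) b[i][j] += y[t][i] * a[t][j];
        return b;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {2, 4, 6}, {1, 1, 1}, {3, 6, 9}};
        System.out.println(java.util.Arrays.deepToString(sketch(a, 2, 42)));
    }
}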
arXiv Detail & Related papers (2021-10-05T07:42:41Z)
- Multiplying Matrices Without Multiplying [0.0]
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning.
We introduce a learning-based algorithm for this task that greatly outperforms existing methods.
arXiv Detail & Related papers (2021-06-21T05:08:54Z)
- A matrix math facility for Power ISA(TM) processors [0.16910097443356495]
A new family of matrix math instructions, collectively known as the Matrix-Multiply Assist facility, has been introduced in Power ISA(TM) Version 3.1.
These instructions have led to a power- and area-efficient implementation of a high-throughput math engine in the future POWER10 processor.
Performance per core is 4 times better, at constant frequency, than that of the previous-generation POWER9 processor.
arXiv Detail & Related papers (2021-04-07T14:17:32Z)
- What if Neural Networks had SVDs? [66.91160214071088]
Various neural networks employ time-consuming matrix operations, such as matrix inversion.
We present an algorithm that is fast enough to speed up several such matrix operations.
arXiv Detail & Related papers (2020-09-29T12:58:52Z)
- Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far they could hardly be used in large-scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
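One of the algorithmic ideas named above, random projections, can be sketched in isolation: random Fourier features (Rahimi and Recht) replace an RBF kernel with an explicit low-dimensional feature map, so kernel methods scale to many points. The feature count D and bandwidth gamma below are illustrative assumptions; the paper's actual solver combines several such ideas on GPU hardware.

import java.util.Random;

public class RandomFeatures {
    // Map x to z(x) so that z(x) . z(y) approximates the RBF kernel
    // exp(-gamma * ||x - y||^2)  (random Fourier features).
    static double[] featurize(double[] x, double[][] w, double[] b) {
        int D = w.length;
        double[] z = new double[D];
        for (int i = 0; i < D; i++) {
            double dot = b[i];
            for (int j = 0; j < x.length; j++) dot += w[i][j] * x[j];
            z[i] = Math.sqrt(2.0 / D) * Math.cos(dot);
        }
        return z;
    }

    public static void main(String[] args) {
        int dim = 3, D = 2048;
        double gamma = 0.5;                       // illustrative bandwidth
        Random rnd = new Random(0);
        double[][] w = new double[D][dim];
        double[] b = new double[D];
        for (int i = 0; i < D; i++) {
            b[i] = rnd.nextDouble() * 2 * Math.PI;               // random phase
            for (int j = 0; j < dim; j++)
                w[i][j] = rnd.nextGaussian() * Math.sqrt(2 * gamma); // RBF spectrum
        }
        double[] x = {1, 0, 0}, y = {0.8, 0.1, 0};
        double[] zx = featurize(x, w, b), zy = featurize(y, w, b);
        double approx = 0;
        for (int i = 0; i < D; i++) approx += zx[i] * zy[i];
        double d2 = 0;
        for (int j = 0; j < dim; j++) d2 += (x[j] - y[j]) * (x[j] - y[j]);
        System.out.printf("approx=%.4f exact=%.4f%n", approx, Math.exp(-gamma * d2));
    }
}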
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
- PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid approach, combining the compiler with minimal library use, achieves state-of-the-art performance.
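The flavor of transformation such compilers derive automatically can be shown by hand: below is a tiled matrix-multiplication loop nest in which data-reuse analysis motivates blocking so that T x T tiles stay in cache. The tile size and loop order are illustrative assumptions; PolyDL's actual code generation and search are far more involved.

public class TiledMatMul {
    // C += A * B with T x T tiling; each tile of A and B is reused many
    // times while resident in cache, which is what reuse analysis exposes.
    static void multiply(double[][] a, double[][] b, double[][] c, int T) {
        int n = c.length;
        for (int ii = 0; ii < n; ii += T)
            for (int kk = 0; kk < n; kk += T)
                for (int jj = 0; jj < n; jj += T)
                    for (int i = ii; i < Math.min(ii + T, n); i++)
                        for (int k = kk; k < Math.min(kk + T, n); k++) {
                            double aik = a[i][k];
                            for (int j = jj; j < Math.min(jj + T, n); j++)
                                c[i][j] += aik * b[k][j];
                        }
    }

    public static void main(String[] args) {
        int n = 4;
        double[][] a = new double[n][n], b = new double[n][n], c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { a[i][j] = 1; b[i][j] = j; }
        multiply(a, b, c, 2);
        System.out.println(java.util.Arrays.deepToString(c)); // rows: 0, 4, 8, 12
    }
}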
arXiv Detail & Related papers (2020-06-02T06:44:09Z)
- Sketching Transformed Matrices with Applications to Natural Language Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
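A generic form of the sketching idea (not the paper's specific algorithm for transformed matrices) approximates a matrix product A^T B by (S A)^T (S B), where S is a small random-sign sketch; in expectation S^T S is the identity, so accuracy trades off against a much smaller working set. The sizes and the sign sketch below are illustrative assumptions.

import java.util.Random;

public class SketchedProduct {
    public static void main(String[] args) {
        int n = 500, d = 4, s = 100;             // n rows, sketched down to s
        Random rnd = new Random(1);
        double[][] a = new double[n][d], b = new double[n][d];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < d; j++) {
                a[i][j] = rnd.nextGaussian();
                b[i][j] = a[i][j] + 0.1 * rnd.nextGaussian(); // correlated inputs
            }
        // Apply S (s x n, entries +-1/sqrt(s)) to both matrices row by row.
        double[][] sa = new double[s][d], sb = new double[s][d];
        for (int r = 0; r < s; r++)
            for (int i = 0; i < n; i++) {
                double w = (rnd.nextBoolean() ? 1 : -1) / Math.sqrt(s);
                for (int j = 0; j < d; j++) {
                    sa[r][j] += w * a[i][j];
                    sb[r][j] += w * b[i][j];
                }
            }
        // Compare one entry of (S A)^T (S B) against the exact A^T B.
        double exact = 0, approx = 0;
        for (int i = 0; i < n; i++) exact += a[i][0] * b[i][0];
        for (int r = 0; r < s; r++) approx += sa[r][0] * sb[r][0];
        System.out.printf("exact=%.1f approx=%.1f%n", exact, approx);
    }
}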
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.