Tensor Relational Algebra for Machine Learning System Design
- URL: http://arxiv.org/abs/2009.00524v3
- Date: Mon, 9 Aug 2021 08:35:09 GMT
- Title: Tensor Relational Algebra for Machine Learning System Design
- Authors: Binhang Yuan and Dimitrije Jankov and Jia Zou and Yuxin Tang and
Daniel Bourgeois and Chris Jermaine
- Abstract summary: We present an alternative implementation abstraction called the tensor relational algebra (TRA).
The TRA is a set-based algebra built on the relational algebra.
Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
- Score: 7.764107702934616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the question: what is the abstraction that should be implemented
by the computational engine of a machine learning system? Current machine
learning systems typically push whole tensors through a series of compute
kernels such as matrix multiplications or activation functions, where each
kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation
abstraction provides little built-in support for ML systems to scale past a
single machine, or for handling large models with matrices or tensors that do
not easily fit into the RAM of an ASIC. In this paper, we present an
alternative implementation abstraction called the tensor relational algebra
(TRA). The TRA is a set-based algebra based on the relational algebra.
Expressions in the TRA operate over binary tensor relations, where keys are
multi-dimensional arrays and values are tensors. The TRA is easily executed
with high efficiency in a parallel or distributed environment, and amenable to
automatic optimization. Our empirical study shows that the optimized TRA-based
back-end can significantly outperform alternatives for running ML workflows in
distributed clusters.
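To make the binary tensor relation abstraction concrete, the following is a minimal Python/NumPy sketch; the dictionary-based representation, block size, and function names are illustrative assumptions, not the paper's actual API. Each relation maps an integer key vector to a dense tensor block, and a blocked matrix multiply is expressed relationally as a join on the shared inner index, a per-block compute kernel, and an aggregation grouped by the output key.

```python
import numpy as np

# A binary tensor relation maps an integer key vector to a dense tensor
# block. Here a matrix is stored as a relation keyed by block coordinates.
def to_relation(M, block):
    rel = {}
    rows, cols = M.shape[0] // block, M.shape[1] // block
    for i in range(rows):
        for k in range(cols):
            rel[(i, k)] = M[i * block:(i + 1) * block, k * block:(k + 1) * block]
    return rel

# Blocked matrix multiply expressed relationally: join the two relations on
# the shared inner index, apply a matmul kernel per joined pair, then
# aggregate (sum) the partial products grouped by the output key (i, j).
def tra_matmul(RA, RB):
    out = {}
    for (i, k), a in RA.items():
        for (k2, j), b in RB.items():
            if k == k2:                                    # join on inner index
                out[(i, j)] = out.get((i, j), 0) + a @ b   # kernel + aggregation
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
    C = tra_matmul(to_relation(A, 2), to_relation(B, 2))
    C_dense = np.block([[C[(0, 0)], C[(0, 1)]], [C[(1, 0)], C[(1, 1)]]])
    assert np.allclose(C_dense, A @ B)
```

Because each step is a set-at-a-time operation over keyed blocks, the relation can be partitioned across machines by key, which is what makes this style of expression amenable to distributed execution and automatic optimization as the abstract describes.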
Related papers
- Compressing Structured Tensor Algebra [1.2624532490634646]
DASTAC is a framework that propagates the tensors' captured high-level structure down to low-level code generation.
Our methodology reduces memory footprint by automatically detecting the best data layout.
We show that DASTAC achieves 1 to 2 orders of magnitude speedup over TACO, a state-of-the-art sparse tensor compiler, and StructTensor, a state-of-the-art structured tensor algebra compiler.
arXiv Detail & Related papers (2024-07-18T17:25:17Z) - Compute Better Spent: Replacing Dense Layers with Structured Matrices [77.61728033234233]
We identify more efficient alternatives to dense matrices, as exemplified by the success of convolutional networks in the image domain.
We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance.
We propose a novel matrix family containing Monarch matrices, the Block-Train, which we show performs better than dense for the same compute on multiple tasks.
arXiv Detail & Related papers (2024-06-10T13:25:43Z) - CoLA: Exploiting Compositional Structure for Automatic and Efficient
Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory- and runtime-efficient numerical algorithms (a minimal sketch of this dispatch idea follows the list below).
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs)
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Performance and Energy Consumption of Parallel Machine Learning
Algorithms [0.0]
Machine learning models have achieved remarkable success in various real-world applications.
Model training in machine learning requires large-scale data sets and multiple iterations before it can work properly.
Parallelization of training algorithms is a common strategy to speed up the process of training.
arXiv Detail & Related papers (2023-05-01T13:04:39Z) - TensorIR: An Abstraction for Automatic Tensorized Program Optimization [22.812702519665617]
We present TensorIR, a compiler for optimizing programs with tensor computation primitives.
We build an end-to-end framework on top of our compilation to automatically optimize deep learning models for given tensor computation primitives.
arXiv Detail & Related papers (2022-07-09T16:28:57Z) - Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch [1.1470070927586016]
We develop benchmarks to investigate the linear algebra optimization capabilities of TF and PyT.
In this work, we focus on linear algebra computations in TF and PyT.
arXiv Detail & Related papers (2022-02-20T18:51:00Z) - Spectral Learning on Matrices and Tensors [74.88243719463053]
We show that tensor decomposition can pick up latent effects that are missed by matrix methods.
We also outline computational techniques to design efficient tensor decomposition methods.
arXiv Detail & Related papers (2020-04-16T22:53:00Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z) - Sketching Transformed Matrices with Applications to Natural Language
Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
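As referenced in the CoLA entry above, here is a minimal, hypothetical Python sketch of the compositional-dispatch idea; the class and function names are illustrative assumptions, not CoLA's actual API. A solve routine dispatches on the operator's structure, so a diagonal operator gets an O(n) elementwise solve while the generic case falls back to a dense factorization.

```python
import numpy as np

class LinearOperator:
    """Generic linear operator; the fallback rule is a dense solve."""
    def __init__(self, A):
        self.A = np.asarray(A)
    def solve(self, b):
        return np.linalg.solve(self.A, b)

class Diagonal(LinearOperator):
    """Structured operator: exploiting the structure gives an O(n) solve."""
    def __init__(self, d):
        self.d = np.asarray(d)
        super().__init__(np.diag(self.d))
    def solve(self, b):
        return b / self.d

def solve(op, b):
    # Dispatch on the operator's structure; in a full system, rules for
    # sums, products, Kronecker factors, etc. would compose recursively.
    return op.solve(b)

if __name__ == "__main__":
    d = np.array([2.0, 4.0, 8.0])
    b = np.ones(3)
    assert np.allclose(solve(Diagonal(d), b),
                       solve(LinearOperator(np.diag(d)), b))
```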
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.