Tensor Relational Algebra for Machine Learning System Design
- URL: http://arxiv.org/abs/2009.00524v3
- Date: Mon, 9 Aug 2021 08:35:09 GMT
- Title: Tensor Relational Algebra for Machine Learning System Design
- Authors: Binhang Yuan and Dimitrije Jankov and Jia Zou and Yuxin Tang and
Daniel Bourgeois and Chris Jermaine
- Abstract summary: We present an alternative implementation abstraction called the tensor relational algebra (TRA).
The TRA is a set-based algebra built on the relational algebra.
Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
- Score: 7.764107702934616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the question: what is the abstraction that should be implemented
by the computational engine of a machine learning system? Current machine
learning systems typically push whole tensors through a series of compute
kernels such as matrix multiplications or activation functions, where each
kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation
abstraction provides little built-in support for ML systems to scale past a
single machine, or for handling large models with matrices or tensors that do
not easily fit into the RAM of an ASIC. In this paper, we present an
alternative implementation abstraction called the tensor relational algebra
(TRA). The TRA is a set-based algebra based on the relational algebra.
Expressions in the TRA operate over binary tensor relations, where keys are
multi-dimensional arrays and values are tensors. The TRA is easily executed
with high efficiency in a parallel or distributed environment, and amenable to
automatic optimization. Our empirical study shows that the optimized TRA-based
back-end can significantly outperform alternatives for running ML workflows in
distributed clusters.
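To make the binary tensor relation abstraction concrete, the following is a minimal Python/NumPy sketch; the dictionary-based representation, block size, and function names are illustrative assumptions, not the paper's actual API. Each relation maps an integer key vector to a dense tensor block, and a blocked matrix multiply is expressed relationally as a join on the shared inner index, a per-block compute kernel, and an aggregation grouped by the output key.

```python
import numpy as np

# A binary tensor relation maps an integer key vector to a dense tensor
# block. Here a matrix is stored as a relation keyed by block coordinates.
def to_relation(M, block):
    rel = {}
    rows, cols = M.shape[0] // block, M.shape[1] // block
    for i in range(rows):
        for k in range(cols):
            rel[(i, k)] = M[i * block:(i + 1) * block, k * block:(k + 1) * block]
    return rel

# Blocked matrix multiply expressed relationally: join the two relations on
# the shared inner index, apply a matmul kernel per joined pair, then
# aggregate (sum) the partial products grouped by the output key (i, j).
def tra_matmul(RA, RB):
    out = {}
    for (i, k), a in RA.items():
        for (k2, j), b in RB.items():
            if k == k2:                                    # join on inner index
                out[(i, j)] = out.get((i, j), 0) + a @ b   # kernel + aggregation
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
    C = tra_matmul(to_relation(A, 2), to_relation(B, 2))
    C_dense = np.block([[C[(0, 0)], C[(0, 1)]], [C[(1, 0)], C[(1, 1)]]])
    assert np.allclose(C_dense, A @ B)
```

Because each step is a set-at-a-time operation over keyed blocks, the relation can be partitioned across machines by key, which is what makes this style of expression amenable to distributed execution and automatic optimization as the abstract describes.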
Related papers
- Compressing Structured Tensor Algebra [1.2624532490634646]
DASTAC is a framework that propagates the tensors' captured high-level structure down to low-level code generation.
Our methodology reduces memory footprint by automatically detecting the best data layout.
We show that DASTAC achieves 1 to 2 orders of magnitude speedup over TACO, a state-of-the-art sparse tensor compiler, and StructTensor, a state-of-the-art structured tensor algebra compiler.
arXiv Detail & Related papers (2024-07-18T17:25:17Z) - Compute Better Spent: Replacing Dense Layers with Structured Matrices [77.61728033234233]
We identify more efficient alternatives to dense matrices, as exemplified by the success of convolutional networks in the image domain.
We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance.
We propose a novel matrix family containing Monarch matrices, the Block-Train, which we show performs better than dense for the same compute on multiple tasks.
arXiv Detail & Related papers (2024-06-10T13:25:43Z) - CoLA: Exploiting Compositional Structure for Automatic and Efficient
Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory- and runtime-efficient numerical algorithms (a minimal sketch of this dispatch idea follows the list below).
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs)
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Performance and Energy Consumption of Parallel Machine Learning
Algorithms [0.0]
Machine learning models have achieved remarkable success in various real-world applications.
Model training in machine learning requires large-scale data sets and multiple iterations before it can work properly.
Parallelization of training algorithms is a common strategy to speed up the process of training.
arXiv Detail & Related papers (2023-05-01T13:04:39Z) - TensorIR: An Abstraction for Automatic Tensorized Program Optimization [22.812702519665617]
We present TensorIR, a compiler for optimizing programs with tensor computation primitives.
We build an end-to-end framework on top of our compilation to automatically optimize deep learning models for given tensor computation primitives.
arXiv Detail & Related papers (2022-07-09T16:28:57Z) - Benchmarking the Linear Algebra Awareness of TensorFlow and PyTorch [1.1470070927586016]
We develop benchmarks to investigate the linear algebra optimization capabilities of TF and PyT.
In this work, we focus on linear algebra computations in TF and PyT.
arXiv Detail & Related papers (2022-02-20T18:51:00Z) - Spectral Learning on Matrices and Tensors [74.88243719463053]
We show that tensor decomposition can pick up latent effects that are missed by matrix methods.
We also outline computational techniques to design efficient tensor decomposition methods.
arXiv Detail & Related papers (2020-04-16T22:53:00Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z) - Sketching Transformed Matrices with Applications to Natural Language
Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z)
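As referenced in the CoLA entry above, here is a minimal, hypothetical Python sketch of the compositional-dispatch idea; the class and function names are illustrative assumptions, not CoLA's actual API. A solve routine dispatches on the operator's structure, so a diagonal operator gets an O(n) elementwise solve while the generic case falls back to a dense factorization.

```python
import numpy as np

class LinearOperator:
    """Generic linear operator; the fallback rule is a dense solve."""
    def __init__(self, A):
        self.A = np.asarray(A)
    def solve(self, b):
        return np.linalg.solve(self.A, b)

class Diagonal(LinearOperator):
    """Structured operator: exploiting the structure gives an O(n) solve."""
    def __init__(self, d):
        self.d = np.asarray(d)
        super().__init__(np.diag(self.d))
    def solve(self, b):
        return b / self.d

def solve(op, b):
    # Dispatch on the operator's structure; in a full system, rules for
    # sums, products, Kronecker factors, etc. would compose recursively.
    return op.solve(b)

if __name__ == "__main__":
    d = np.array([2.0, 4.0, 8.0])
    b = np.ones(3)
    assert np.allclose(solve(Diagonal(d), b),
                       solve(LinearOperator(np.diag(d)), b))
```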
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.