a-Tucker: Input-Adaptive and Matricization-Free Tucker Decomposition for Dense Tensors on CPUs and GPUs
- URL: http://arxiv.org/abs/2010.10131v1
- Date: Tue, 20 Oct 2020 08:52:14 GMT
- Title: a-Tucker: Input-Adaptive and Matricization-Free Tucker Decomposition for Dense Tensors on CPUs and GPUs
- Authors: Min Li and Chuanfu Xiao and Chao Yang
- Abstract summary: a-Tucker is a new framework for input-adaptive and matricization-free Tucker decomposition of dense tensors.
A machine-learning adaptive solver selector is applied to automatically cope with the variations of both the input data and the hardware.
- Score: 6.308492837096872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tucker decomposition is one of the most popular models for analyzing and
compressing large-scale tensorial data. Existing Tucker decomposition
algorithms usually rely on a single solver to compute the factor matrices and
core tensor, and are not flexible enough to adapt to the diversity of the
input data and the hardware. Moreover, to exploit highly efficient GEMM
kernels, most Tucker decomposition implementations make use of explicit
matricizations, which could introduce extra costs in terms of data conversion
and memory usage. In this paper, we present a-Tucker, a new framework for
input-adaptive and matricization-free Tucker decomposition of dense tensors. A
mode-wise flexible Tucker decomposition algorithm is proposed to enable switching
between different solvers for the factor matrices and core tensor, and a
machine-learning adaptive solver selector is applied to automatically cope with
the variations of both the input data and the hardware. To further improve the
performance and enhance the memory efficiency, we implement a-Tucker in a fully
matricization-free manner without any conversion between tensors and matrices.
Experiments with a variety of synthetic and real-world tensors show that
a-Tucker can substantially outperform existing works on both CPUs and GPUs.
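To make the mode-wise structure concrete, below is a minimal NumPy sketch of a standard HOOI-style Tucker decomposition in which the mode-n tensor-times-matrix (TTM) products are computed with einsum instead of explicit unfoldings of the input tensor. The helper names ttm and hooi are illustrative only; this is not a-Tucker's algorithm or code (there is no solver switching or learned selector here, and the factor update still reshapes the small projected tensor before its SVD). It simply shows the per-mode loop into which a framework like a-Tucker can plug different factor and core solvers.

```python
# Minimal HOOI-style Tucker sketch in NumPy (illustrative only; not a-Tucker).
import numpy as np

def ttm(tensor, matrix, mode):
    """Mode-`mode` tensor-times-matrix product via einsum (no explicit unfolding).
    `matrix` has shape (new_dim, tensor.shape[mode])."""
    letters = "abcdefghijklmnop"
    t_sub = letters[:tensor.ndim]
    m_sub = "z" + t_sub[mode]
    out_sub = t_sub[:mode] + "z" + t_sub[mode + 1:]
    return np.einsum(f"{t_sub},{m_sub}->{out_sub}", tensor, matrix)

def hooi(X, ranks, n_iter=10, seed=0):
    """Return (core, factors) with X ≈ core ×_1 U1 ×_2 U2 ... (higher-order
    orthogonal iteration)."""
    rng = np.random.default_rng(seed)
    nd = X.ndim
    # Random orthonormal initialization (HOSVD init is the usual alternative).
    factors = [np.linalg.qr(rng.standard_normal((X.shape[n], ranks[n])))[0]
               for n in range(nd)]
    for _ in range(n_iter):
        for n in range(nd):
            # Project X by all factors except mode n.
            Y = X
            for m in range(nd):
                if m != n:
                    Y = ttm(Y, factors[m].T, m)
            # Leading left singular vectors of the mode-n unfolding of the
            # (small) projected tensor; this is one possible factor solver.
            Yn = np.moveaxis(Y, n, 0).reshape(Y.shape[n], -1)
            U, _, _ = np.linalg.svd(Yn, full_matrices=False)
            factors[n] = U[:, :ranks[n]]
    core = X
    for n in range(nd):
        core = ttm(core, factors[n].T, n)
    return core, factors

# Usage: recover a synthetic rank-(5, 5, 5) tensor of shape 40 x 50 x 60.
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 5, 5))
for n, size in enumerate((40, 50, 60)):
    X = ttm(X, rng.standard_normal((size, 5)), n)
core, factors = hooi(X, ranks=(5, 5, 5))
X_hat = core
for n, U in enumerate(factors):
    X_hat = ttm(X_hat, U, n)
print("relative error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

The einsum-based TTM is the part that avoids materializing unfoldings of the full input tensor, which is the kind of conversion overhead the matricization-free implementation targets; a-Tucker additionally lets each mode's factor and core solver be chosen adaptively per input and hardware.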
Related papers
- Compute Better Spent: Replacing Dense Layers with Structured Matrices [77.61728033234233]
We identify more efficient alternatives to dense matrices, as exemplified by the success of convolutional networks in the image domain.
We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance.
We propose the Block-Train, a novel matrix family containing Monarch matrices, which we show performs better than dense matrices for the same compute on multiple tasks.
arXiv Detail & Related papers (2024-06-10T13:25:43Z)
- HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression [69.36555801766762]
We propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of possible decompositions.
We experimentally show that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss.
arXiv Detail & Related papers (2022-11-30T05:31:45Z)
- Tucker-O-Minus Decomposition for Multi-view Tensor Subspace Clustering [36.790637575875635]
We propose a new tensor decomposition called Tucker-O-Minus Decomposition (TOMD) for multi-view clustering.
Numerical experiments on six benchmark data sets demonstrate the superiority of our proposed method in terms of F-score, precision, recall, normalized mutual information, adjusted rand index, and accuracy.
arXiv Detail & Related papers (2022-10-23T07:20:22Z)
- Softmax-free Linear Transformers [90.83157268265654]
Vision transformers (ViTs) have pushed the state-of-the-art for visual perception tasks.
Existing linear-complexity approximations of self-attention are either theoretically flawed or empirically ineffective for visual recognition.
We propose a family of Softmax-Free Transformers (SOFT).
arXiv Detail & Related papers (2022-07-05T03:08:27Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- SOFT: Softmax-free Transformer with Linear Complexity [112.9754491864247]
Vision transformers (ViTs) have pushed the state-of-the-art for various visual recognition tasks by patch-wise image tokenization followed by self-attention.
Various attempts on approximating the self-attention with linear complexity have been made in Natural Language Processing.
We identify that their limitations are rooted in keeping the softmax self-attention during approximations.
We propose SOFT, a softmax-free transformer with linear complexity.
arXiv Detail & Related papers (2021-10-22T17:57:29Z)
- Fast Low-Rank Tensor Decomposition by Ridge Leverage Score Sampling [5.740578698172382]
We study Tucker decompositions and use tools from randomized numerical linear algebra called ridge leverage scores.
We show how to use approximate ridge leverage scores to construct a sketched instance for any ridge regression problem.
We demonstrate the effectiveness of our approximate ridge regression algorithm for large, low-rank Tucker decompositions on both synthetic and real-world data. (A generic sketch of leverage-score sampling for ridge regression is given after this list.)
arXiv Detail & Related papers (2021-07-22T13:32:47Z)
- Low-Rank and Sparse Enhanced Tucker Decomposition for Tensor Completion [3.498620439731324]
We introduce a unified low-rank and sparse enhanced Tucker decomposition model for tensor completion.
Our model possesses a sparse regularization term to promote a sparse core tensor, which is beneficial for tensor data compression.
Notably, our model can handle different types of real-world data sets, since it exploits the potential periodicity and inherent correlation properties that appear in tensors.
arXiv Detail & Related papers (2020-10-01T12:45:39Z)
- Tensor Relational Algebra for Machine Learning System Design [7.764107702934616]
We present an alternative implementation abstraction called the tensor relational algebra (TRA).
TRA is a set-based algebra based on the relational algebra.
Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML in distributed clusters.
arXiv Detail & Related papers (2020-09-01T15:51:24Z)
- Spectral Learning on Matrices and Tensors [74.88243719463053]
We show that tensor decomposition can pick up latent effects that are missed by matrix methods.
We also outline computational techniques to design efficient tensor decomposition methods.
arXiv Detail & Related papers (2020-04-16T22:53:00Z)
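For the ridge-leverage-score entry above (Fast Low-Rank Tensor Decomposition by Ridge Leverage Score Sampling), the sketch below is a self-contained NumPy illustration of the underlying technique: sample rows of a single ridge regression problem proportionally to their ridge leverage scores, then solve the rescaled sketched problem. The helper names ridge_leverage_scores and sketched_ridge are illustrative; the cited paper uses approximate scores and applies the idea inside the ALS subproblems of a Tucker decomposition, neither of which is reproduced here.

```python
# Generic ridge-leverage-score sampling for a ridge regression problem
# (illustrative sketch, not the cited paper's algorithm).
import numpy as np

def ridge_leverage_scores(A, lam):
    """Exact ridge leverage scores tau_i = a_i^T (A^T A + lam*I)^{-1} a_i."""
    d = A.shape[1]
    K = A.T @ A + lam * np.eye(d)
    Z = np.linalg.solve(K, A.T).T        # Z = A K^{-1}, shape (n, d)
    return np.einsum("ij,ij->i", A, Z)   # row-wise inner products

def sketched_ridge(A, b, lam, num_samples, seed=0):
    """Sample rows proportionally to ridge leverage scores, rescale them,
    and solve the sketched ridge regression via its normal equations."""
    rng = np.random.default_rng(seed)
    tau = ridge_leverage_scores(A, lam)
    p = tau / tau.sum()
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=p)
    scale = 1.0 / np.sqrt(num_samples * p[idx])
    SA = scale[:, None] * A[idx]
    Sb = scale * b[idx]
    d = A.shape[1]
    return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

# Usage: a tall least-squares problem solved from ~5% of its rows.
rng = np.random.default_rng(2)
A = rng.standard_normal((20000, 50))
x_true = rng.standard_normal(50)
b = A @ x_true + 0.01 * rng.standard_normal(20000)
x_hat = sketched_ridge(A, b, lam=1.0, num_samples=1000)
print("parameter error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```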
This list is automatically generated from the titles and abstracts of the papers on this site.