Related papers: Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores

Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores

URL: http://arxiv.org/abs/2311.13693v1
Date: Wed, 22 Nov 2023 21:04:59 GMT
Title: Scalable CP Decomposition for Tensor Learning using GPU Tensor Cores
Authors: Zeliang Zhang, Zhuo Liu, Susan Liang, Zhiyuan Wang, Yifan Zhu, Chen Ding, Chenliang Xu
Abstract summary: We propose a compression-based tensor decomposition framework, namely the exascale-tensor, to support exascale tensor decomposition. Compared to the baselines, the exascale-tensor supports 8,000x larger tensors and a speedup up to 6.95x. We also apply our method to two real-world applications, including gene analysis and tensor layer neural networks.
Score: 47.87810316745786
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: CP decomposition is a powerful tool for data science, especially gene analysis, deep learning, and quantum computation. However, the application of tensor decomposition is largely hindered by the exponential increment of the computational complexity and storage consumption with the size of tensors. While the data in our real world is usually presented as trillion- or even exascale-scale tensors, existing work can only support billion-scale scale tensors. In our work, we propose the Exascale-Tensor to mitigate the significant gap. Specifically, we propose a compression-based tensor decomposition framework, namely the exascale-tensor, to support exascale tensor decomposition. Then, we carefully analyze the inherent parallelism and propose a bag of strategies to improve computational efficiency. Last, we conduct experiments to decompose tensors ranging from million-scale to trillion-scale for evaluation. Compared to the baselines, the exascale-tensor supports 8,000x larger tensors and a speedup up to 6.95x. We also apply our method to two real-world applications, including gene analysis and tensor layer neural networks, of which the numeric results demonstrate the scalability and effectiveness of our method.

Related papers

Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition [93.98343072306619]
We present Navier-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights. Across various PDE tasks, Navier-GaLore achieves substantial memory savings, reducing memory usage by up to 75%.
arXiv Detail & Related papers (2025-01-04T20:51:51Z)
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression [87.5604418100301]
Key-value( KV) caching is an important technique to accelerate the inference of large language models. Existing methods often compromise precision or require extra data for calibration. We introduce textbfDecoQuant, a novel data-free low-bit quantization technique based on tensor decomposition methods.
arXiv Detail & Related papers (2024-05-21T08:35:10Z)
Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery [52.21846313876592]
Low-rank tensor function representation (LRTFR) can continuously represent data beyond meshgrid with infinite resolution. We develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization. Our method substantiates the superiority and versatility of our method as compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-12-01T04:00:38Z)
Latent Matrices for Tensor Network Decomposition and to Tensor Completion [8.301418317685906]
We propose a novel higher-order tensor decomposition model that decomposes the tensor into smaller ones and speeds up the computation of the algorithm. Three optimization algorithms, LMTN-PAM, LMTN-SVD and LMTN-AR, have been developed and applied to the tensor-completion task. Experimental results show that our LMTN-SVD algorithm is 3-6 times faster than the FCTN-PAM algorithm and only a 1.8 points accuracy drop.
arXiv Detail & Related papers (2022-10-07T08:19:50Z)
Error Analysis of Tensor-Train Cross Approximation [88.83467216606778]
We provide accuracy guarantees in terms of the entire tensor for both exact and noisy measurements. Results are verified by numerical experiments, and may have important implications for the usefulness of cross approximations for high-order tensors.
arXiv Detail & Related papers (2022-07-09T19:33:59Z)
Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent [30.299284742925852]
This paper tackles tensor robust principal component analysis (RPCA) It aims to recover a low-rank tensor from its observations contaminated by sparse corruptions. We show that the proposed algorithm achieves better and more scalable performance than state-of-the-art matrix and tensor RPCA algorithms.
arXiv Detail & Related papers (2022-06-18T04:01:32Z)
Augmented Tensor Decomposition with Stochastic Optimization [46.16865811396394]
Real-world tensor data are usually high-ordered and have large dimensions with millions or billions of entries. It is expensive to decompose the whole tensor with traditional algorithms. This paper proposes augmented tensor decomposition, which effectively incorporates data augmentations to boost downstream classification.
arXiv Detail & Related papers (2021-06-15T06:29:05Z)
MTC: Multiresolution Tensor Completion from Partial and Coarse Observations [49.931849672492305]
Existing completion formulation mostly relies on partial observations from a single tensor. We propose an efficient Multi-resolution Completion model (MTC) to solve the problem.
arXiv Detail & Related papers (2021-06-14T02:20:03Z)
Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements [30.395874385570007]
A fundamental task is to faithfully recover tensors from highly incomplete measurements. We develop an algorithm to directly recover the tensor factors in the Tucker decomposition. We show that it provably converges at a linear independent rate of the ground truth tensor for two canonical problems.
arXiv Detail & Related papers (2021-04-29T17:44:49Z)
Geometric All-Way Boolean Tensor Decomposition [14.065968221500246]
We present GETF, which sequentially identifies the rank-1 basis for a tensor from a geometric perspective. Experiments on both synthetic and real-world data demonstrated that GETF has significantly improved performance in reconstruction accuracy, extraction of latent structures and it is an order of magnitude faster than other state-of-the-art methods.
arXiv Detail & Related papers (2020-07-31T03:29:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.