Cost-efficient Gaussian Tensor Network Embeddings for Tensor-structured Inputs
- URL: http://arxiv.org/abs/2205.13163v1
- Date: Thu, 26 May 2022 05:27:31 GMT
- Title: Cost-efficient Gaussian Tensor Network Embeddings for Tensor-structured Inputs
- Authors: Linjian Ma and Edgar Solomonik
- Abstract summary: We use tensor network embeddings to perform dimensionality reduction of tensor network structured inputs $x$.
We provide an algorithm to efficiently sketch input data using such embeddings.
We show optimality of an existing algorithm for tensor train rounding.
- Score: 2.737640280995564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work discusses tensor network embeddings, which are random matrices
($S$) with tensor network structure. These embeddings have been used to perform
dimensionality reduction of tensor network structured inputs $x$ and accelerate
applications such as tensor decomposition and kernel regression. Existing works
have designed embeddings for inputs $x$ with specific structures, such that
$Sx$ can be computed efficiently. We provide a systematic
way to design tensor network embeddings consisting of Gaussian random tensors,
such that for inputs with more general tensor network structures, both the
sketch size (row size of $S$) and the sketching computational cost are low.
We analyze general tensor network embeddings that can be reduced to a
sequence of sketching matrices. We provide a sufficient condition to quantify
the accuracy of such embeddings and derive sketching asymptotic cost lower
bounds using embeddings that satisfy this condition and have a sketch size
lower than any input dimension. We then provide an algorithm to efficiently
sketch input data using such embeddings. The sketch size of the embedding used
in the algorithm has a linear dependence on the number of sketching dimensions
of the input. Assuming tensor contractions are performed with classical dense
matrix multiplication algorithms, this algorithm achieves asymptotic cost
within a factor of $O(\sqrt{m})$ of our cost lower bound, where $m$ is the
sketch size. Further, when each tensor in the input has a dimension that needs
to be sketched, this algorithm yields the optimal sketching asymptotic cost. We
apply our sketching analysis to inexact tensor decomposition optimization
algorithms. We provide a sketching algorithm for CP decomposition that is
asymptotically faster than existing work in multiple regimes, and show
optimality of an existing algorithm for tensor train rounding.
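As a concrete, hedged illustration of an embedding that reduces to a sequence of sketching matrices, the NumPy sketch below applies dense Gaussian sketching matrices core by core to a tensor-train-structured input. The function names, the fixed left-to-right contraction order, and the core shapes are assumptions made for illustration; this is not the paper's cost-optimal embedding design or algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def tt_random_cores(dims, rank, rng):
    """Random TT cores G[i] of shape (r_{i-1}, n_i, r_i), standing in for the input x."""
    ranks = [1] + [rank] * (len(dims) - 1) + [1]
    return [rng.standard_normal((ranks[i], dims[i], ranks[i + 1]))
            for i in range(len(dims))]

def sketch_tt_sequentially(cores, m, rng):
    """Sketch the vectorization of a TT tensor: contract cores left to right,
    applying a dense Gaussian sketching matrix after each contraction so the
    partially contracted object never exceeds m times the TT rank."""
    G0 = cores[0]                                     # shape (1, n_0, r_1)
    running = G0.reshape(G0.shape[1], G0.shape[2])    # (n_0, r_1)
    S = rng.standard_normal((m, running.shape[0])) / np.sqrt(m)
    running = S @ running                             # (m, r_1)
    for G in cores[1:]:
        _, n, r_next = G.shape
        # contract the sketched prefix with the next core, then sketch again
        tmp = np.einsum('ma,anb->mnb', running, G).reshape(m * n, r_next)
        S = rng.standard_normal((m, m * n)) / np.sqrt(m)
        running = S @ tmp                             # back to (m, r_next)
    return running.ravel()                            # length m (last rank is 1)

dims = [8, 8, 8, 8]
cores = tt_random_cores(dims, rank=3, rng=rng)
sx = sketch_tt_sequentially(cores, m=16, rng=rng)
print(sx.shape)   # (16,): a sketch of a tensor with 8**4 entries
```

Sketching after each contraction keeps every intermediate at size at most $m$ times the TT rank, which mirrors the cost-saving idea the abstract describes; the paper's analysis of sketch size and accuracy is not reproduced here.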
Related papers
- Adapting Newton's Method to Neural Networks through a Summary of
Higher-Order Derivatives [0.0]
We consider a gradient-based optimization method applied to a function of a vector of variables $\boldsymbol{\theta}$.
This framework encompasses many common use-cases, such as training neural networks by gradient descent.
arXiv Detail & Related papers (2023-12-06T20:24:05Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
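The cited paper approximates the NNGP kernel; purely to illustrate the random-feature idea, the sketch below instead uses random Fourier features for a standard RBF kernel (a substitute kernel chosen for brevity; all names and parameters are assumptions, not the paper's method).

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, num_features, gamma, rng):
    """Map X (n, d) to features Z (n, D) so that Z @ Z.T approximates
    the RBF kernel exp(-gamma * ||x - y||^2)."""
    n, d = X.shape
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

X = rng.standard_normal((200, 10))
Z = random_fourier_features(X, num_features=2048, gamma=0.5, rng=rng)
K_approx = Z @ Z.T                       # approximate kernel Gram matrix
K_exact = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.max(np.abs(K_approx - K_exact)))  # error shrinks as num_features grows
```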
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Latent Matrices for Tensor Network Decomposition and to Tensor
Completion [8.301418317685906]
We propose a novel higher-order tensor decomposition model that decomposes the tensor into smaller ones and speeds up the computation of the algorithm.
Three optimization algorithms, LMTN-PAM, LMTN-SVD and LMTN-AR, have been developed and applied to the tensor-completion task.
Experimental results show that our LMTN-SVD algorithm is 3-6 times faster than the FCTN-PAM algorithm, with only a 1.8-point drop in accuracy.
arXiv Detail & Related papers (2022-10-07T08:19:50Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
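For readers unfamiliar with the tensor train format referenced here and in the abstract above, the following minimal TT-SVD sketch (sequential truncated SVDs; illustrative only, and not the cited paper's bicriteria algorithm) shows how a dense tensor is decomposed into TT cores.

```python
import numpy as np

def tt_svd(A, max_rank):
    """Classical TT-SVD: sequential truncated SVDs produce TT cores
    G[k] of shape (r_{k-1}, n_k, r_k)."""
    dims = A.shape
    cores, r_prev, C = [], 1, A
    for k in range(len(dims) - 1):
        C = C.reshape(r_prev * dims[k], -1)          # unfold at mode k
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))                    # truncate the rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = np.diag(s[:r]) @ Vt[:r]                  # carry the remainder forward
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6, 6, 6))
print([c.shape for c in tt_svd(A, max_rank=4)])
# [(1, 6, 4), (4, 6, 4), (4, 6, 4), (4, 6, 1)]
```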
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - Complex-to-Real Sketches for Tensor Products with Applications to the
Polynomial Kernel [15.535749953841274]
Randomized sketches of a product of $p$ vectors exhibit a tradeoff between statistical efficiency and computational acceleration.
We propose a simple Complex-to-Real (CtR) modification of well-known sketches that replaces real random projections by complex ones.
We show that our method achieves state-of-the-art performance in terms of accuracy and speed compared to other randomized approximations from the literature.
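A minimal two-vector illustration of the complex-to-real idea follows (the paper's general $p$-fold construction, scaling choices, and variance analysis are not reproduced; the names here are assumptions): project each factor with an independent complex Gaussian matrix, multiply the projections elementwise, and keep a scaled real part.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_gaussian(m, d, rng):
    """Circularly-symmetric complex Gaussian matrix with unit-variance entries."""
    return (rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))) / np.sqrt(2)

def ctr_sketch(S1, S2, x, y):
    """Sketch of the tensor product of x and y: elementwise product of the two
    complex projections, then a scaled real part (the complex-to-real step)."""
    z = (S1 @ x) * (S2 @ y)
    return np.sqrt(2.0 / len(z)) * z.real

d, m = 50, 4096
S1, S2 = complex_gaussian(m, d, rng), complex_gaussian(m, d, rng)
x, y, u, v = (rng.standard_normal(d) for _ in range(4))
# Inner products of sketches approximate <x, u> * <y, v> in expectation.
print(ctr_sketch(S1, S2, x, y) @ ctr_sketch(S1, S2, u, v), (x @ u) * (y @ v))
```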
arXiv Detail & Related papers (2022-02-04T09:15:43Z) - Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via
GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Streaming Coresets for Symmetric Tensor Factorization [9.181791777532608]
We show how to factorize tensors efficiently in the streaming setting.
We introduce two novel algorithmic techniques: online filtering and kernelization.
We show applications of our algorithms in learning single topic modeling.
arXiv Detail & Related papers (2020-06-01T19:55:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.