Streaming Coresets for Symmetric Tensor Factorization
- URL: http://arxiv.org/abs/2006.01225v2
- Date: Mon, 13 Jul 2020 17:49:46 GMT
- Title: Streaming Coresets for Symmetric Tensor Factorization
- Authors: Rachit Chhaya, Jayesh Choudhari, Anirban Dasgupta, Supratim Shit
- Abstract summary: We show how to factorize tensors efficiently in the streaming setting.
We introduce two novel algorithmic techniques: online filtering and kernelization.
We show applications of our algorithms to learning single topic models.
- Score: 9.181791777532608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Factorizing tensors has recently become an important optimization module in a
number of machine learning pipelines, especially in latent variable models. We
show how to do this efficiently in the streaming setting. Given a set of $n$
vectors, each in $\mathbb{R}^d$, we present algorithms to select a sublinear
number of these vectors as coreset, while guaranteeing that the CP
decomposition of the $p$-moment tensor of the coreset approximates the
corresponding decomposition of the $p$-moment tensor computed from the full
data. We introduce two novel algorithmic techniques: online filtering and
kernelization. Using these two, we present six algorithms that achieve
different tradeoffs of coreset size, update time, and working space, beating or
matching various state-of-the-art algorithms. In the case of matrices
(order-$2$ tensors), our online row sampling algorithm guarantees a $(1 \pm
\epsilon)$ relative-error spectral approximation. We show applications of our
algorithms to learning single topic models.
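To make the matrix-case claim concrete, here is a minimal sketch of online leverage-score filtering in the spirit of the paper's online row sampling. The ridge term, the oversampling constant, and the function name are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def online_row_filter(rows, eps=0.5, delta=1e-3, lam=1e-6, seed=0):
    """Keep a rescaled subsample S of streamed rows so that S^T S
    approximates A^T A up to (1 +/- eps) spectrally. Constants here
    are illustrative, not the paper's tuned values."""
    rng = np.random.default_rng(seed)
    d = len(rows[0])
    M = lam * np.eye(d)                      # ridge keeps M invertible early on
    c = 8.0 * np.log(d / delta) / eps ** 2   # assumed oversampling factor
    sample = []
    for a in rows:
        a = np.asarray(a, dtype=float)
        tau = float(a @ np.linalg.solve(M, a))  # online ridge leverage score
        p = min(1.0, c * tau)
        if rng.random() < p:
            sample.append(a / np.sqrt(p))    # rescale so E[S^T S] = A^T A
        M += np.outer(a, a)                  # update with every streamed row
    return np.array(sample)
```

Each arriving row is kept with probability proportional to its online ridge leverage score and rescaled so the sampled Gram matrix is unbiased in expectation; the coreset stays sublinear whenever the total leverage mass is.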
Related papers
- Fast and Simple Spectral Clustering in Theory and Practice [7.070726553564701]
Spectral clustering is a popular and effective algorithm designed to find $k$ clusters in a graph $G$.
We present a simple spectral clustering algorithm based on a vertex embedding of dimension $O(\log k)$ computed by the power method.
We evaluate the new algorithm on several synthetic and real-world datasets, finding that it is significantly faster than alternative clustering algorithms, while producing results with approximately the same clustering accuracy.
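A minimal sketch of that pipeline follows, assuming the lazy random-walk matrix, iteration counts common in this line of work, and scikit-learn for the k-means step; all of these are illustrative choices, not the paper's exact tuning.

```python
import numpy as np
from sklearn.cluster import KMeans

def power_method_clustering(A, k, seed=0):
    """Cluster a graph with dense adjacency matrix A into k parts using an
    O(log k)-dimensional embedding produced by the power method."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    deg = np.maximum(A.sum(axis=1), 1e-12)
    Dinv = 1.0 / np.sqrt(deg)
    # lazy random-walk matrix M = (I + D^{-1/2} A D^{-1/2}) / 2
    M = 0.5 * (np.eye(n) + Dinv[:, None] * A * Dinv[None, :])
    ell = max(1, int(np.ceil(np.log2(k))))   # embedding dimension O(log k)
    t = max(1, int(10 * np.log(max(n, 2)))) # power-method iterations
    Y = rng.standard_normal((n, ell))
    for _ in range(t):
        Y = M @ Y                            # amplifies the top eigenspace
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)
```

Since the embedding uses only repeated matrix-vector products, no eigensolver call is needed, which is the source of the speedup.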
arXiv Detail & Related papers (2023-10-17T02:31:57Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
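As a rough illustration of the random-feature idea: random ReLU features approximate the NNGP kernel of a one-hidden-layer ReLU network (the order-1 arc-cosine kernel). The width and scaling below are assumptions, and RFAD itself involves considerably more than this sketch.

```python
import numpy as np

def relu_random_features(X, m=4096, seed=0):
    """Random ReLU features: inner products of these features approximate
    the one-hidden-layer ReLU NNGP kernel as the width m grows."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], m))
    return np.maximum(X @ W, 0.0) * np.sqrt(2.0 / m)

# phi(x) @ phi(x') ~= K_NNGP(x, x'), so kernel ridge regression on the
# distilled set reduces to inexpensive linear regression in feature space.
```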
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Near-Linear Time and Fixed-Parameter Tractable Algorithms for Tensor
Decompositions [51.19236668224547]
We study low rank approximation of tensors, focusing on the tensor train and Tucker decompositions.
For tensor train decomposition, we give a bicriteria $(1 + \epsilon)$-approximation algorithm with a small bicriteria rank and $O(q \cdot \mathrm{nnz}(A))$ running time.
In addition, we extend our algorithm to tensor networks with arbitrary graphs.
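For orientation, the classical TT-SVD sweep below is the exact-SVD baseline that such bicriteria algorithms accelerate; it is not the paper's algorithm, and `r_max` is an illustrative rank cap.

```python
import numpy as np

def tt_svd(T, r_max):
    """Classical TT-SVD: sweep over modes, truncating an SVD of each
    successive unfolding to build the tensor-train cores."""
    dims = T.shape
    cores, r_prev = [], 1
    M = np.asarray(T, dtype=float)
    for n in dims[:-1]:
        M = M.reshape(r_prev * n, -1)
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(r_max, len(s))
        cores.append(U[:, :r].reshape(r_prev, n, r))  # core: (r_prev, n, r)
        M = s[:r, None] * Vt[:r]      # pass the remainder down the sweep
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))      # final core
    return cores
```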
arXiv Detail & Related papers (2022-07-15T11:55:09Z) - Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
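A minimal sketch of the batched idea using unshifted QR iteration follows; the paper's method adds refinements (such as shifting and economical Householder steps) that this omits, so treat it as an illustration only.

```python
import torch

def batched_eig_qr(A, iters=100):
    """Batched symmetric eigendecomposition via unshifted QR iteration.
    A: (B, d, d) batch of symmetric matrices."""
    T = A.clone()
    Qacc = torch.eye(A.shape[-1], dtype=A.dtype, device=A.device) \
                .expand_as(A).clone()
    for _ in range(iters):
        Q, R = torch.linalg.qr(T)   # one batched kernel per iteration
        T = R @ Q                   # similarity transform; off-diagonal decays
        Qacc = Qacc @ Q             # accumulate eigenvectors
    return torch.diagonal(T, dim1=-2, dim2=-1), Qacc
```

Because every step is a batched matrix operation, the whole batch is processed in a few large GPU kernels rather than one small eigensolver call per matrix.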
arXiv Detail & Related papers (2022-07-09T09:14:12Z) - Cost-efficient Gaussian Tensor Network Embeddings for Tensor-structured
Inputs [2.737640280995564]
We use network embeddings to perform dimensionality reduction of tensor-network-structured inputs $x$.
We provide an algorithm to efficiently sketch input data using such embeddings.
We also show the optimality of an existing algorithm for tensor train rounding.
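One way to see the benefit: Gaussian sketches applied factor-by-factor embed a Kronecker-structured input without ever materializing it. The toy below illustrates only this special case, not the paper's embedding, and the sketch sizes are assumptions.

```python
import numpy as np

def kron_sketch(xs, m, seed=0):
    """Sketch x1 (x) x2 (x) ... by sketching each factor separately:
    (G1 x1) (x) (G2 x2) = (G1 (x) G2)(x1 (x) x2), so the full
    Kronecker-product input is never formed."""
    rng = np.random.default_rng(seed)
    out = np.ones(1)
    for x in xs:
        x = np.asarray(x, dtype=float)
        G = rng.standard_normal((m, x.size)) / np.sqrt(m)
        out = np.kron(out, G @ x)
    return out
```

The output has dimension $m^q$ for $q$ factors, so practical constructions recompress between factors; the point here is only that the structured input is sketched without expansion.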
arXiv Detail & Related papers (2022-05-26T05:27:31Z) - Leverage Score Sampling for Tensor Product Matrices in Input Sparsity
Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be applied simultaneously to a dataset $X$ in input-sparsity total time.
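For contrast, the direct dense leverage-score sampler looks as follows; computing scores this way costs a full SVD, which is precisely what the input-sparsity-time technique avoids. Names and sizes are illustrative.

```python
import numpy as np

def leverage_score_sample(A, s, seed=0):
    """Sample s rows of A with probability proportional to their leverage
    scores and rescale, so that E[S^T S] = A^T A."""
    rng = np.random.default_rng(seed)
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    tau = (U ** 2).sum(axis=1)                 # exact row leverage scores
    p = tau / tau.sum()
    idx = rng.choice(A.shape[0], size=s, p=p)  # sample rows proportionally
    S = A[idx] / np.sqrt(s * p[idx, None])     # rescale sampled rows
    return S
```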
arXiv Detail & Related papers (2022-02-09T15:26:03Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - Reconstruction Algorithms for Low-Rank Tensors and Depth-3 Multilinear
Circuits [4.129484350382891]
We give new and efficient black-box reconstruction algorithms for some classes of depth-$3$ arithmetic circuits.
Our algorithm works over all fields of characteristic $0$ or of large enough characteristic.
arXiv Detail & Related papers (2021-05-04T20:45:07Z) - Fast Tensor Disentangling Algorithm [0.0]
We introduce an approximate, fast, and simple algorithm to optimize disentangling unitary tensors.
Our algorithm is faster than previous iterative algorithms and often results in a residual entanglement entropy within 10% to 40% of the minimum.
arXiv Detail & Related papers (2021-04-16T18:00:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.