Dynamic Tensor Product Regression
- URL: http://arxiv.org/abs/2210.03961v1
- Date: Sat, 8 Oct 2022 08:06:00 GMT
- Title: Dynamic Tensor Product Regression
- Authors: Aravind Reddy, Zhao Song, Lichen Zhang
- Abstract summary: We present a dynamic tree data structure where any update to a single matrix can be propagated quickly.
We also show that our data structure can be used to solve dynamic versions of not only Tensor Product Regression, but also Tensor Product Spline Regression.
- Score: 18.904243654649118
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we initiate the study of \emph{Dynamic Tensor Product
Regression}. One has matrices $A_1\in \mathbb{R}^{n_1\times d_1},\ldots,A_q\in
\mathbb{R}^{n_q\times d_q}$ and a label vector $b\in \mathbb{R}^{n_1\ldots
n_q}$, and the goal is to solve the regression problem with the design matrix
$A$ being the tensor product of the matrices $A_1, A_2, \dots, A_q$ i.e.
$\min_{x\in \mathbb{R}^{d_1\ldots d_q}}~\|(A_1\otimes \ldots\otimes
A_q)x-b\|_2$. At each time step, one matrix $A_i$ receives a sparse change, and
the goal is to maintain a sketch of the tensor product $A_1\otimes\ldots
\otimes A_q$ so that the regression solution can be updated quickly.
Recomputing the solution from scratch for each round is very slow and so it is
important to develop algorithms which can quickly update the solution with the
new design matrix. Our main result is a dynamic tree data structure where any
update to a single matrix can be propagated quickly throughout the tree. We
show that our data structure can be used to solve dynamic versions of not only
Tensor Product Regression, but also Tensor Product Spline regression (which is
a generalization of ridge regression) and for maintaining Low Rank
Approximations for the tensor product.
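For concreteness, the static problem can be written out in a few lines of NumPy: form the Kronecker product of the factor matrices, solve the least-squares problem, and recompute everything after a sparse update to one factor. This is exactly the slow recompute-from-scratch baseline described above, not the paper's dynamic tree data structure; all sizes and variable names below are illustrative.

```python
# Illustrative baseline only: solve min_x ||(A_1 ⊗ ... ⊗ A_q) x - b||_2
# by explicitly forming the Kronecker product and recomputing from scratch.
# The paper's contribution is avoiding exactly this recomputation via a
# dynamic sketch tree; this snippet just makes the problem statement concrete.
import numpy as np
from functools import reduce

rng = np.random.default_rng(0)
factors = [rng.standard_normal((n, d)) for n, d in [(8, 3), (6, 2), (5, 4)]]

# Design matrix A = A_1 ⊗ A_2 ⊗ A_3, with n_1*n_2*n_3 rows and d_1*d_2*d_3 columns.
A = reduce(np.kron, factors)
b = rng.standard_normal(A.shape[0])

x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)

# A sparse update to one factor forces a full recompute in this naive baseline.
factors[1][0, 0] += 0.5
A_new = reduce(np.kron, factors)
x_new, *_ = np.linalg.lstsq(A_new, b, rcond=None)
print(np.linalg.norm(A_new @ x_new - b))
```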
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p}n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
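As a rough illustration of sketch-based residual estimation (not the specific sketches or guarantees of the cited paper), one can compress the regression problem with a dense Gaussian sketch and read off the residual of the sketched problem; the matrix sizes below are arbitrary.

```python
# Toy illustration of estimating the residual ||A x* - b||_2 from a linear sketch.
# A dense Gaussian sketch is used here for simplicity; the cited paper studies
# far more refined sketches and accuracy guarantees.
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 2000, 20, 200                        # m = sketch size (illustrative)
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
true_residual = np.linalg.norm(A @ x_star - b)

S = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sketching matrix
x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
est_residual = np.linalg.norm(S @ A @ x_sk - S @ b)

print(true_residual, est_residual)
```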
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $A\in\mathbb{R}^{n\times d}$, proportional to their $\ell_p$ norm, when $A$ is presented in a turnstile data stream.
Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $\tilde{a}_i \approx a_i$, and approximates their sampling probabilities up to $\varepsilon$ relative error.
For logistic regression, our framework yields the first algorithm that achieves a ...
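For intuition, the sketch below computes ordinary $\ell_2$ leverage scores offline from a thin SVD and samples rows proportionally with the standard reweighting; the cited paper's contribution, handling $\ell_p$ scores in a turnstile stream, is not attempted here, and all dimensions are made up.

```python
# Offline l_2 leverage score sampling, for illustration only; the cited work
# handles l_p scores in a turnstile stream, which is much harder.
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 1000, 10, 50
A = rng.standard_normal((n, d))

U, _, _ = np.linalg.svd(A, full_matrices=False)
scores = np.sum(U**2, axis=1)                   # l_2 leverage score of each row
probs = scores / scores.sum()

idx = rng.choice(n, size=k, replace=True, p=probs)
rows = A[idx] / np.sqrt(k * probs[idx, None])   # standard importance reweighting
print(idx[:10], rows.shape)
```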
arXiv Detail & Related papers (2024-06-01T07:33:41Z) - How to Inverting the Leverage Score Distribution? [16.744561210470632]
Despite leverage scores being widely used as a tool, in this paper we study a novel problem, namely inverting the leverage score distribution.
We use iterative shrinking and the induction hypothesis to ensure global convergence rates for the Newton method.
This important study on inverting statistical leverage opens up numerous new applications in interpretation, data recovery, and security.
arXiv Detail & Related papers (2024-04-21T21:36:42Z) - Solving Attention Kernel Regression Problem via Pre-conditioner [9.131385887605935]
We design algorithms for two types of regression problems: $\min_{x\in \mathbb{R}^d}\|(A^\top A)^j x - b\|$ for any positive integer $j$.
The second proxy is applying the exponential entrywise to the Gram matrix, denoted by $\exp(AA^\top)$, and solving the regression $\min_{x\in \mathbb{R}^n}\|\exp(AA^\top)x - b\|$.
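To make the two proxy problems concrete, here is a brute-force dense reference that solves $\min_x\|(A^\top A)^j x-b\|$ directly and then regresses against the entrywise-exponentiated Gram matrix; the cited paper's point is to avoid exactly this cost via pre-conditioning, and the sizes below are illustrative.

```python
# Brute-force reference for the two regression proxies mentioned above;
# the cited paper designs fast pre-conditioned solvers instead.
import numpy as np

rng = np.random.default_rng(3)
n, d, j = 50, 5, 2
A = rng.standard_normal((n, d))

# Proxy 1: min_x ||(A^T A)^j x - b||  with x, b in R^d
M = np.linalg.matrix_power(A.T @ A, j)
b1 = rng.standard_normal(d)
x1 = np.linalg.solve(M, b1)

# Proxy 2: min_x ||exp(A A^T) x - b||  with the exponential applied entrywise
K = np.exp(A @ A.T)                     # entrywise exp of the Gram matrix
b2 = rng.standard_normal(n)
x2, *_ = np.linalg.lstsq(K, b2, rcond=None)

print(np.linalg.norm(M @ x1 - b1), np.linalg.norm(K @ x2 - b2))
```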
arXiv Detail & Related papers (2023-08-28T04:37:38Z) - A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee [16.409210914237086]
Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees.
We show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense.
We also develop a novel analytical framework for $\ell_\infty$ guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property.
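A quick way to see what an $\ell_\infty$ guarantee measures is to compare a sketch-and-solve solution coordinate-wise against the exact least-squares solution. The dense Gaussian sketch below is only a stand-in for the sketching matrices analyzed in that paper; all parameters are illustrative.

```python
# Coordinate-wise (l_inf) error of a sketch-and-solve solution vs. the exact one.
# A dense Gaussian sketch is used purely as a stand-in.
import numpy as np

rng = np.random.default_rng(4)
n, d, m = 5000, 30, 400
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

print("l_inf error:", np.max(np.abs(x_sketch - x_exact)))
```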
arXiv Detail & Related papers (2023-02-01T05:22:40Z) - Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
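For reference, a minimal static Hutchinson estimator with Rademacher query vectors is sketched below; it ignores the dynamic setting and the binary tree summation procedure of the cited paper, and the query budget is an arbitrary choice.

```python
# Minimal static Hutchinson trace estimator with Rademacher query vectors.
# The cited paper studies the dynamic setting and query-optimal variants.
import numpy as np

def hutchinson_trace(matvec, n, num_queries, rng):
    """Estimate tr(M) using num_queries matrix-vector products v -> M v."""
    est = 0.0
    for _ in range(num_queries):
        v = rng.choice([-1.0, 1.0], size=n)   # Rademacher query vector
        est += v @ matvec(v)
    return est / num_queries

rng = np.random.default_rng(5)
n = 300
M = rng.standard_normal((n, n))
M = M @ M.T                                   # symmetric PSD test matrix

print(np.trace(M), hutchinson_trace(lambda v: M @ v, n, 200, rng))
```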
arXiv Detail & Related papers (2022-09-30T04:15:44Z) - Subquadratic Kronecker Regression with Applications to Tensor Decomposition [4.391912559062643]
We present the first subquadratic-time algorithm for solving Kronecker regression to a $(1+\varepsilon)$-approximation.
Our techniques combine leverage score sampling and iterative methods.
We demonstrate the speed and accuracy of this Kronecker regression algorithm on synthetic data and real-world image tensors.
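One structural fact behind Kronecker regression algorithms of this kind (and relevant to the tensor product setting above) is that the $\ell_2$ leverage scores of $A_1\otimes A_2$ factor as products of the factors' leverage scores, so rows can be sampled without materializing the product. The snippet below checks that identity numerically; it is an illustration, not the subquadratic algorithm itself.

```python
# Numerically check that leverage scores of A1 ⊗ A2 equal the Kronecker product
# of the factors' leverage scores, which is what allows sampling rows without
# materializing A1 ⊗ A2. Illustration only.
import numpy as np

def leverage_scores(A):
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)

rng = np.random.default_rng(6)
A1 = rng.standard_normal((7, 3))
A2 = rng.standard_normal((5, 2))

direct = leverage_scores(np.kron(A1, A2))
factored = np.kron(leverage_scores(A1), leverage_scores(A2))

print(np.allclose(direct, factored))   # True for full column rank factors
```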
arXiv Detail & Related papers (2022-09-11T14:24:19Z) - On the well-spread property and its relation to linear regression [4.619541348328937]
We show that, without a well-spread design matrix, consistent recovery of the parameter vector in a robust linear regression model is information-theoretically impossible.
We show that it is possible to efficiently certify whether a given $n$-by-$d$ matrix is well-spread if the number of observations is quadratic in the ambient dimension.
arXiv Detail & Related papers (2022-06-16T11:17:44Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(d^{\max\{1,p/2\}}\log^2 n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Statistical Query Lower Bounds for List-Decodable Linear Regression [55.06171096484622]
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples.
Our main result is a Statistical Query (SQ) lower bound of $d^{\mathrm{poly}(1/\alpha)}$ for this problem.
arXiv Detail & Related papers (2021-06-17T17:45:21Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K\subset\mathbb{R}^d$, given access to $A\in\mathbb{R}^{d\times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z)