Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity
- URL: http://arxiv.org/abs/2411.08773v1
- Date: Wed, 13 Nov 2024 16:58:51 GMT
- Title: Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity
- Authors: Shabarish Chenakkod, Michał Dereziński, Xiaoyu Dong,
- Abstract summary: An oblivious subspace embedding is a random $mtimes n$ matrix $Pi$ that preserves the norms of all vectors in that subspace within a $1pmepsilon$ factor.
We give an oblivious subspace embedding with the optimal dimension $m=Theta(d/epsilon2)$ that has a near-optimal sparsity of $tilde O (1/epsilon)$ non-zero entries per column of $Pi$.
- Score: 3.9657575162895196
- License:
- Abstract: An oblivious subspace embedding is a random $m\times n$ matrix $\Pi$ such that, for any $d$-dimensional subspace, with high probability $\Pi$ preserves the norms of all vectors in that subspace within a $1\pm\epsilon$ factor. In this work, we give an oblivious subspace embedding with the optimal dimension $m=\Theta(d/\epsilon^2)$ that has a near-optimal sparsity of $\tilde O(1/\epsilon)$ non-zero entries per column of $\Pi$. This is the first result to nearly match the conjecture of Nelson and Nguyen [FOCS 2013] in terms of the best sparsity attainable by an optimal oblivious subspace embedding, improving on a prior bound of $\tilde O(1/\epsilon^6)$ non-zeros per column [Chenakkod et al., STOC 2024]. We further extend our approach to the non-oblivious setting, proposing a new family of Leverage Score Sparsified embeddings with Independent Columns, which yield faster runtimes for matrix approximation and regression tasks. In our analysis, we develop a new method which uses a decoupling argument together with the cumulant method for bounding the edge universality error of isotropic random matrices. To achieve near-optimal sparsity, we combine this general-purpose approach with new traces inequalities that leverage the specific structure of our subspace embedding construction.
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $Omega(k2/pn1-2/p)$ lower bound for the sparse recovery problem, which is tight up to a $mathrmpoly(log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error [25.777423855881878]
We show a novel elimination-based algorithm to show one can obtain an $Oleft(Hepsilonright)$-optimal policy.
We complement our upper bound with an $widetildeOmegaleft(Hepsilonright)$-optimality lower bound, giving a complete picture of this problem.
arXiv Detail & Related papers (2024-07-18T15:58:04Z) - Optimal Matrix Sketching over Sliding Windows [38.09464189820894]
Matrix sketching, aimed at approximating a matrix $boldsymbolA in mathbbRNtimes d$ consists of vector streams of length $N$ with a smaller sketching matrix $boldsymbolB in mathbbRelltimes d, ell ll N$.
We introduce the DS-FD algorithm, which achieves the optimal $Oleftfracdvarepsilonright)$ space bound for matrix sketching over row-normalized, sequence-based sliding windows
arXiv Detail & Related papers (2024-05-13T14:38:35Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with a environment and seeks to reach certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Optimal Embedding Dimension for Sparse Subspace Embeddings [4.042707434058959]
A random $mtimes n$ matrix $S$ is an oblivious subspace embedding (OSE)
We show that an $mtimes n$ random matrix $S$ with $mgeq (1+theta)d$ is an oblivious subspace embedding with $epsilon = O_theta(1)$.
We use this to construct the first oblivious subspace embedding with $O(d)$ embedding dimension that can be applied faster than current matrix multiplication time.
arXiv Detail & Related papers (2023-11-17T18:01:58Z) - On the Interplay Between Misspecification and Sub-optimality Gap in
Linear Contextual Bandits [76.2262680277608]
We study linear contextual bandits in the misspecified setting, where the expected reward function can be approximated by a linear function class.
We show that our algorithm enjoys the same gap-dependent regret bound $tilde O (d2/Delta)$ as in the well-specified setting up to logarithmic factors.
arXiv Detail & Related papers (2023-03-16T15:24:29Z) - Stochastic Linear Bandits with Protected Subspace [51.43660657268171]
We study a variant of the linear bandit problem wherein we optimize a linear objective function but rewards are accrued only to an unknown subspace.
In particular, at each round, the learner must choose whether to query the objective or the protected subspace alongside choosing an action.
Our algorithm, derived from the OFUL principle, uses some of the queries to get an estimate of the protected space.
arXiv Detail & Related papers (2020-11-02T14:59:39Z) - Deep Learning Meets Projective Clustering [66.726500395069]
A common approach for compressing NLP networks is to encode the embedding layer as a matrix $AinmathbbRntimes d$.
Inspired by emphprojective clustering from computational geometry, we suggest replacing this subspace by a set of $k$ subspaces.
arXiv Detail & Related papers (2020-10-08T22:47:48Z) - Subspace approximation with outliers [6.186553186139257]
We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers.
Our results hold even when the fraction of outliers $alpha$ is large, as long as the obvious condition $0 delta leq 1 - alpha$ is satisfied.
arXiv Detail & Related papers (2020-06-30T07:22:33Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $det(sum_i in Sv_i v_i v_itop)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.