Terminal Embeddings in Sublinear Time
- URL: http://arxiv.org/abs/2110.08691v3
- Date: Wed, 13 Mar 2024 04:45:31 GMT
- Title: Terminal Embeddings in Sublinear Time
- Authors: Yeshwanth Cherapanamjeri, Jelani Nelson
- Abstract summary: We show how to pre-process $T$ to obtain an almost linear-space data structure that supports computing the terminal embedding image of any $q\in\mathbb{R}^d$ in sublinear time.
Our main contribution in this work is to give a new data structure for computing terminal embeddings.
- Score: 14.959896180728832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently (Elkin, Filtser, Neiman 2017) introduced the concept of a {\it
terminal embedding} from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a
set of designated terminals $T\subset X$. Such an embedding $f$ is said to have
distortion $\rho\ge 1$ if $\rho$ is the smallest value such that there exists a
constant $C>0$ satisfying
\begin{equation*}
\forall x\in T\ \forall q\in X,\ C d_X(x, q) \le d_Y(f(x), f(q)) \le C \rho
d_X(x, q) .
\end{equation*}
When $X,Y$ are both Euclidean metrics with $Y$ being $m$-dimensional,
recently (Narayanan, Nelson 2019), following work of (Mahabadi, Makarychev,
Makarychev, Razenshteyn 2018), showed that distortion $1+\epsilon$ is
achievable via such a terminal embedding with $m = O(\epsilon^{-2}\log n)$ for
$n := |T|$. This generalizes the Johnson-Lindenstrauss lemma, which only
preserves distances within $T$ and not to $T$ from the rest of space. The
downside of prior work is that evaluating their embedding on some $q\in
\mathbb{R}^d$ required solving a semidefinite program with $\Theta(n)$
constraints in~$m$ variables and thus required some superlinear
$\mathrm{poly}(n)$ runtime. Our main contribution in this work is to give a new
data structure for computing terminal embeddings. We show how to pre-process
$T$ to obtain an almost linear-space data structure that supports computing the
terminal embedding image of any $q\in\mathbb{R}^d$ in sublinear time $O^*
(n^{1-\Theta(\epsilon^2)} + d)$. To accomplish this, we leverage tools
developed in the context of approximate nearest neighbor search.
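The terminal embedding above generalizes the Johnson-Lindenstrauss map, which the sublinear-time data structure builds on. As a minimal sketch of that JL baseline (not the terminal embedding itself, which per Narayanan-Nelson appends an extra per-query coordinate obtained via an optimization step), the following toy example with assumed sizes applies a scaled Gaussian projection and measures the empirical distortion between a query $q$ and a random terminal set $T$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 200, 300, 100  # |T|, ambient dimension, target dimension (toy sizes)
T = rng.standard_normal((n, d))   # terminal set T, one terminal per row
q = rng.standard_normal(d)        # query point q

# Scaled Gaussian matrix: the classic Johnson-Lindenstrauss map.
Pi = rng.standard_normal((m, d)) / np.sqrt(m)

orig = np.linalg.norm(T - q, axis=1)             # d_X(x, q) for each terminal x
emb = np.linalg.norm(T @ Pi.T - Pi @ q, axis=1)  # d_Y(f(x), f(q))
ratios = emb / orig
distortion = ratios.max() / ratios.min()  # empirical rho (constant C folded in)
print(f"empirical distortion: {distortion:.3f}")
```

For $m = O(\epsilon^{-2}\log n)$ the ratios concentrate near $1$, matching the $1+\epsilon$ distortion guarantee; the point of the paper is that the stronger terminal guarantee (holding for arbitrary $q$, not just points drawn independently of $\Pi$) can be evaluated in sublinear time rather than via a semidefinite program.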
Related papers
- Sparsifying Suprema of Gaussian Processes [6.638504164134713]
We show that there is an $O_\varepsilon(1)$-size subset $S \subseteq T$ and a set of real values $\{c_s\}_{s \in S}$ such that the supremum of the process over $T$ is approximated by a shifted maximum over $S$.
We also use our sparsification result for suprema of centered Gaussian processes to give a sparsification lemma for convex sets of bounded geometric width.
arXiv Detail & Related papers (2024-11-22T01:43:58Z) - LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions [54.54897832889028]
We show that for any $K$, there is a "universal set" $U \subset [n]$ of size independent of $n$, such that for any $Q$ and any row $i$, the large attention scores $A_{i,j}$ in row $i$ of $A$ all have $j \in U$.
We empirically show the benefits of our scheme for vision transformers, showing how to train new models that use our universal set while training as well.
arXiv Detail & Related papers (2024-10-07T19:47:13Z) - Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p}n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Optimal Embedding Dimension for Sparse Subspace Embeddings [4.042707434058959]
A random $m \times n$ matrix $S$ is an oblivious subspace embedding (OSE) if it preserves, up to $1 \pm \epsilon$, the norms of all vectors in any fixed $d$-dimensional subspace of $\mathbb{R}^n$ with high probability.
We show that an $m \times n$ random matrix $S$ with $m \geq (1+\theta)d$ is an oblivious subspace embedding with $\epsilon = O_\theta(1)$.
We use this to construct the first oblivious subspace embedding with $O(d)$ embedding dimension that can be applied faster than current matrix multiplication time.
arXiv Detail & Related papers (2023-11-17T18:01:58Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits.
For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee [16.409210914237086]
Given a matrix $A \in \mathbb{R}^{n \times d}$ and a vector $b \in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees.
We show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense.
We also develop a novel analytical framework for $\ell_\infty$-guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property.
arXiv Detail & Related papers (2023-02-01T05:22:40Z) - Low-Rank Approximation with $1/\epsilon^{1/3}$ Matrix-Vector Products [58.05771390012827]
We study iterative methods based on Krylov subspaces for low-rank approximation under any Schatten-$p$ norm.
Our main result is an algorithm that uses only $\tilde{O}(k/\sqrt{\epsilon})$ matrix-vector products.
arXiv Detail & Related papers (2022-02-10T16:10:41Z) - Approximate Maximum Halfspace Discrepancy [6.35821487778241]
We consider the range space $(X, \mathcal{H}_d)$ where $X \subset \mathbb{R}^d$ and $\mathcal{H}_d$ is the set of ranges defined by $d$ halfspaces.
For each halfspace $h \in \mathcal{H}_d$ define a function $\Phi(h)$ that measures the "difference" between the fraction of red and fraction of blue points which fall in the range $h$.
arXiv Detail & Related papers (2021-06-25T19:14:45Z) - The planted matching problem: Sharp threshold and infinite-order phase transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M^*$ hidden in a randomly weighted $n \times n$ bipartite graph.
We show that if $\sqrt{d}\, B(\mathcal{P},\mathcal{Q}) \ge 1+\epsilon$ for an arbitrarily small constant $\epsilon > 0$, the reconstruction error of any estimator is bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z) - The Average-Case Time Complexity of Certifying the Restricted Isometry Property [66.65353643599899]
In compressed sensing, the restricted isometry property (RIP) on $M \times N$ sensing matrices guarantees efficient reconstruction of sparse vectors.
We investigate the exact average-case time complexity of certifying the RIP for $M \times N$ matrices with i.i.d. $\mathcal{N}(0, 1/M)$ entries.
arXiv Detail & Related papers (2020-05-22T16:55:01Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with such oracles and achieves complexity bounds of $\tilde{O}((n^2 + n\sqrt{L/\mu})\log(1/\epsilon))$ and $O(n\sqrt{L/\epsilon})$, for $\mu > 0$ and $\mu = 0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.