Faster Binary Embeddings for Preserving Euclidean Distances
- URL: http://arxiv.org/abs/2010.00712v2
- Date: Wed, 10 Mar 2021 01:30:22 GMT
- Title: Faster Binary Embeddings for Preserving Euclidean Distances
- Authors: Jinjie Zhang, Rayan Saab
- Abstract summary: We propose a fast, distance-preserving, binary embedding algorithm to transform a dataset $mathcalTsubseteqmathbbRn$ into binary sequences in the cube $pm 1m$.
Our method is both fast and memory efficient, with time complexity $O(m)$ and space complexity $O(m)$.
- Score: 9.340611077939828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a fast, distance-preserving, binary embedding algorithm to
transform a high-dimensional dataset $\mathcal{T}\subseteq\mathbb{R}^n$ into
binary sequences in the cube $\{\pm 1\}^m$. When $\mathcal{T}$ consists of
well-spread (i.e., non-sparse) vectors, our embedding method applies a stable
noise-shaping quantization scheme to $A x$ where $A\in\mathbb{R}^{m\times n}$
is a sparse Gaussian random matrix. This contrasts with most binary embedding
methods, which usually use $x\mapsto \mathrm{sign}(Ax)$ for the embedding.
Moreover, we show that Euclidean distances among the elements of $\mathcal{T}$
are approximated by the $\ell_1$ norm on the images of $\{\pm 1\}^m$ under a
fast linear transformation. This again contrasts with standard methods, where
the Hamming distance is used instead. Our method is both fast and memory
efficient, with time complexity $O(m)$ and space complexity $O(m)$. Further, we
prove that the method is accurate and its associated error is comparable to
that of a continuous valued Johnson-Lindenstrauss embedding plus a quantization
error that admits a polynomial decay as the embedding dimension $m$ increases.
Thus the length of the binary codes required to achieve a desired accuracy is
quite small, and we show it can even be compressed further without compromising
the accuracy. To illustrate our results, we test the proposed method on natural
images and show that it achieves strong performance.
Related papers
- Optimal high-precision shadow estimation [22.01044188849049]
Formally we give a protocol that measures $O(log(m)/epsilon2)$ copies of an unknown mixed state $rhoinmathbbCdtimes d$.
We show via dimensionality reduction that we can rescale $epsilon$ and $d$ to reduce to the regime where $epsilon le O(d-1/2)$.
arXiv Detail & Related papers (2024-07-18T19:42:49Z) - Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning [10.690769339903941]
We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$.
Our methods are based on constructing a low-rank Nystr"om approximation to $A$ using sparse random sketching.
We prove that the convergence of our methods depends on a natural average condition number of $A$, which improves as the rank of the Nystr"om approximation.
arXiv Detail & Related papers (2024-05-09T15:53:43Z) - Provably learning a multi-head attention layer [55.2904547651831]
Multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Solving Dense Linear Systems Faster Than via Preconditioning [1.8854491183340518]
We show that our algorithm has an $tilde O(n2)$ when $k=O(n0.729)$.
In particular, our algorithm has an $tilde O(n2)$ when $k=O(n0.729)$.
Our main algorithm can be viewed as a randomized block coordinate descent method.
arXiv Detail & Related papers (2023-12-14T12:53:34Z) - Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $mathbfA$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $ell$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d_$.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$.
We propose a novel random technique, called Ternary Random Feature (TRF)
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Approximate Multiplication of Sparse Matrices with Limited Space [24.517908972536432]
We develop sparse co-occuring directions, which reduces the time complexity to $widetildeOleft((nnz(X)+nnz(Y))ell+nell2right)$ in expectation.
Theoretical analysis reveals that the approximation error of our algorithm is almost the same as that of COD.
arXiv Detail & Related papers (2020-09-08T05:39:19Z) - On Efficient Low Distortion Ultrametric Embedding [18.227854382422112]
A widely-used method to preserve the underlying hierarchical structure of the data is to find an embedding of the data into a tree or an ultrametric.
In this paper, we provide a new algorithm which takes as input a set of points isometric in $mathbbRd2 (for universal constant $rho>1$) to output an ultrametric $Delta.
We show that the output of our algorithm is comparable to the output of the linkage algorithms while achieving a much faster running time.
arXiv Detail & Related papers (2020-08-15T11:06:45Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.