Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
- URL: http://arxiv.org/abs/2406.00328v1
- Date: Sat, 1 Jun 2024 07:03:40 GMT
- Title: Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation
- Authors: Alexander Munteanu, Simon Omlor,
- Abstract summary: We show that by augmenting the $ell_p$ sensitivities by $ell$ sensitivities, we obtain better bounds on optimal linear $tilde O(varepsilon-2mu2 d)$ sampling complexity.
We also obtain an $tilde O(varepsilon-2mu2 d)$ sensitivity sampling bound for logistic regression.
- Score: 56.403488578703865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data subsampling is one of the most natural methods to approximate a massively large data set by a small representative proxy. In particular, sensitivity sampling received a lot of attention, which samples points proportional to an individual importance measure called sensitivity. This framework reduces in very general settings the size of data to roughly the VC dimension $d$ times the total sensitivity $\mathfrak S$ while providing strong $(1\pm\varepsilon)$ guarantees on the quality of approximation. The recent work of Woodruff & Yasuda (2023c) improved substantially over the general $\tilde O(\varepsilon^{-2}\mathfrak Sd)$ bound for the important problem of $\ell_p$ subspace embeddings to $\tilde O(\varepsilon^{-2}\mathfrak S^{2/p})$ for $p\in[1,2]$. Their result was subsumed by an earlier $\tilde O(\varepsilon^{-2}\mathfrak Sd^{1-p/2})$ bound which was implicitly given in the work of Chen & Derezinski (2021). We show that their result is tight when sampling according to plain $\ell_p$ sensitivities. We observe that by augmenting the $\ell_p$ sensitivities by $\ell_2$ sensitivities, we obtain better bounds improving over the aforementioned results to optimal linear $\tilde O(\varepsilon^{-2}(\mathfrak S+d)) = \tilde O(\varepsilon^{-2}d)$ sampling complexity for all $p \in [1,2]$. In particular, this resolves an open question of Woodruff & Yasuda (2023c) in the affirmative for $p \in [1,2]$ and brings sensitivity subsampling into the regime that was previously only known to be possible using Lewis weights (Cohen & Peng, 2015). As an application of our main result, we also obtain an $\tilde O(\varepsilon^{-2}\mu d)$ sensitivity sampling bound for logistic regression, where $\mu$ is a natural complexity measure for this problem. This improves over the previous $\tilde O(\varepsilon^{-2}\mu^2 d)$ bound of Mai et al. (2021) which was based on Lewis weights subsampling.
Related papers
- Coresets for Multiple $\ell_p$ Regression [47.790126028106734]
We construct coresets of size $tilde O(varepsilon-2d)$ for $p2$ and $tilde O(varepsilon-pdp/2)$ for $p>2$.
For $1p2$, every matrix has a subset of $tilde O(varepsilon-1k)$ rows which spans a $(varepsilon-1k)$-approximately optimal $k$-dimensional subspace for $ell_p$ subspace approximation
arXiv Detail & Related papers (2024-06-04T15:50:42Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits.
For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Sharper Bounds for $\ell_p$ Sensitivity Sampling [56.45770306797584]
We show the first bounds for sensitivity sampling for $ell_p$ subspace embeddings for $p.
We also show that the root leverage score sampling algorithm achieves a bound of roughly $d$ for $1leq p2$, and that a combination of leverage score and sensitivity sampling achieves an improved bound of roughly $d2/pmathfrak S2-4/p$ for $2pinfty.
arXiv Detail & Related papers (2023-06-01T14:27:28Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(dmax1,p/2log2 n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - The Price of Tolerance in Distribution Testing [31.10049510641336]
We show the sample complexity to be [fracsqrtnvarepsilon2 + fracnlog n cdotmaxleftfracvarepsilon2, providing a smooth tradeoff between the two previously known cases.
We also provide a similar characterization for the problem of tolerant equivalence testing, where both $p$ and $q$ are unknown.
arXiv Detail & Related papers (2021-06-25T03:59:42Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy.
For small enough $epsilon$, we show an improved algorithm with sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.