L1 Regression with Lewis Weights Subsampling
- URL: http://arxiv.org/abs/2105.09433v1
- Date: Wed, 19 May 2021 23:15:00 GMT
- Title: L1 Regression with Lewis Weights Subsampling
- Authors: Aditya Parulekar, Advait Parulekar, Eric Price
- Abstract summary: We show that sampling from $X$ according to its Lewis weights and outputting the empirical minimizer succeeds with probability $1-\delta$ for $m > O(\frac{1}{\varepsilon^2} d \log \frac{d}{\varepsilon \delta})$.
We also give a corresponding lower bound of $\Omega(\frac{d}{\varepsilon^2} + (d + \frac{1}{\varepsilon^2}) \log\frac{1}{\delta})$.
- Score: 10.796388585158184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of finding an approximate solution to $\ell_1$
regression while only observing a small number of labels. Given an $n \times d$
unlabeled data matrix $X$, we must choose a small set of $m \ll n$ rows to
observe the labels of, then output an estimate $\widehat{\beta}$ whose error on
the original problem is within a $1 + \varepsilon$ factor of optimal. We show
that sampling from $X$ according to its Lewis weights and outputting the
empirical minimizer succeeds with probability $1-\delta$ for $m >
O(\frac{1}{\varepsilon^2} d \log \frac{d}{\varepsilon \delta})$. This is
analogous to the performance of sampling according to leverage scores for
$\ell_2$ regression, but with exponentially better dependence on $\delta$. We
also give a corresponding lower bound of $\Omega(\frac{d}{\varepsilon^2} + (d +
\frac{1}{\varepsilon^2}) \log\frac{1}{\delta})$.
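For concreteness, here is a minimal sketch (not the authors' code) of the procedure the abstract describes: approximate the $\ell_1$ Lewis weights of $X$ with a fixed-point iteration in the style of Cohen and Peng, sample $m$ rows with probability proportional to those weights, observe only the sampled labels through a hypothetical `label_oracle`, and solve the importance-weighted $\ell_1$ problem as a linear program. The iteration count, the SciPy LP formulation, and the oracle interface are illustrative assumptions, not details from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def lewis_weights_l1(X, num_iters=30):
    """Approximate l1 Lewis weights via the fixed-point iteration
    w_i <- sqrt(x_i^T (X^T diag(1/w) X)^{-1} x_i)  (Cohen-Peng style)."""
    n, d = X.shape
    w = np.full(n, d / n)                       # any positive start works
    for _ in range(num_iters):
        M = X.T @ (X / w[:, None])              # X^T diag(1/w) X
        Minv = np.linalg.inv(M)
        w = np.sqrt(np.einsum("ij,jk,ik->i", X, Minv, X))
    return w                                    # sums to ~d at the fixed point


def l1_regression_lewis_sampling(X, label_oracle, m, seed=0):
    """Sample m rows proportionally to Lewis weights, query only those
    labels, and minimize the importance-weighted l1 loss as an LP."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = lewis_weights_l1(X)
    p = w / w.sum()
    idx = rng.choice(n, size=m, replace=True, p=p)
    scale = 1.0 / (m * p[idx])                  # l1 importance weights
    Xs = X[idx] * scale[:, None]
    ys = label_oracle(idx) * scale              # labels observed only here
    # LP for min_beta sum_j |<Xs_j, beta> - ys_j| with variables (beta, t):
    # minimize sum(t)  s.t.  Xs beta - t <= ys  and  -Xs beta - t <= -ys.
    c = np.concatenate([np.zeros(d), np.ones(m)])
    A = np.block([[Xs, -np.eye(m)], [-Xs, -np.eye(m)]])
    b = np.concatenate([ys, -ys])
    res = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * (d + m))
    return res.x[:d]
```

Per the theorem, taking $m > O(\frac{1}{\varepsilon^2} d \log \frac{d}{\varepsilon \delta})$ makes the returned $\widehat{\beta}$ a $(1+\varepsilon)$-approximate minimizer of the full $\ell_1$ objective with probability $1-\delta$.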
Related papers
- Coresets for Multiple $\ell_p$ Regression [47.790126028106734]
We construct coresets of size $\tilde O(\varepsilon^{-2}d)$ for $p \le 2$ and $\tilde O(\varepsilon^{-p}d^{p/2})$ for $p > 2$.
For $1 \le p \le 2$, every matrix has a subset of $\tilde O(\varepsilon^{-1}k)$ rows which spans a $(1+\varepsilon)$-approximately optimal $k$-dimensional subspace for $\ell_p$ subspace approximation.
arXiv Detail & Related papers (2024-06-04T15:50:42Z) - Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions [49.69852011882769]
We show the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise.
We present an algorithm that tackles this problem in its most general distribution-independent setting.
This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits.
For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Optimal Sketching Bounds for Sparse Linear Regression [116.30196615349226]
We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions.
We show that for sparse $\ell_2$ regression, there is an oblivious distribution over sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor.
We also show that sketching $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ ...
arXiv Detail & Related papers (2023-04-05T07:24:19Z) - Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic
Regression [74.28017932704704]
We improve upon previous oblivious sketching and turnstile streaming results for $\ell_1$ and logistic regression.
We also give a tradeoff that yields a $1+\varepsilon$ approximation in input sparsity time.
Our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.
arXiv Detail & Related papers (2023-03-31T18:12:33Z) - A spectral least-squares-type method for heavy-tailed corrupted
regression with unknown covariance & heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming a corrupted label-feature sample of size $n$ with at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge of $(\Sigma, \Xi)$ nor the operator norm of $\Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z) - Online Lewis Weight Sampling [62.38157566916501]
Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community.
Several works have extended this important primitive to other settings, including the online coreset, sliding window, and adversarial streaming models.
We design the first nearly optimal $\ell_p$ subspace embeddings for all $p \in (0,\infty)$ in the online coreset, sliding window, and adversarial streaming models.
arXiv Detail & Related papers (2022-07-17T19:40:51Z) - Robust Testing in High-Dimensional Sparse Models [0.0]
We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models.
We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n = \Omega\left(\min(s\log d,\, 1/\gamma^4)\right)$ samples.
arXiv Detail & Related papers (2022-05-16T07:47:22Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound of $O(d^{\max\{1,p/2\}}\log^2 n)$ for loss functions with at most degree-$p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $(\mathbf{x}, y)$ are generated where $y = \mathbf{x}^\top \Omega^* + \mathcal{N}(0,1)$ and $(\mathbf{x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}^d$.
arXiv Detail & Related papers (2020-06-17T09:21:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.