On Coresets For Regularized Regression
- URL: http://arxiv.org/abs/2006.05440v3
- Date: Tue, 30 Jun 2020 08:43:29 GMT
- Title: On Coresets For Regularized Regression
- Authors: Rachit Chhaya, Anirban Dasgupta, Supratim Shit
- Abstract summary: We analyze the size of coresets for regularized versions of regression of the form $\|\mathbf{Ax}-\mathbf{b}\|_p^r + \lambda\|\mathbf{x}\|_q^s$.
We show that when $r \neq s$, no coreset for regularized regression can have size smaller than the optimal coreset of the unregularized version.
- Score: 8.965836729525394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the effect of norm based regularization on the size of coresets for
regression problems. Specifically, given a matrix $ \mathbf{A} \in
{\mathbb{R}}^{n \times d}$ with $n\gg d$ and a vector $\mathbf{b} \in
\mathbb{R} ^ n $ and $\lambda > 0$, we analyze the size of coresets for
regularized versions of regression of the form $\|\mathbf{Ax}-\mathbf{b}\|_p^r
+ \lambda\|{\mathbf{x}}\|_q^s$ . Prior work has shown that for ridge regression
(where $p,q,r,s=2$) we can obtain a coreset that is smaller than the coreset
for the unregularized counterpart, i.e. least squares regression (Avron et al.).
We show that when $r \neq s$, no coreset for regularized regression can have
size smaller than the optimal coreset of the unregularized version. The well
known lasso problem falls under this category and hence does not allow a
coreset smaller than the one for least squares regression. We propose a
modified version of the lasso problem and obtain for it a coreset smaller than
that for least squares regression. We empirically show that the modified
version of lasso also induces sparsity in solution, similar to the original
lasso. We also obtain smaller coresets for $\ell_p$ regression with $\ell_p$
regularization. We extend our methods to multi-response regularized regression.
Finally, we empirically demonstrate the coreset performance for the modified
lasso and the $\ell_1$ regression with $\ell_1$ regularization.
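To make the object of study concrete, the following is a minimal sketch of a sampling coreset for ridge regression (the $p,q,r,s=2$ case mentioned above) built from ridge leverage scores. This is an illustration in the spirit of the leverage-score sampling literature, not the authors' construction; all function names and the demo parameters are ours.

```python
import numpy as np

def ridge_leverage_scores(A, lam):
    # Ridge leverage score of row i: a_i^T (A^T A + lam*I)^{-1} a_i.
    d = A.shape[1]
    G_inv = np.linalg.inv(A.T @ A + lam * np.eye(d))
    return np.einsum("ij,jk,ik->i", A, G_inv, A)

def coreset(A, b, lam, m, rng=np.random.default_rng(0)):
    # Sample m rows with probability proportional to ridge leverage
    # scores, and rescale so the sampled quadratic term is unbiased.
    scores = ridge_leverage_scores(A, lam)
    p = scores / scores.sum()
    idx = rng.choice(len(A), size=m, replace=True, p=p)
    w = 1.0 / np.sqrt(m * p[idx])  # row rescaling weights
    return A[idx] * w[:, None], b[idx] * w

# Tiny demo: ridge solution on the full data vs. on the coreset.
# The regularizer lam*||x||^2 does not depend on the rows, so it is
# kept exactly in the coreset problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))
b = A @ rng.standard_normal(5) + 0.1 * rng.standard_normal(2000)
lam = 1.0
x_full = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b)
As, bs = coreset(A, b, lam, m=400)
x_core = np.linalg.solve(As.T @ As + lam * np.eye(5), As.T @ bs)
```

The 400-row coreset stands in for the 2000-row problem; the solutions agree up to the sampling error, which shrinks as the coreset size grows.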
Related papers
- Accurate Coresets for Latent Variable Models and Regularized Regression [1.9567015559455132]
We introduce a unified framework for constructing accurate coresets.
We present accurate coreset construction algorithms for general problems.
We substantiate our theoretical findings with extensive experimental evaluations on real datasets.
arXiv Detail & Related papers (2024-12-28T16:01:49Z) - Conditional regression for the Nonlinear Single-Variable Model [4.565636963872865]
We consider a model $F(X):=f(\Pi_\gamma):\mathbb{R}^d\to[0,\mathrm{len}_\gamma]$ where $\Pi_\gamma:[0,\mathrm{len}_\gamma]\to\mathbb{R}^d$ and $f:[0,\mathrm{len}_\gamma]\to\mathbb{R}^1$.
We propose a nonparametric estimator, based on conditional regression, and show that it can achieve the one-dimensional optimal min-max rate.
arXiv Detail & Related papers (2024-11-14T18:53:51Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions [49.69852011882769]
We show the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise.
We present an algorithm that tackles this problem in its most general distribution-independent setting.
This is the first algorithmic result for GLM regression with oblivious noise that can handle more than half the samples being arbitrarily corrupted.
arXiv Detail & Related papers (2023-09-20T21:41:59Z) - In-Context Learning for Attention Scheme: from Single Softmax Regression
to Multiple Softmax Regression via a Tensor Trick [15.090593955414137]
We consider in-context learning under two formulations of attention-related regression in this work.
Our regression problem shares similarities with previous studies on softmax-related regression.
arXiv Detail & Related papers (2023-07-05T16:41:01Z) - Optimal Sketching Bounds for Sparse Linear Regression [116.30196615349226]
We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions.
We show that for sparse $\ell_2$ regression, there is an oblivious distribution over sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor.
We also show that sketching with $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows …
arXiv Detail & Related papers (2023-04-05T07:24:19Z) - Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $\mathcal{X}$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $\mathrm{width}(\mathcal{X}) \cdot \log|\mathcal{X}| / n$, where $\mathrm{width}(\mathcal{X})$ is the width of the poset.
We show that the bounds above are essentially the best that can be achieved.
arXiv Detail & Related papers (2022-10-27T05:08:07Z) - Hardness and Algorithms for Robust and Sparse Optimization [17.842787715567436]
We explore algorithms and limitations for sparse optimization problems such as sparse linear regression and robust linear regression.
Specifically, the sparse linear regression problem seeks a $k$-sparse vector $x\in\mathbb{R}^d$ to minimize $\|Ax-b\|$.
The robust linear regression problem seeks a set $S$ that ignores at most $k$ rows and a vector $x$ to minimize $\|(Ax-b)_S\|$.
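The robust objective above can be evaluated directly by dropping the $k$ largest residuals. A minimal sketch (the helper name `robust_obj` and the toy data are ours):

```python
import numpy as np

def robust_obj(A, b, x, k):
    # ||(Ax - b)_S||_2 where S ignores the k rows with largest residuals.
    r = np.abs(A @ x - b)
    keep = np.sort(r)[: len(r) - k]  # the n - k smallest residuals
    return np.linalg.norm(keep)

A = np.array([[1.0], [1.0], [1.0], [1.0]])
b = np.array([1.0, 1.1, 0.9, 100.0])  # one gross outlier
x = np.array([1.0])
print(robust_obj(A, b, x, k=1))  # small: the outlier row is ignored
print(robust_obj(A, b, x, k=0))  # large: the outlier dominates
```

Minimizing over both $S$ and $x$ jointly is what makes the problem hard; evaluating the objective for a fixed $x$, as here, is easy.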
arXiv Detail & Related papers (2022-06-29T01:40:38Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound of $O(d^{\max\{1,p/2\}}\log^2 n)$ for loss functions with at most degree-$p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Statistical Query Lower Bounds for List-Decodable Linear Regression [55.06171096484622]
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples.
Our main result is a Statistical Query (SQ) lower bound of $d^{\mathrm{poly}(1/\alpha)}$ for this problem.
arXiv Detail & Related papers (2021-06-17T17:45:21Z) - Truncated Linear Regression in High Dimensions [26.41623833920794]
In truncated linear regression, we observe pairs $(A_i, y_i)_i$ whose dependent variable equals $y_i = A_i^{\mathrm{T}} \cdot x^* + \eta_i$, where $x^*$ is some fixed unknown vector of interest.
The goal is to recover $x^*$ under some favorable conditions on the $A_i$'s and the noise distribution.
We prove that there exists a computationally and statistically efficient method for recovering $k$-sparse $n$-dimensional vectors $x^*$ from $m$ truncated samples.
arXiv Detail & Related papers (2020-07-29T00:31:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.