Related papers: Reduced Label Complexity For Tight $\ell

Reduced Label Complexity For Tight $\ell_2$ Regression

URL: http://arxiv.org/abs/2305.07486v1
Date: Fri, 12 May 2023 13:56:33 GMT
Title: Reduced Label Complexity For Tight $\ell_2$ Regression
Authors: Alex Gittens and Malik Magdon-Ismail
Abstract summary: Given data $rm XinmathbbRntimes d$ and labels $mathbfyinmathbbRn$ the goal is find $mathbfw-mathbfyVert2$. We give a algorithm that, emphoblivious to $mathbfy$, throws out $n/(d+sqrtn)$ data points and is a $(1+d/n)$-approximation to optimal in expectation
Score: 12.141102261633746
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Given data ${\rm X}\in\mathbb{R}^{n\times d}$ and labels $\mathbf{y}\in\mathbb{R}^{n}$ the goal is find $\mathbf{w}\in\mathbb{R}^d$ to minimize $\Vert{\rm X}\mathbf{w}-\mathbf{y}\Vert^2$. We give a polynomial algorithm that, \emph{oblivious to $\mathbf{y}$}, throws out $n/(d+\sqrt{n})$ data points and is a $(1+d/n)$-approximation to optimal in expectation. The motivation is tight approximation with reduced label complexity (number of labels revealed). We reduce label complexity by $\Omega(\sqrt{n})$. Open question: Can label complexity be reduced by $\Omega(n)$ with tight $(1+d/n)$-approximation?

Related papers

The Space Complexity of Approximating Logistic Loss [11.338399194998933]
We prove a general $tildeOmega(dcdot mu_mathbfy(mathbfX))$ space lower bound when $epsilon$ is constant. We also refute a prior conjecture that $mu_mathbfy(mathbfX)$ is hard to compute.
arXiv Detail & Related papers (2024-12-03T18:11:37Z)
Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise [38.551072383777594]
We study the problem of learning a single neuron with respect to the $L2$ loss in the presence of adversarial distribution shifts. A new algorithm is developed to approximate the vector vector squared loss with respect to the worst distribution that is in the $chi2$divergence to the $mathcalp_0$.
arXiv Detail & Related papers (2024-11-11T03:43:52Z)
The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $Omega(frac1kcdot n2log|mathbbF|)$. As an application, we obtain an $Omega(frac1kcdot n2log|mathbbF|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
In-depth Analysis of Low-rank Matrix Factorisation in a Federated Setting [21.002519159190538]
We analyze a distributed algorithm to compute a low-rank matrix factorization on $N$ clients. We obtain a global $mathbfV$ in $mathbbRd times r$ common to all clients and a local $mathbfUi$ in $mathbbRn_itimes r$.
arXiv Detail & Related papers (2024-09-13T12:28:42Z)
Efficient Continual Finite-Sum Minimization [52.5238287567572]
We propose a key twist into the finite-sum minimization, dubbed as continual finite-sum minimization. Our approach significantly improves upon the $mathcalO(n/epsilon)$ FOs that $mathrmStochasticGradientDescent$ requires. We also prove that there is no natural first-order method with $mathcalOleft(n/epsilonalpharight)$ complexity gradient for $alpha 1/4$, establishing that the first-order complexity of our method is nearly tight.
arXiv Detail & Related papers (2024-06-07T08:26:31Z)
Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting. We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $mathbfw_1,ldots,mathbfw_kin. Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z)
Optimal Estimator for Linear Regression with Shuffled Labels [17.99906229036223]
This paper considers the task of linear regression with shuffled labels. $mathbf Y in mathbb Rntimes m, mathbf Pi in mathbb Rntimes p, mathbf B in mathbb Rptimes m$, and $mathbf Win mathbb Rntimes m$, respectively.
arXiv Detail & Related papers (2023-10-02T16:44:47Z)
$\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits. For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z)
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations. The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $mathbbRd$ with Massart noise under Gaussian marginals. Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z)
The planted matching problem: Sharp threshold and infinite-order phase transition [25.41713098167692]
We study the problem of reconstructing a perfect matching $M*$ hidden in a randomly weighted $ntimes n$ bipartite graph. We show that if $sqrtd B(mathcalP,mathcalQ) ge 1+epsilon$ for an arbitrarily small constant $epsilon>0$, the reconstruction error for any estimator is shown to be bounded away from $0$.
arXiv Detail & Related papers (2021-03-17T00:59:33Z)
Linear Bandits on Uniformly Convex Sets [88.3673525964507]
Linear bandit algorithms yield $tildemathcalO(nsqrtT)$ pseudo-regret bounds on compact convex action sets. Two types of structural assumptions lead to better pseudo-regret bounds.
arXiv Detail & Related papers (2021-03-10T07:33:03Z)
On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n2)$-upper bound under the global oracle model. Following a similar approach, we propose a novel adaptation of SVRG which is both emphcompatible with oracles, and achieves complexity bounds of $tildeO(n2+nsqrtL/mu)log (1/epsilon)$ and $O(nsqrtL/epsilon)$, for $mu>0$ and $mu=0$
arXiv Detail & Related papers (2020-02-09T03:39:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.