Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions
- URL: http://arxiv.org/abs/2309.11657v2
- Date: Wed, 27 Sep 2023 20:57:00 GMT
- Title: Distribution-Independent Regression for Generalized Linear Models with
Oblivious Corruptions
- Authors: Ilias Diakonikolas, Sushrut Karmalkar, Jongho Park, Christos Tzamos
- Abstract summary: We show the first algorithms for the problem of regression for generalized linear models (GLMs) in the presence of additive oblivious noise.
We present an algorithm that tackles this problem in its most general distribution-independent setting.
This is the first algorithmic result for GLM regression with oblivious noise which can handle more than half the samples being arbitrarily corrupted.
- Score: 49.69852011882769
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We demonstrate the first algorithms for the problem of regression for
generalized linear models (GLMs) in the presence of additive oblivious noise.
We assume we have sample access to examples $(x, y)$ where $y$ is a noisy
measurement of $g(w^* \cdot x)$. In particular, the noisy labels are of the
form $y = g(w^* \cdot x) + \xi + \epsilon$, where $\xi$ is the oblivious noise
drawn independently of $x$ and satisfies $\Pr[\xi = 0] \geq o(1)$, and
$\epsilon \sim \mathcal{N}(0, \sigma^2)$. Our goal is to accurately recover a
parameter vector $w$ such that the function $g(w \cdot x)$ has arbitrarily
small error when compared to the true values $g(w^* \cdot x)$, rather than the
noisy measurements $y$.
We present an algorithm that tackles this problem in its most general
distribution-independent setting, where the solution may not even be
identifiable. Our algorithm returns an accurate estimate of the solution if it
is identifiable, and otherwise returns a small list of candidates, one of
which is close to the true solution. Furthermore, we provide a necessary and
sufficient condition for identifiability, which holds in broad settings.
Specifically, the problem is identifiable when the quantile at which
$\xi + \epsilon = 0$ is known, or when the family of hypotheses does not
contain candidates that are nearly equal to a translated $g(w^* \cdot x) + A$
for some real number $A$, while also having large error when compared to
$g(w^* \cdot x)$.
This is the first algorithmic result for GLM regression with oblivious noise
which can handle more than half the samples being arbitrarily corrupted. Prior
work focused largely on the setting of linear regression, and gave algorithms
under restrictive assumptions.
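To make the noise model concrete, here is a minimal data-generation sketch. It is not from the paper; the sigmoid link function, the Cauchy-distributed oblivious noise, and all parameter values are illustrative assumptions.

```python
# Illustrative sketch of the model y = g(w* . x) + xi + eps with oblivious noise xi.
# All choices below (link function, noise distributions, constants) are assumptions
# made for illustration, not the paper's experimental setup.
import numpy as np

rng = np.random.default_rng(0)

d, n = 10, 5000                      # dimension and sample size (arbitrary)
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)     # unknown parameter vector w*

def g(t):
    """Example GLM link function (sigmoid); the paper allows general activations."""
    return 1.0 / (1.0 + np.exp(-t))

alpha = 0.1                          # lower bound on Pr[xi = 0], i.e. the "clean" fraction
sigma = 0.05                         # standard deviation of the Gaussian noise eps

x = rng.normal(size=(n, d))          # covariates; the setting is distribution-independent
clean = rng.random(n) < alpha        # samples on which the oblivious noise is zero
xi = np.where(clean, 0.0,
              10.0 * rng.standard_cauchy(n))   # oblivious noise, drawn independently of x
eps = rng.normal(0.0, sigma, size=n)           # additive Gaussian noise

y = g(x @ w_star) + xi + eps         # observed noisy labels

# The algorithm's goal is to recover w so that g(w . x) is close to g(w* . x),
# not close to the (possibly heavily corrupted) observations y.
```

Note that more than half of the labels can be arbitrarily corrupted here (alpha = 0.1), which is exactly the regime the paper targets.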
Related papers
- Iterative thresholding for non-linear learning in the strong $\varepsilon$-contamination model [3.309767076331365]
We derive approximation bounds for learning single neuron models using thresholded descent.
We also study the linear regression problem, where $\sigma(\mathbf{x}) = \mathbf{x}$.
arXiv Detail & Related papers (2024-09-05T16:59:56Z) - Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting.
We provide a novel and near-optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1, \ldots, \mathbf{w}_k \in \mathbb{R}^d$.
Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
arXiv Detail & Related papers (2024-02-22T02:20:24Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - A spectral least-squares-type method for heavy-tailed corrupted
regression with unknown covariance \& heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming access to a corrupted $n$-sized label-feature sample with at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal, computationally tractable estimator, based on the power method, assuming no knowledge of $(\Sigma, \Xi)$ nor of the operator norm of $\Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z) - Robust Testing in High-Dimensional Sparse Models [0.0]
We consider the problem of robustly testing the norm of a high-dimensional sparse signal vector under two different observation models.
We show that any algorithm that reliably tests the norm of the regression coefficient requires at least $n = \Omega\left(\min(s \log d, 1/\gamma^4)\right)$ samples.
arXiv Detail & Related papers (2022-05-16T07:47:22Z) - Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
We give a sketching-based iterative algorithm that computes $(1+\varepsilon)$-approximate solutions for the ridge regression problem.
We also show that this algorithm can be used to give faster algorithms for kernel ridge regression.
arXiv Detail & Related papers (2022-04-13T22:18:47Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound of $O(d^{\max\{1, p/2\}} \log^2 n)$ for loss functions with at most degree-$p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm, sbcalg, that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $(\mathbf{x}, y)$ are generated where $y = \mathbf{x}^\top \Omega^* + \mathcal{N}(0,1)$ and $(\mathbf{x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}^d$.
arXiv Detail & Related papers (2020-06-17T09:21:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.