Stochastic gradient descent for streaming linear and rectified linear
systems with Massart noise
- URL: http://arxiv.org/abs/2403.01204v1
- Date: Sat, 2 Mar 2024 12:45:01 GMT
- Title: Stochastic gradient descent for streaming linear and rectified linear
systems with Massart noise
- Authors: Halyun Jeong, Deanna Needell, Elizaveta Rebrova
- Abstract summary: We show novel nearly linear convergence guarantees of SGD-exp to the true parameter with up to $50\%$ Massart corruption rate.
This is the first convergence guarantee result for robust ReLU regression in the streaming setting.
- Score: 9.841406613646813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose SGD-exp, a stochastic gradient descent approach for linear and
ReLU regressions under Massart noise (adversarial semi-random corruption model)
for the fully streaming setting. We show novel nearly linear convergence
guarantees of SGD-exp to the true parameter with up to $50\%$ Massart
corruption rate, and with any corruption rate in the case of symmetric
oblivious corruptions. This is the first convergence guarantee result for
robust ReLU regression in the streaming setting, and it improves on the
convergence rates of previous robust methods for $L_1$ linear regression thanks
to the choice of an exponentially decaying step size, known for its efficiency
in practice. Our analysis is based on a drift analysis of a discrete stochastic
process, which may be of independent interest.
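To make the approach concrete, here is a minimal Python/NumPy sketch of an SGD-exp-style update, reconstructed only from the abstract: a streaming subgradient step on the absolute residual with an exponentially decaying step size. The normalization, the step-size constants, and the toy corruption process below are illustrative assumptions, not the paper's exact algorithm or analysis setting.

```python
import numpy as np

def sgd_exp(stream, d, step0=1.0, decay=0.9995):
    """Sketch of an SGD-exp-style update: for each streamed sample (a, y),
    take a normalized subgradient step on |<a, theta> - y| and shrink the
    step size geometrically.  step0, decay and the normalization are
    illustrative choices, not the schedule analyzed in the paper."""
    theta = np.zeros(d)
    step = step0
    for a, y in stream:                     # fully streaming: one sample at a time
        resid = float(a @ theta - y)
        theta -= step * np.sign(resid) * a / (np.linalg.norm(a) + 1e-12)
        step *= decay                       # exponentially decaying step size
    return theta

# Toy run: 30% of labels are replaced by an oblivious sign flip, a simple
# stand-in for Massart-type corruption.
rng = np.random.default_rng(0)
d, n = 10, 20000
theta_star = rng.normal(size=d)

def stream():
    for _ in range(n):
        a = rng.normal(size=d)
        y = float(a @ theta_star)
        if rng.random() < 0.3:
            y = -y
        yield a, y

theta_hat = sgd_exp(stream(), d)
print("estimation error:", np.linalg.norm(theta_hat - theta_star))
```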
Related papers
- Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Max-affine regression via first-order methods [7.12511675782289]
The max-affine model ubiquitously arises in applications in signal processing and statistics.
We present a non-asymptotic convergence analysis of gradient descent (GD) and mini-batch stochastic gradient descent (SGD) for max-affine regression.
arXiv Detail & Related papers (2023-08-15T23:46:44Z) - Fast Robust Kernel Regression through Sign Gradient Descent with Early Stopping [1.5229257192293204]
Kernel ridge regression, KRR, is a generalization of linear ridge regression that is non-linear in the data, but linear in the model parameters.
We introduce an equivalent formulation of the objective function of KRR, which opens up the possibility of replacing the ridge penalty with the $\ell_\infty$ and $\ell_1$ penalties.
arXiv Detail & Related papers (2023-06-29T10:29:29Z) - Gradient Descent Converges Linearly for Logistic Regression on Separable
Data [17.60502131429094]
We show that running gradient descent with a variable learning rate guarantees loss $f(x) \leq 1.1 \cdot f(x^*) + \epsilon$ for the logistic regression objective.
We also apply our ideas to sparse logistic regression, where they lead to an exponential improvement of the sparsity-error tradeoff.
arXiv Detail & Related papers (2023-06-26T02:15:26Z) - Near Optimal Private and Robust Linear Regression [47.2888113094367]
We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations.
Under label corruption, this is the first efficient linear regression algorithm to guarantee both $(\varepsilon,\delta)$-DP and robustness.
arXiv Detail & Related papers (2023-01-30T20:33:26Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, along with variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Robust Regression Revisited: Acceleration and Improved Estimation Rates [25.54653340884806]
We study fast algorithms for statistical regression problems under the strong contamination model.
The goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples.
We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees.
arXiv Detail & Related papers (2021-06-22T17:21:56Z) - Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient
Clipping [69.9674326582747]
We propose a new accelerated first-order method called clipped-SSTM for smooth convex optimization with heavy-tailed noise in the stochastic gradients.
We prove new complexity bounds that outperform state-of-the-art results in this case.
We also derive the first non-trivial high-probability complexity bounds for SGD with clipping without a light-tails assumption on the noise (a minimal clipping sketch follows this list).
arXiv Detail & Related papers (2020-05-21T17:05:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.