Approximation Schemes for ReLU Regression
- URL: http://arxiv.org/abs/2005.12844v2
- Date: Mon, 28 Sep 2020 18:08:38 GMT
- Title: Approximation Schemes for ReLU Regression
- Authors: Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans,
Mahdi Soltanolkotabi
- Abstract summary: We consider the fundamental problem of ReLU regression.
The goal is to output the best fitting ReLU with respect to square loss, given access to draws from some unknown distribution.
- Score: 80.33702497406632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the fundamental problem of ReLU regression, where the goal is to
output the best fitting ReLU with respect to square loss given access to draws
from some unknown distribution. We give the first efficient, constant-factor
approximation algorithm for this problem assuming the underlying distribution
satisfies some weak concentration and anti-concentration conditions (and
includes, for example, all log-concave distributions). This solves the main
open problem of Goel et al., who proved hardness results for any exact
algorithm for ReLU regression (up to an additive $\epsilon$). Using more
sophisticated techniques, we can improve our results and obtain a
polynomial-time approximation scheme for any subgaussian distribution. Given
the aforementioned hardness results, these guarantees cannot be substantially
improved.
Our main insight is a new characterization of surrogate losses for nonconvex
activations. While prior work had established the existence of convex
surrogates for monotone activations, we show that properties of the underlying
distribution actually induce strong convexity for the loss, allowing us to
relate the global minimum to the activation's Chow parameters.
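As a concrete sketch in standard notation (the paper's exact definitions and surrogate construction may differ), ReLU regression under square loss asks for
$$\min_{w \in \mathbb{R}^d} \; \mathbb{E}_{(x,y) \sim D}\big[(\mathrm{ReLU}(w \cdot x) - y)^2\big], \qquad \mathrm{ReLU}(z) = \max(0, z),$$
given access to i.i.d. draws $(x,y) \sim D$. The degree-1 Chow parameters of the labels are the correlation vector $\mathbb{E}_{(x,y) \sim D}[y\,x]$, and one classical convex surrogate for a monotone activation is
$$L_{\mathrm{sur}}(w) \;=\; \mathbb{E}_{(x,y) \sim D}\big[\tfrac{1}{2}\,\mathrm{ReLU}(w \cdot x)^2 - y\,(w \cdot x)\big],$$
whose gradient $\mathbb{E}[(\mathrm{ReLU}(w \cdot x) - y)\,x]$ vanishes precisely when the Chow parameters of the fitted ReLU match those of the labels.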
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - A Pseudo-Semantic Loss for Autoregressive Models with Logical
Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t. the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z) - A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of the last iterate of the method decreases at a rate of $O(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z) - Agnostic Learning of General ReLU Activation Using Gradient Descent [38.28136172081834]
We consider the more challenging scenario when the bias of the ReLU function is non-zero.
We find a ReLU function whose error is within a constant factor of the error of the best ReLU function with moderate bias.
arXiv Detail & Related papers (2022-08-04T15:14:05Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient
for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Sparsest Univariate Learning Models Under Lipschitz Constraint [31.28451181040038]
We propose continuous-domain formulations for one-dimensional regression problems.
We control the Lipschitz constant explicitly using a user-defined upper bound.
We show that both problems admit global minimizers that are continuous and piecewise-linear.
arXiv Detail & Related papers (2021-12-27T07:03:43Z) - Online and Distribution-Free Robustness: Regression and Contextual
Bandits with Huber Contamination [29.85468294601847]
We revisit two classic high-dimensional online learning problems, namely linear regression and contextual bandits.
We show that our algorithms succeed where conventional methods fail.
arXiv Detail & Related papers (2020-10-08T17:59:05Z) - A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new algorithm for linear regression, running in linear to quadratic time, that avoids strong assumptions on the underlying distribution of the samples.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z) - Cumulant GAN [17.4556035872983]
We propose a novel loss function for training Generative Adversarial Networks (GANs).
We show that the corresponding optimization problem is equivalent to Rényi divergence minimization.
We experimentally demonstrate that image generation is more robust relative to Wasserstein GAN.
arXiv Detail & Related papers (2020-06-11T17:23:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.