ReLU Regression with Massart Noise
- URL: http://arxiv.org/abs/2109.04623v1
- Date: Fri, 10 Sep 2021 02:13:22 GMT
- Title: ReLU Regression with Massart Noise
- Authors: Ilias Diakonikolas, Jongho Park, Christos Tzamos
- Abstract summary: We study the fundamental problem of ReLU regression, where the goal is to fit Rectified Linear Units (ReLUs) to data.
We focus on ReLU regression in the Massart noise model, a natural and well-studied semi-random noise model.
We develop an efficient algorithm that achieves exact parameter recovery in this model.
- Score: 52.10842036932169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the fundamental problem of ReLU regression, where the goal is to fit
Rectified Linear Units (ReLUs) to data. This supervised learning task is
efficiently solvable in the realizable setting, but is known to be
computationally hard with adversarial label noise. In this work, we focus on
ReLU regression in the Massart noise model, a natural and well-studied
semi-random noise model. In this model, the label of every point is generated
according to a function in the class, but an adversary is allowed to change
this value arbitrarily with some probability, which is {\em at most} $\eta <
1/2$. We develop an efficient algorithm that achieves exact parameter recovery
in this model under mild anti-concentration assumptions on the underlying
distribution. Such assumptions are necessary for exact recovery to be
information-theoretically possible. We demonstrate that our algorithm
significantly outperforms naive applications of $\ell_1$ and $\ell_2$
regression on both synthetic and real data.
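To make the setting concrete, below is a minimal sketch of the Massart-noise data model described in the abstract, together with a naive $\ell_2$-regression baseline of the kind the paper compares against. The parameter choices (`w_star`, `eta`, the corruption rule) are illustrative assumptions; this is not the paper's recovery algorithm.

```python
# A minimal sketch (not the paper's algorithm) of ReLU regression under
# Massart noise: labels are realizable, but each one may be corrupted
# arbitrarily with some point-dependent probability that is at most eta < 1/2.
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 5000
eta = 0.3                                  # assumed Massart noise bound
w_star = rng.standard_normal(d)            # ground-truth parameter

# Gaussian covariates satisfy the kind of anti-concentration assumption
# under which exact recovery is information-theoretically possible.
X = rng.standard_normal((n, d))
y_clean = np.maximum(X @ w_star, 0.0)      # realizable labels ReLU(<w_star, x>)

# Massart adversary: corrupt each label independently with probability
# p(x) <= eta; the corrupted value itself is arbitrary (here, a crude choice).
p_corrupt = eta * rng.uniform(size=n)
corrupted = rng.uniform(size=n) < p_corrupt
y = np.where(corrupted, -y_clean - 1.0, y_clean)

# Naive l2-regression baseline: gradient descent on the squared loss of the
# ReLU model.  Under Massart noise this estimator is generally pulled away
# from w_star, which is the failure mode the paper's algorithm avoids.
w = 0.01 * rng.standard_normal(d)
for _ in range(2000):
    pred = np.maximum(X @ w, 0.0)
    grad = ((pred - y) * (X @ w > 0)) @ X / n
    w -= 0.1 * grad

print("Distance of naive l2 fit from w_star:", np.linalg.norm(w - w_star))
```

The paper's contribution is an efficient algorithm that recovers the parameter exactly in this model; the gradient-descent fit above only illustrates the baseline behavior it is measured against.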
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption [60.958746600254884]
This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL).
We introduce an algorithm called corruption-robust optimistic MLE (CR-OMLE), which leverages total-variation (TV)-based information ratios as uncertainty weights for MLE.
We extend our weighting technique to the offline setting and propose an algorithm named corruption-robust pessimistic MLE (CR-PMLE).
arXiv Detail & Related papers (2024-02-14T07:27:30Z)
- Generalized Oversampling for Learning from Imbalanced datasets and Associated Theory [0.0]
In supervised learning, one is frequently confronted with real-world imbalanced datasets.
We propose a data augmentation procedure, the GOLIATH algorithm, based on kernel density estimates.
We evaluate the performance of the GOLIATH algorithm in imbalanced regression situations.
arXiv Detail & Related papers (2023-08-05T23:08:08Z)
- Smoothly Giving up: Robustness for Simple Models [30.56684535186692]
Examples of algorithms to train such models include logistic regression and boosting.
We use a tunable family of loss functions, which interpolates between canonical convex losses, to robustly train such models.
We also provide results for boosting and logistic regression on a COVID-19 dataset, highlighting the efficacy of the approach across multiple relevant domains.
arXiv Detail & Related papers (2023-02-17T19:48:11Z)
- Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels [56.81761908354718]
We propose a novel theoretically guaranteed clean sample selection framework for learning with noisy labels.
Knockoffs-SPR can be regarded as a sample selection module for a standard supervised training pipeline.
We further combine it with a semi-supervised algorithm to exploit the support of noisy data as unlabeled data.
arXiv Detail & Related papers (2023-01-02T07:13:28Z)
- Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels [44.79124350922491]
We propose using a theoretically guaranteed noisy label detection framework to detect and remove noisy data for Learning with Noisy Labels (LNL).
Specifically, we design a penalized regression to model the linear relation between network features and one-hot labels.
To make the framework scalable to datasets that contain a large number of categories and training data, we propose a split algorithm to divide the whole training set into small pieces.
arXiv Detail & Related papers (2022-03-15T11:09:58Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
- Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound which trades off between fully unlearning from erased data vs. not entirely forgetting the posterior belief.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z)
- Regression via Implicit Models and Optimal Transport Cost Minimization [5.144809478361603]
Conditional GAN (CGAN) has been applied for regression.
Current CGAN implementation for regression uses the classical generator-discriminator architecture.
We propose a solution that directly optimizes the optimal transport cost between the true probability distribution $p(y|x)$ and the estimated distribution $\hat{p}(y|x)$.
arXiv Detail & Related papers (2020-03-03T02:26:54Z)
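The last entry above fits a regression model by minimizing an optimal transport cost between $p(y|x)$ and $\hat{p}(y|x)$. As a generic one-dimensional illustration of such a cost (a toy sketch with arbitrary sample data, not that paper's implementation), the 1-Wasserstein distance between two empirical samples can be computed directly:

```python
# Toy illustration of an optimal-transport (1-Wasserstein) cost between an
# empirical conditional label distribution and a model's estimate of it.
# Generic 1-D example; the distributions and sample sizes are arbitrary.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Samples standing in for y ~ p(y|x) and y_hat ~ \hat{p}(y|x) at a fixed x.
y_true = rng.normal(loc=2.0, scale=1.0, size=1000)
y_model = rng.normal(loc=2.5, scale=1.2, size=1000)

# In 1-D, the W1 cost between equal-size, uniformly weighted samples reduces
# to the mean absolute difference of the sorted samples; scipy computes the
# same quantity from the empirical CDFs.
w1_sorted = np.mean(np.abs(np.sort(y_true) - np.sort(y_model)))
w1_scipy = wasserstein_distance(y_true, y_model)

print(f"W1 via sorted samples: {w1_sorted:.4f}")
print(f"W1 via scipy:          {w1_scipy:.4f}")
```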
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.