Surprises in adversarially-trained linear regression
- URL: http://arxiv.org/abs/2205.12695v1
- Date: Wed, 25 May 2022 11:54:42 GMT
- Title: Surprises in adversarially-trained linear regression
- Authors: Antônio H. Ribeiro and Dave Zachariah and Thomas B. Schön
- Abstract summary: Adversarial training is one of the most effective approaches to defend against adversarially constructed input perturbations.
We show that for linear regression problems, adversarial training can be formulated as a convex problem.
We show that for sufficiently many features or sufficiently small regularization parameters, the learned model perfectly interpolates the training data.
- Score: 12.33259114006129
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art machine learning models can be vulnerable to very small
input perturbations that are adversarially constructed. Adversarial training is
one of the most effective approaches to defend against such examples. We show
that for linear regression problems, adversarial training can be formulated as
a convex problem. This fact is then used to show that $\ell_\infty$-adversarial
training produces sparse solutions and has many similarities to the lasso
method. Similarly, $\ell_2$-adversarial training has similarities with ridge
regression. We use a robust regression framework to analyze and understand
these similarities and also point to some differences. Finally, we show how
adversarial training behaves differently from other regularization methods when
estimating overparameterized models (i.e., models with more parameters than
datapoints). It minimizes a sum of three terms which regularizes the solution,
but unlike lasso and ridge regression, it can sharply transition into an
interpolation mode. We show that for sufficiently many features or sufficiently
small regularization parameters, the learned model perfectly interpolates the
training data while still exhibiting good out-of-sample performance.
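The abstract mentions a convex formulation and "a sum of three terms" without writing them out. A minimal sketch of where these could come from, assuming the standard dual-norm identity for linear models: the worst-case squared error under a perturbation bounded by `delta` in the $\ell_p$ norm equals $(|y - x^\top\beta| + \delta\|\beta\|_q)^2$, with $q$ the dual order ($q=1$ for $\ell_\infty$ attacks, $q=2$ for $\ell_2$ attacks). Expanding the square gives three terms. The function names and test values below are illustrative and not taken from the paper.

```python
import itertools
import numpy as np


def adversarial_squared_error(x, y, beta, delta, q=1):
    """Closed-form worst-case squared error for a linear model.

    q is the dual-norm order: q=1 for l_inf-bounded attacks, q=2 for l_2.
    (Assumed identity, stated in the lead-in; names are hypothetical.)
    """
    residual = y - x @ beta
    return (abs(residual) + delta * np.linalg.norm(beta, ord=q)) ** 2


def brute_force_linf(x, y, beta, delta):
    """Maximize the squared error over the corners of the l_inf ball.

    The squared error is convex in the perturbation, so its maximum
    over a box is attained at one of the 2^d corners.
    """
    worst = 0.0
    for signs in itertools.product([-1.0, 1.0], repeat=len(x)):
        dx = delta * np.array(signs)
        worst = max(worst, (y - (x + dx) @ beta) ** 2)
    return worst


# Arbitrary illustrative values (not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=4)
beta = rng.normal(size=4)
y = 0.5
delta = 0.1

closed_form = adversarial_squared_error(x, y, beta, delta, q=1)
enumerated = brute_force_linf(x, y, beta, delta)

# Expanding the square: squared residual, a cross term, and a squared penalty.
residual = y - x @ beta
l1 = np.linalg.norm(beta, ord=1)
three_terms = residual**2 + 2 * delta * abs(residual) * l1 + (delta * l1) ** 2

print(closed_form, enumerated, three_terms)
```

Under this identity the per-sample loss is the square of a non-negative convex function of `beta`, hence convex, and the $\|\beta\|_1$ factor in the $\ell_\infty$ case hints at the lasso-like sparsity described in the abstract.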
Related papers
- Robust Capped lp-Norm Support Vector Ordinal Regression [85.84718111830752]
Ordinal regression is a specialized supervised problem where the labels show an inherent order.
Support Vector Ordinal Regression, as an outstanding ordinal regression model, is widely used in many ordinal regression tasks.
We introduce a new model, Capped $\ell_p$-Norm Support Vector Ordinal Regression (CSVOR), that is robust to outliers.
arXiv Detail & Related papers (2024-04-25T13:56:05Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Regularization properties of adversarially-trained linear regression [5.7077257711082785]
State-of-the-art machine learning models can be vulnerable to very small input perturbations.
Adversarial training is an effective approach to defend against them.
arXiv Detail & Related papers (2023-10-16T20:09:58Z) - Engression: Extrapolation through the Lens of Distributional Regression [2.519266955671697]
We propose a neural network-based distributional regression methodology called 'engression'.
An engression model is generative in the sense that we can sample from the fitted conditional distribution and is also suitable for high-dimensional outcomes.
We show that engression can successfully perform extrapolation under some assumptions such as monotonicity, whereas traditional regression approaches such as least-squares or quantile regression fall short under the same assumptions.
arXiv Detail & Related papers (2023-07-03T08:19:00Z) - Analysis of Interpolating Regression Models and the Double Descent
Phenomenon [3.883460584034765]
It is commonly assumed that models which interpolate noisy training data generalize poorly.
The best models obtained are overparametrized and the testing error exhibits the double descent behavior as the model order increases.
We derive a result based on the behavior of the smallest singular value of the regression matrix that explains the peak location and the double descent shape of the testing error as a function of model order.
arXiv Detail & Related papers (2023-04-17T09:44:33Z) - Theoretical Characterization of the Generalization Performance of
Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z) - Smoothly Giving up: Robustness for Simple Models [30.56684535186692]
Examples of algorithms to train such models include logistic regression and boosting.
We use a family of loss functions that tunes between canonical convex loss functions to robustly train such models.
We also provide results for boosting and for logistic regression on a COVID-19 dataset, highlighting the efficacy of the approach across multiple relevant domains.
arXiv Detail & Related papers (2023-02-17T19:48:11Z) - Least-Squares Linear Dilation-Erosion Regressor Trained using Stochastic
Descent Gradient or the Difference of Convex Methods [2.055949720959582]
We present a hybrid morphological neural network for regression tasks called linear dilation-erosion regression ($\ell$-DER).
An $\ell$-DER model is given by a convex combination of the composition of linear and morphological elementary operators.
arXiv Detail & Related papers (2021-07-12T18:41:59Z) - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing
Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in the NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z) - Positive-Congruent Training: Towards Regression-Free Model Updates [87.25247195148187]
In image classification, sample-wise inconsistencies appear as "negative flips":
a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model.
We propose a simple approach for PC training, Focal Distillation, which enforces congruence with the reference model.
arXiv Detail & Related papers (2020-11-18T09:00:44Z) - Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound which trades off between fully unlearning from erased data vs. not entirely forgetting the posterior belief.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z)