On the (Non-)Robustness of Two-Layer Neural Networks in Different
Learning Regimes
- URL: http://arxiv.org/abs/2203.11864v1
- Date: Tue, 22 Mar 2022 16:40:52 GMT
- Title: On the (Non-)Robustness of Two-Layer Neural Networks in Different
Learning Regimes
- Authors: Elvis Dohmatob, Alberto Bietti
- Abstract summary: Neural networks are highly sensitive to adversarial examples.
We study robustness and generalization in different scenarios.
We show how linearized lazy training regimes can worsen robustness.
- Score: 27.156666384752548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks are known to be highly sensitive to adversarial examples.
These may arise due to different factors, such as random initialization or
spurious correlations in the learning problem. To better understand these
factors, we provide a precise study of robustness and generalization in
different scenarios, from initialization to the end of training in different
regimes, as well as intermediate scenarios, where initialization still plays a
role due to "lazy" training. We consider over-parameterized networks in high
dimensions with quadratic targets and infinite samples. Our analysis allows us
to identify new trade-offs between generalization and robustness, whereby
robustness can only get worse when generalization improves, and vice versa. We
also show how linearized lazy training regimes can worsen robustness, due to
improperly scaled random initialization. Our theoretical results are
illustrated with numerical experiments.
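To make the "lazy" training regime referenced in the abstract concrete, here is a minimal sketch under illustrative assumptions (the output scale alpha, width m, quadratic target, and frozen second layer below are our choices, not the paper's exact model). With a large output scale, gradient descent barely moves the first-layer weights away from their random initialization, so the trained network stays close to its linearization around that initialization:

```python
import numpy as np

# Illustrative "lazy" training sketch: a two-layer ReLU network
#   f(x) = (alpha / sqrt(m)) * sum_j a_j * relu(w_j . x),
# trained on a simple quadratic target with plain gradient descent.
rng = np.random.default_rng(0)
d, m, n = 10, 200, 500          # input dim, hidden width, sample size
alpha = 10.0                    # large output scale pushes training lazy

X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sum(X**2, axis=1) - 1.0  # quadratic target (illustrative)

W0 = rng.standard_normal((m, d))     # first layer, trained
a = rng.choice([-1.0, 1.0], size=m)  # second layer, frozen

def f(W):
    return alpha / np.sqrt(m) * np.maximum(X @ W.T, 0.0) @ a

def grad(W):
    # gradient of the mean squared error (1/2n) * ||f(W) - y||^2
    r = f(W) - y                          # residuals, shape (n,)
    mask = (X @ W.T > 0).astype(float)    # ReLU derivative, shape (n, m)
    G = mask * (r[:, None] * a[None, :])  # per-sample, per-unit factors
    return alpha / np.sqrt(m) * (G.T @ X) / n

W = W0.copy()
lr = 0.5 / alpha**2                  # step size shrinks with the scale
for _ in range(1000):
    W -= lr * grad(W)

# In the lazy regime the relative weight movement is small (order 1/alpha),
# so the network is well approximated by its linearization at W0.
print("relative weight movement:",
      np.linalg.norm(W - W0) / np.linalg.norm(W0))
```

Rerunning with a small alpha (say 0.1) makes the weights travel much farther, which is the "rich" feature-learning regime; the abstract's point is that the lazy, linearized end of this spectrum can be less robust when the random initialization is improperly scaled.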
Related papers
- How connectivity structure shapes rich and lazy learning in neural
circuits [14.236853424595333]
We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
arXiv Detail & Related papers (2023-10-12T17:08:45Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Global quantitative robustness of regression feed-forward neural networks [0.0]
We adapt the notion of the regression breakdown point to regression neural networks.
We compare the performance, measured by the out-of-sample loss, with a proxy for the breakdown rate.
The results indeed motivate the use of robust loss functions for neural network training.
arXiv Detail & Related papers (2022-11-18T09:57:53Z)
- Learning Curves for Sequential Training of Neural Networks: Self-Knowledge Transfer and Forgetting [9.734033555407406]
We consider neural networks in the neural tangent kernel regime that continually learn target functions from task to task.
We investigate a variant of continual learning where the model learns the same target function in multiple tasks.
Even for the same target, the trained model shows some transfer and forgetting depending on the sample size of each task.
arXiv Detail & Related papers (2021-12-03T00:25:01Z)
- Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via the Polyak-Lojasiewicz condition, smoothness, and standard assumptions (the PL inequality is recalled below).
arXiv Detail & Related papers (2021-11-02T20:24:01Z)
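For readers unfamiliar with the condition named in the entry above, here is the standard Polyak-Lojasiewicz (PL) inequality, stated from common optimization references rather than from that paper's exact assumptions: a differentiable loss f with minimum f^* satisfies PL with constant mu > 0 when the gradient norm dominates the suboptimality everywhere.

```latex
\[
  f(\theta) - f^{\ast} \;\le\; \frac{1}{2\mu}\,\bigl\|\nabla f(\theta)\bigr\|^{2}
  \quad \text{for all } \theta .
\]
% Together with L-smoothness, PL yields linear convergence of gradient
% descent with step size 1/L:
\[
  f(\theta_t) - f^{\ast} \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{t}
  \bigl(f(\theta_0) - f^{\ast}\bigr) .
\]
```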
- Training Integrable Parameterizations of Deep Neural Networks in the Infinite-Width Limit [0.0]
Large-width dynamics has emerged as a fruitful viewpoint and led to practical insights on real-world deep networks.
For two-layer neural networks, it has been understood that the nature of the trained model radically changes depending on the scale of the initial random weights.
We propose various methods to avoid this trivial behavior and analyze in detail the resulting dynamics.
arXiv Detail & Related papers (2021-10-29T07:53:35Z)
- What training reveals about neural network complexity [80.87515604428346]
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
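As background for the entry above, the standard squared-loss bias-variance decomposition (a textbook identity, not necessarily the exact decomposition that paper uses) splits the expected error of a model trained on a random dataset D, evaluated at an adversarially perturbed input x' = x + delta:

```latex
\[
  \mathbb{E}_{D}\bigl[(\hat f_D(x') - y)^{2}\bigr]
  \;=\;
  \underbrace{\bigl(\mathbb{E}_{D}[\hat f_D(x')] - y\bigr)^{2}}_{\text{bias}^{2}}
  \;+\;
  \underbrace{\mathbb{E}_{D}\Bigl[\bigl(\hat f_D(x') - \mathbb{E}_{D}[\hat f_D(x')]\bigr)^{2}\Bigr]}_{\text{variance}} .
\]
```

Studying how the perturbation shifts each term is one way to make the entry's question ("bias or variance?") precise.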
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
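The last entry's idea of training a pair of networks, one that amplifies bias and one that learns from the first one's failures, can be sketched compactly. This is a minimal illustration in the spirit of that summary, not the paper's method: the tiny logistic models, the synthetic shortcut feature, the generalized cross-entropy exponent q, and the relative-difficulty weights are all our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 5
X = rng.standard_normal((n, d))
y = (X[:, 0] > 0).astype(float)          # "true" signal: feature 0
# spurious shortcut: feature 1 agrees with the label on 90% of samples
X[:, 1] = np.where(rng.random(n) < 0.9, 2 * y - 1, 1 - 2 * y)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ce(p, y, eps=1e-12):                 # per-sample cross-entropy
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

wb = np.zeros(d)   # "biased" model: GCE makes it latch onto easy shortcuts
wd = np.zeros(d)   # "debiased" model: upweights the biased model's failures
q, lr = 0.7, 0.5
for _ in range(2000):
    pb, pd = sigmoid(X @ wb), sigmoid(X @ wd)
    # generalized cross-entropy gradient: p_true^q * (p - y); easy samples
    # (large p_true) dominate, so the biased model amplifies the shortcut
    p_true = np.where(y == 1, pb, 1 - pb)
    wb -= lr * X.T @ ((p_true ** q) * (pb - y)) / n
    # relative difficulty: close to 1 where the biased model fails
    w_rel = ce(pb, y) / (ce(pb, y) + ce(pd, y) + 1e-12)
    wd -= lr * X.T @ (w_rel * (pd - y)) / n

# Expected tendency: the biased model leans on the shortcut (index 1),
# while the debiased model shifts weight toward the true signal (index 0).
print("biased  :", wb.round(2))
print("debiased:", wd.round(2))
```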
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.