Beyond the Universal Law of Robustness: Sharper Laws for Random Features
and Neural Tangent Kernels
- URL: http://arxiv.org/abs/2302.01629v2
- Date: Sat, 27 May 2023 07:24:49 GMT
- Title: Beyond the Universal Law of Robustness: Sharper Laws for Random Features
and Neural Tangent Kernels
- Authors: Simone Bombari, Shayan Kiyani, Marco Mondelli
- Abstract summary: This paper focuses on empirical risk minimization in two settings, namely, random features and the neural tangent kernel (NTK)
We prove that, for random features, the model is not robust for any degree of over-parameterization, even when the necessary condition coming from the universal law of robustness is satisfied.
Our results are corroborated by numerical evidence on both synthetic and standard prototypical datasets.
- Score: 14.186776881154127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are vulnerable to adversarial perturbations, and a
thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon
through the lens of over-parameterization: smoothly interpolating the data
requires significantly more parameters than simply memorizing it. However, this
"universal" law provides only a necessary condition for robustness, and it is
unable to discriminate between models. In this paper, we address these gaps by
focusing on empirical risk minimization in two prototypical settings, namely,
random features and the neural tangent kernel (NTK). We prove that, for random
features, the model is not robust for any degree of over-parameterization, even
when the necessary condition coming from the universal law of robustness is
satisfied. In contrast, for even activations, the NTK model meets the universal
lower bound, and it is robust as soon as the necessary condition on
over-parameterization is fulfilled. This also addresses a conjecture in prior
work by Bubeck, Li and Nagaraj. Our analysis decouples the effect of the kernel
of the model from an "interaction matrix", which describes the interaction with
the test data and captures the effect of the activation. Our theoretical
results are corroborated by numerical evidence on both synthetic and standard
datasets (MNIST, CIFAR-10).
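As a toy illustration of the random-features setting (a minimal sketch under assumed synthetic data and a ReLU activation; not the paper's experiments), one can fit a ridgeless random-features regression and track the average input-gradient norm, a standard proxy for sensitivity to adversarial perturbations, as the number of features k grows. In line with the negative result above, this sensitivity should not be expected to vanish just because k increases.

```python
# Minimal random-features sketch (illustrative; not the paper's experimental setup).
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50                                  # training samples, input dimension
X = rng.standard_normal((n, d)) / np.sqrt(d)    # inputs roughly on the unit sphere
y = np.sign(X @ rng.standard_normal(d))         # synthetic binary labels

def mean_input_grad_norm(k):
    """Fit a ridgeless random-features model with k ReLU features and
    return the average input-gradient norm over the training points."""
    V = rng.standard_normal((k, d)) / np.sqrt(d)  # frozen random first layer
    Phi = np.maximum(X @ V.T, 0.0)                # feature map, shape (n, k)
    theta = np.linalg.pinv(Phi) @ y               # min-norm interpolating head
    grads = ((Phi > 0) * theta) @ V               # df/dx per sample, shape (n, d)
    return np.linalg.norm(grads, axis=1).mean()

for k in (100, 400, 1600, 6400):                  # increasing over-parameterization
    print(k, round(float(mean_input_grad_norm(k)), 3))
```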
Related papers
- SPIN: SE(3)-Invariant Physics Informed Network for Binding Affinity Prediction [3.406882192023597]
Accurate prediction of protein-ligand binding affinity is crucial for drug development.
Traditional methods often fail to accurately model the complex's spatial information.
We propose SPIN, a model that incorporates various inductive biases applicable to this task.
arXiv Detail & Related papers (2024-07-10T08:40:07Z)
- Learning a Sparse Neural Network using IHT [1.124958340749622]
As computational power for training neural networks (NNs) increases, so does the complexity of the models in terms of parameter count.
This paper relies on results from the domain of advanced sparse optimization, particularly those addressing nonlinear differentiable functions, and investigates whether the theoretical prerequisites for the convergence of iterative hard thresholding (IHT) hold in the realm of NN training (a generic IHT step is sketched after this entry).
arXiv Detail & Related papers (2024-04-29T04:10:22Z)
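For context on the entry above, a generic iterative hard thresholding step (a sketch of the textbook algorithm on a least-squares objective, not that paper's NN training procedure) alternates a gradient step with a projection onto s-sparse vectors:

```python
# Generic iterative hard thresholding (IHT) sketch on a sparse least-squares problem.
import numpy as np

rng = np.random.default_rng(1)
m, p, s = 100, 300, 10
A = rng.standard_normal((m, p))
x_true = np.zeros(p)
x_true[rng.choice(p, s, replace=False)] = rng.standard_normal(s)
b = A @ x_true                                   # noiseless measurements

def iht(A, b, s, step=1e-3, iters=500):
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - b)         # gradient step on 0.5*||Ax - b||^2
        keep = np.argsort(np.abs(x))[-s:]        # the s largest-magnitude entries
        pruned = np.zeros_like(x)
        pruned[keep] = x[keep]                   # hard threshold: enforce s-sparsity
        x = pruned
    return x

x_hat = iht(A, b, s)
print("recovery error:", float(np.linalg.norm(x_hat - x_true)))
```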
- The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning risks skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
- The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness [13.120373493503772]
We prove a surprising result: even if the ground truth itself is robust to adversarial examples, and the benignly overfitted model is benign in terms of the "standard" out-of-sample risk objective, the resulting model can fail to be robust against adversarial attacks.
Our finding provides theoretical insight into the puzzling phenomenon observed in practice, where the true target function (e.g., a human) is robust against adversarial attacks, while benignly overfitted neural networks are not (a toy linear-regression contrast is sketched after this entry).
arXiv Detail & Related papers (2024-01-19T15:40:46Z)
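To make the two objectives in the entry above concrete (a hedged toy computation under an assumed Gaussian linear model, not the paper's construction), one can fit a min-norm interpolator to noisy labels and evaluate both the standard test risk and an l2 adversarial risk; the paper's point is that the gap between these can be large even when the standard risk is benign:

```python
# Toy contrast between standard and adversarial test risk of a min-norm interpolator.
import numpy as np

rng = np.random.default_rng(2)
n, d = 100, 2000                                  # heavily over-parameterized regression
w_star = np.zeros(d)
w_star[0] = 1.0                                   # "robust" ground truth: small norm
X = rng.standard_normal((n, d))
y = X @ w_star + 0.5 * rng.standard_normal(n)     # noisy labels get interpolated
w_hat = X.T @ np.linalg.solve(X @ X.T, y)         # min-norm interpolator (benign fit)

Xt = rng.standard_normal((2000, d))               # fresh test inputs
clean_resid = np.abs(Xt @ (w_hat - w_star))
eps = 0.1                                         # l2 budget for the input perturbation
std_risk = np.mean(clean_resid ** 2)
# a worst-case perturbation of norm eps shifts the prediction by eps * ||w_hat||
adv_risk = np.mean((clean_resid + eps * np.linalg.norm(w_hat)) ** 2)
print(f"standard risk {std_risk:.3f}  adversarial risk {adv_risk:.3f}  "
      f"||w_hat|| {np.linalg.norm(w_hat):.2f}")
```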
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence that pruning can induce bias, the relationship between neural network pruning and induced bias is not well understood (a minimal magnitude-pruning sketch follows this entry).
arXiv Detail & Related papers (2023-04-25T07:42:06Z)
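Pruning in the sense described above can be made concrete with global magnitude pruning, a common baseline (a minimal sketch; the paper studies vision models and the bias such pruning induces):

```python
# Global magnitude pruning: zero the fraction `sparsity` of smallest-magnitude weights.
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """weights: list of parameter arrays; returns pruned copies (a sparse subnetwork)."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)     # one global cutoff for all tensors
    return [np.where(np.abs(w) >= threshold, w, 0.0) for w in weights]

rng = np.random.default_rng(3)
layers = [rng.standard_normal((64, 32)), rng.standard_normal((32, 10))]
pruned = magnitude_prune(layers, sparsity=0.9)
print([float((p != 0).mean()) for p in pruned])     # fraction of weights kept per layer
```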
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which can then be applied in real time to multi-dimensional scattering data (a toy parameter-recovery loop is sketched after this entry).
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
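The recover-parameters-from-data loop in the entry above can be shown in miniature (a sketch with an assumed toy forward model; the paper pairs a neural surrogate of a model Hamiltonian with automatic differentiation, whereas finite differences stand in for the gradient here):

```python
# Toy parameter recovery: tune theta so a differentiable forward model matches the data.
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 200)

def forward(theta):
    """Toy stand-in for a differentiable surrogate of the physics: damped oscillation."""
    freq, decay = theta
    return np.exp(-decay * t) * np.cos(2.0 * np.pi * freq * t)

theta_true = np.array([1.2, 0.8])
data = forward(theta_true) + 0.01 * rng.standard_normal(t.size)  # "experimental" data

def loss(theta):
    return np.mean((forward(theta) - data) ** 2)

theta = np.array([1.0, 0.5])                 # initial guess
for _ in range(2000):
    # central finite differences stand in for the autodiff gradient here
    g = np.array([(loss(theta + h) - loss(theta - h)) / 2e-4
                  for h in 1e-4 * np.eye(2)])
    theta -= 0.05 * g                        # plain gradient descent on the fit
print(theta)                                 # should land close to theta_true
```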
- Neural Abstractions [72.42530499990028]
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
We demonstrate that our approach performs comparably to the mature tool Flow* on existing benchmark nonlinear models.
arXiv Detail & Related papers (2023-01-27T12:38:09Z)
- Robust Neural Posterior Estimation and Statistical Model Criticism [1.5749416770494706]
We argue that modellers must treat simulators as idealistic representations of the true data generating process.
In this work we revisit neural posterior estimation (NPE), a class of algorithms that enable black-box parameter inference in simulation models.
We find that the presence of misspecification, in contrast, leads to unreliable inference when NPE is used naively.
arXiv Detail & Related papers (2022-10-12T20:06:55Z)
- Causally Estimating the Sensitivity of Neural NLP Models to Spurious Features [19.770032728328733]
There is no measure to evaluate or compare the effects of different forms of spurious features in NLP.
We quantify model sensitivity to spurious features with a causal estimand, dubbed CENT.
We find statistically significant inverse correlations between sensitivity and robustness, providing empirical support for our hypothesis.
arXiv Detail & Related papers (2021-10-14T05:26:08Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights (a toy joint-perturbation bound is sketched after this entry).
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
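The joint input-and-weight perturbations in the last entry can be illustrated on a linear model (a hedged sketch of the notion, not the paper's formalization): perturb x and w within small l2 balls and compare a closed-form bound on the output change with randomly sampled perturbations.

```python
# Joint robustness of a linear model f(x; w) = w @ x when both x and w are perturbed.
import numpy as np

rng = np.random.default_rng(5)
d = 30
w = rng.standard_normal(d)                        # model weights
x = rng.standard_normal(d)                        # test input
eps_x, eps_w = 0.1, 0.05                          # l2 budgets for input / weight noise

# Closed-form upper bound on |f(x+dx; w+dw) - f(x; w)|:
# |w@dx + dw@x + dw@dx| <= eps_x*||w|| + eps_w*||x|| + eps_x*eps_w
bound = eps_x * np.linalg.norm(w) + eps_w * np.linalg.norm(x) + eps_x * eps_w

def rand_sphere(eps):
    v = rng.standard_normal(d)
    return eps * v / np.linalg.norm(v)

worst = 0.0                                       # empirical check by random sampling
for _ in range(10_000):
    dx, dw = rand_sphere(eps_x), rand_sphere(eps_w)
    worst = max(worst, abs((w + dw) @ (x + dx) - w @ x))
print(f"sampled worst case {worst:.3f} <= bound {bound:.3f}")
```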