Adversarial Learning Guarantees for Linear Hypotheses and Neural
Networks
- URL: http://arxiv.org/abs/2004.13617v1
- Date: Tue, 28 Apr 2020 15:55:16 GMT
- Title: Adversarial Learning Guarantees for Linear Hypotheses and Neural
Networks
- Authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri
- Abstract summary: We give upper and lower bounds for the adversarial empirical Rademacher complexity of linear hypotheses.
We extend our analysis to provide Rademacher complexity lower and upper bounds for a single ReLU unit.
Finally, we give adversarial Rademacher complexity bounds for feed-forward neural networks with one hidden layer.
- Score: 45.06091849856641
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial or test time robustness measures the susceptibility of a
classifier to perturbations to the test input. While there has been a flurry of
recent work on designing defenses against such perturbations, the theory of
adversarial robustness is not well understood. In order to make progress on
this, we focus on the problem of understanding generalization in adversarial
settings, via the lens of Rademacher complexity. We give upper and lower bounds
for the adversarial empirical Rademacher complexity of linear hypotheses with
adversarial perturbations measured in $l_r$-norm for an arbitrary $r \geq 1$.
This generalizes the recent result of [Yin et al.'19] that studies the case of
$r = \infty$, and provides a finer analysis of the dependence on the input
dimensionality as compared to the recent work of [Khim and Loh'19] on linear
hypothesis classes.
We then extend our analysis to provide Rademacher complexity lower and upper
bounds for a single ReLU unit. Finally, we give adversarial Rademacher
complexity bounds for feed-forward neural networks with one hidden layer.
Unlike previous works we directly provide bounds on the adversarial Rademacher
complexity of the given network, as opposed to a bound on a surrogate. A
by-product of our analysis also leads to tighter bounds for the Rademacher
complexity of linear hypotheses, for which we give a detailed analysis and
present a comparison with existing bounds.
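For linear hypotheses $f_w(x) = \langle w, x \rangle$, the inner adversarial minimization over an $l_r$-ball has a standard closed form via the dual norm: $\inf_{\|\delta\|_r \leq \epsilon} y \langle w, x + \delta \rangle = y \langle w, x \rangle - \epsilon \|w\|_{r^*}$, where $1/r + 1/r^* = 1$ and $y \in \{-1, +1\}$. A minimal numerical sketch of this identity for $r = 2$ (not the paper's code; function names are illustrative), checked against brute-force sampling:

```python
import numpy as np

rng = np.random.default_rng(0)


def adversarial_margin_closed_form(w, x, y, eps, r):
    """inf of y*<w, x+delta> over ||delta||_r <= eps, via the dual norm."""
    r_dual = np.inf if r == 1 else r / (r - 1)
    return y * (w @ x) - eps * np.linalg.norm(w, ord=r_dual)


def adversarial_margin_brute_force(w, x, y, eps, r, n_samples=200_000):
    """Approximate the same infimum by sampling perturbations on the
    radius-eps l_r sphere (the linear objective attains its inf there)."""
    deltas = rng.standard_normal((n_samples, len(w)))
    norms = np.linalg.norm(deltas, ord=r, axis=1, keepdims=True)
    deltas = eps * deltas / norms
    return (y * ((x + deltas) @ w)).min()


w = rng.standard_normal(5)
x = rng.standard_normal(5)
y, eps, r = 1.0, 0.3, 2

exact = adversarial_margin_closed_form(w, x, y, eps, r)
approx = adversarial_margin_brute_force(w, x, y, eps, r)
print(exact, approx)
```

Since the brute-force search ranges over a finite subset of the perturbation ball, its minimum can only approach the closed-form infimum from above.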
Related papers
- On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds [11.30047438005394]
This work investigates the question of how to choose the regularization norm $\lVert \cdot \rVert$ in the context of high-dimensional adversarial training for binary classification.
We quantitatively characterize the relationship between perturbation size and the optimal choice of $\lVert \cdot \rVert$, confirming the intuition that, in the data-scarce regime, the type of regularization becomes increasingly important for adversarial training as perturbations grow in size.
arXiv Detail & Related papers (2024-10-21T14:53:12Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Generalization Analysis for Contrastive Representation Learning [80.89690821916653]
Existing generalization error bounds depend linearly on the number $k$ of negative examples.
We establish novel generalization bounds for contrastive learning which do not depend on $k$, up to logarithmic terms.
arXiv Detail & Related papers (2023-02-24T01:03:56Z) - On the Hardness of Robustness Transfer: A Perspective from Rademacher
Complexity over Symmetric Difference Hypothesis Space [33.25614346461152]
We analyze a key complexity measure that controls the cross-domain generalization: the adversarial Rademacher complexity over the symmetric difference hypothesis space.
For linear models, we show that the adversarial version of this complexity is always greater than the non-adversarial one.
Even though robust domain adaptation is provably harder, we do find a positive relation between robust learning and standard domain adaptation.
arXiv Detail & Related papers (2023-02-23T22:15:20Z) - Adversarial Rademacher Complexity of Deep Neural Networks [29.571059373990888]
A robust model shall perform well on both the perturbed training data and the unseen perturbed test data.
We provide the first bound of adversarial Rademacher complexity of deep neural networks.
arXiv Detail & Related papers (2022-11-27T23:24:37Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - On the Rademacher Complexity of Linear Hypothesis Sets [45.06091849856641]
We present a tight analysis of the empirical Rademacher complexity of the family of linear hypothesis classes with weight vectors bounded in $\ell_p$-norm for any $p \geq 1$.
This provides a tight analysis of generalization using these hypothesis sets and helps derive sharp data-dependent learning guarantees.
arXiv Detail & Related papers (2020-07-21T19:08:21Z)
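The last entry above concerns the non-adversarial baseline. For $p = 2$ and weight bound $\Lambda$, the well-known identity $\hat{\mathfrak{R}}_S(\mathcal{H}) = \frac{\Lambda}{m} \mathbb{E}_\sigma \big\|\sum_i \sigma_i x_i\big\|_2 \leq \frac{\Lambda \max_i \|x_i\|_2}{\sqrt{m}}$ can be sanity-checked by Monte Carlo. A generic sketch on synthetic data (not code from any of the papers listed):

```python
import numpy as np

rng = np.random.default_rng(1)

m, d, Lam = 50, 10, 1.0
X = rng.standard_normal((m, d))  # synthetic sample of m points in R^d

# For each sigma draw, sup over ||w||_2 <= Lam of sum_i sigma_i <w, x_i>
# equals Lam * ||sum_i sigma_i x_i||_2 (Cauchy-Schwarz is tight).
sigmas = rng.choice([-1.0, 1.0], size=(20_000, m))
sup_values = Lam * np.linalg.norm(sigmas @ X, axis=1)
rademacher = sup_values.mean() / m

# Standard upper bound: Lam * max_i ||x_i||_2 / sqrt(m)
bound = Lam * np.linalg.norm(X, axis=1).max() / np.sqrt(m)
print(rademacher, bound)
```

The Monte Carlo estimate should land strictly below the bound, since the bound passes through Jensen's inequality and replaces the root-mean-square norm of the sample by its maximum.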
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.