Related papers: Adversarial Examples in Multi-Layer Random ReLU Networks

Adversarial Examples in Multi-Layer Random ReLU Networks

URL: http://arxiv.org/abs/2106.12611v1
Date: Wed, 23 Jun 2021 18:16:34 GMT
Title: Adversarial Examples in Multi-Layer Random ReLU Networks
Authors: Peter L. Bartlett, S\'ebastien Bubeck and Yeshwanth Cherapanamjeri
Abstract summary: adversarial examples arise in ReLU networks with independent gaussian parameters. Bottleneck layers in the network play a key role: the minimal width up to some point determines scales and sensitivities of mappings computed up to that point.
Score: 39.797621513256026
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the phenomenon of adversarial examples in ReLU networks with independent gaussian parameters. For networks of constant depth and with a large range of widths (for instance, it suffices if the width of each layer is polynomial in that of any other layer), small perturbations of input vectors lead to large changes of outputs. This generalizes results of Daniely and Schacham (2020) for networks of rapidly decreasing width and of Bubeck et al (2021) for two-layer networks. The proof shows that adversarial examples arise in these networks because the functions that they compute are very close to linear. Bottleneck layers in the network play a key role: the minimal width up to some point in the network determines scales and sensitivities of mappings computed up to that point. The main result is for networks with constant depth, but we also show that some constraint on depth is necessary for a result of this kind, because there are suitably deep networks that, with constant probability, compute a function that is close to constant.

Related papers

Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer. In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z)
Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions [3.04585143845864]
We show that as networks get deeper and deeper, they are more susceptible to becoming degenerate. We use a simple algorithm that can accurately predict the level of degeneracy for any given fully connected ReLU network architecture.
arXiv Detail & Related papers (2023-06-02T13:02:52Z)
Feature-Learning Networks Are Consistent Across Widths At Realistic Scales [72.27228085606147]
We study the effect of width on the dynamics of feature-learning neural networks across a variety of architectures and datasets. Early in training, wide neural networks trained on online data have not only identical loss curves but also agree in their point-wise test predictions throughout training. We observe, however, that ensembles of narrower networks perform worse than a single wide network.
arXiv Detail & Related papers (2023-05-28T17:09:32Z)
Depth Degeneracy in Neural Networks: Vanishing Angles in Fully Connected ReLU Networks on Initialization [5.678271181959529]
We study the evolution of the angle between two inputs to a ReLU neural network as a function of the number of layers. We validate our theoretical results with Monte Carlo experiments and show that our results accurately approximate finite network behaviour. We also empirically investigate how the depth degeneracy phenomenon can negatively impact training of real networks.
arXiv Detail & Related papers (2023-02-20T01:30:27Z)
Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. We show that linear networks make provably optimal predictions at infinite depth. We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z)
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance limited regime as a function of sample size $P$ and network width $N$. We find that finite-size effects can become relevant for very small datasets on the order of $P* sim sqrtN$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z)
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons. Our result may already hold for mild over- parameterization, where the width is $tildemathcalO(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
Width is Less Important than Depth in ReLU Neural Networks [40.83290846983707]
We show that any target network with inputs in $mathbbRd$ can be approximated by a width $O(d)$ network. We extend our results to constructing networks with bounded weights, and to constructing networks with width at most $d+2$.
arXiv Detail & Related papers (2022-02-08T13:07:22Z)
Doubly infinite residual neural networks: a diffusion process approach [8.642603456626393]
We show that deep ResNets do not suffer from undesirable forward-propagation properties. We focus on doubly infinite fully-connected ResNets, for which we consider i.i.d. Our results highlight a limited expressive power of doubly infinite ResNets when the unscaled network's parameters are i.i.d. and the residual blocks are shallow.
arXiv Detail & Related papers (2020-07-07T07:45:34Z)
Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs. Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
Quasi-Equivalence of Width and Depth of Neural Networks [10.365556153676538]
We investigate if the design of artificial neural networks should have a directional preference. Inspired by the De Morgan law, we establish a quasi-equivalence between the width and depth of ReLU networks. Based on our findings, a deep network has a wide equivalent, subject to an arbitrarily small error.
arXiv Detail & Related papers (2020-02-06T21:17:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.