Related papers: Most Convolutional Networks Suffer from Small Adversarial Perturbations

URL: http://arxiv.org/abs/2602.03415v1
Date: Tue, 03 Feb 2026 11:42:55 GMT
Title: Most Convolutional Networks Suffer from Small Adversarial Perturbations
Authors: Amit Daniely, Idan Mehalel,
Abstract summary: Recent work establishes that adversarial examples can be found in CNNs, in some non-optimal distance from the input. We prove that adversarial examples in random CNNs with input dimension $d$ can be found already in $\ell_2$-distance of order $\lVert x \rVert / \sqrt{d}$ from the input $x$. We also show that such adversarial small perturbations can be found using a single step of gradient descent.
Score: 10.828616610785524
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The existence of adversarial examples is relatively well understood for random fully connected neural networks, but much less so for convolutional neural networks (CNNs). The recent work [Daniely, 2025] establishes that adversarial examples can be found in CNNs, at some non-optimal distance from the input. We extend this work and prove that adversarial examples in random CNNs with input dimension $d$ can be found already at $\ell_2$-distance of order $\lVert x \rVert /\sqrt{d}$ from the input $x$, which is essentially the nearest possible. We also show that such small adversarial perturbations can be found using a single step of gradient descent. To derive our results we use Fourier decomposition to efficiently bound the singular values of a random linear convolutional operator, which is the main ingredient of a CNN layer. This bound might be of independent interest.
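
The Fourier idea mentioned in the abstract can be illustrated in its simplest setting. The sketch below is not the paper's construction: it assumes a single-channel, one-dimensional convolution with circular (wrap-around) boundary and a random Gaussian kernel as long as the input, and the names `circulant_matrix` and `singular_values_via_fft` are hypothetical. Under those assumptions the operator is a circulant matrix, and its singular values are exactly the magnitudes of the discrete Fourier transform of the kernel, so they can be read off without a dense SVD.

```python
import numpy as np

def circulant_matrix(kernel):
    """Dense matrix of circular convolution: entry (i, j) is kernel[(i - j) mod d]."""
    d = len(kernel)
    return np.stack([np.roll(kernel, j) for j in range(d)], axis=1)

def singular_values_via_fft(kernel):
    """Singular values of the circulant operator, sorted in descending order."""
    return np.sort(np.abs(np.fft.fft(kernel)))[::-1]

rng = np.random.default_rng(0)
d = 64  # input dimension (illustrative choice)
kernel = rng.standard_normal(d) / np.sqrt(d)  # random Gaussian kernel (assumed scaling)

C = circulant_matrix(kernel)
sv_dense = np.linalg.svd(C, compute_uv=False)  # O(d^3) reference computation
sv_fft = singular_values_via_fft(kernel)       # O(d log d) via the DFT

# Largest disagreement between the two computations (floating-point noise only).
max_gap = float(np.max(np.abs(sv_dense - sv_fft)))

# Sanity check that C really implements circular convolution.
x = rng.standard_normal(d)
conv_fft = np.real(np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)))
```

The multi-channel random operators analyzed in the paper are more involved; this toy case only shows why a Fourier decomposition gives direct access to the spectrum of a convolution.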

Related papers

Neural Networks Learn Generic Multi-Index Models Near Information-Theoretic Limit [66.20349460098275]
We study the gradient descent learning of a general Gaussian Multi-index model $f(\boldsymbol{x})=g(\boldsymbol{U}\boldsymbol{x})$ with hidden subspace $\boldsymbol{U}\in \mathbb{R}^{r\times d}$. We prove that under generic non-degenerate assumptions on the link function, a standard two-layer neural network trained via layer-wise gradient descent can agnostically learn the target with $o_d(1)$ test error.
arXiv Detail & Related papers (2025-11-19T04:46:47Z)
Bayesian Neural Networks: A Min-Max Game Framework [1.8032347672439046]
In deep learning, Bayesian neural networks (BNN) play the role of robustness analysis. We study a conservative BNN with the minimax method and formulate a two-player game between a deterministic neural network $f$ and a closed-loop neural network $f + r\xi$.
arXiv Detail & Related papers (2023-11-18T17:17:15Z)
Theoretical Analysis of Inductive Biases in Deep Convolutional Networks [16.41952363194339]
We provide a theoretical analysis of the inductive biases in convolutional neural networks (CNNs). We compare the performance of CNNs, locally-connected networks (LCNs), and fully-connected networks (FCNs) on a simple regression task. We prove that LCNs require $\Omega(d)$ samples while CNNs need only $\widetilde{\mathcal{O}}(\log^2 d)$ samples, highlighting the critical role of weight sharing.
arXiv Detail & Related papers (2023-05-15T07:40:07Z)
Generalization and Stability of Interpolating Neural Networks with Minimal Width [37.908159361149835]
We investigate the generalization and optimization of shallow neural networks trained by gradient descent in the interpolating regime. With $m=\Omega(\log^4(n))$ neurons and $T\approx n$, we bound the test loss by $\tilde{O}(1/)$.
arXiv Detail & Related papers (2023-02-18T05:06:15Z)
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes [75.59720049837459]
We study the transition from infinite-width behavior to the variance-limited regime as a function of sample size $P$ and network width $N$. We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z)
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis [121.9821494461427]
We show how to significantly reduce the number of neurons required for two-layer ReLU networks. We also prove new lower bounds that improve upon prior work, and that under certain assumptions, are best possible.
arXiv Detail & Related papers (2022-06-26T06:51:31Z)
The Rate of Convergence of Variation-Constrained Deep Neural Networks [35.393855471751756]
We show that a class of variation-constrained neural networks can achieve near-parametric rate $n^{-1/2+\delta}$ for an arbitrarily small constant $\delta$. The result indicates that the neural function space needed for approximating smooth functions may not be as large as what is often perceived.
arXiv Detail & Related papers (2021-06-22T21:28:00Z)
Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on an over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z)
Approximating smooth functions by deep neural networks with sigmoid activation function [0.0]
We study the power of deep neural networks (DNNs) with sigmoid activation function. We show that DNNs with fixed depth and a width of order $M^d$ achieve an approximation rate of $M^{-2p}$.
arXiv Detail & Related papers (2020-10-08T07:29:31Z)
Shuffling Recurrent Neural Networks [97.72614340294547]
We propose a novel recurrent neural network model, where the hidden state $h_t$ is obtained by permuting the vector elements of the previous hidden state $h_{t-1}$. In our model, the prediction is given by a second learned function, which is applied to the hidden state $s(h_t)$.
arXiv Detail & Related papers (2020-07-14T19:36:10Z)
Towards Understanding Hierarchical Learning: Benefits of Neural Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks. We show that neural representation can achieve improved sample complexities compared with the raw input. Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z)
Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes. We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.