On the stability of deep convolutional neural networks under irregular
or random deformations
- URL: http://arxiv.org/abs/2104.11977v1
- Date: Sat, 24 Apr 2021 16:16:30 GMT
- Title: On the stability of deep convolutional neural networks under irregular
or random deformations
- Authors: Fabio Nicola and S. Ivan Trapasso
- Abstract summary: Robustness under location deformations for deep convolutional neural networks (DCNNs) is of great theoretical and practical interest.
Here we address this issue for any field $\tau\in L^\infty(\mathbb{R}^d;\mathbb{R}^d)$, without any additional regularity assumption.
We prove that for signals in multiresolution approximation spaces $U_s$ at scale $s$, stability in $L^2$ holds in the regime $\|\tau\|_{L^\infty}/s\ll 1$.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of robustness under location deformations for deep convolutional
neural networks (DCNNs) is of great theoretical and practical interest. This
issue has been studied in pioneering works, especially for scattering-type
architectures, for deformation vector fields $\tau(x)$ with some regularity -
at least $C^1$. Here we address this issue for any field $\tau\in
L^\infty(\mathbb{R}^d;\mathbb{R}^d)$, without any additional regularity
assumption, hence including the case of wild irregular deformations such as a
noise on the pixel location of an image. We prove that for signals in
multiresolution approximation spaces $U_s$ at scale $s$, whenever the network
is Lipschitz continuous (regardless of its architecture), stability in $L^2$
holds in the regime $\|\tau\|_{L^\infty}/s\ll 1$, essentially as a consequence
of the uncertainty principle. When $\|\tau\|_{L^\infty}/s\gg 1$ instability can
occur even for well-structured DCNNs such as the wavelet scattering networks,
and we provide a sharp upper bound for the asymptotic growth rate. The
stability results are then extended to signals in the Besov space
$B^{d/2}_{2,1}$ tailored to the given multiresolution approximation. We also
consider the case of more general time-frequency deformations. Finally, we
provide stochastic versions of the aforementioned results, namely we study the
issue of stability in mean when $\tau(x)$ is modeled as a random field (not
bounded, in general) with identically distributed variables $|\tau(x)|$,
$x\in\mathbb{R}^d$.
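The two regimes described in the abstract can be illustrated numerically with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration, not the authors' construction: the deformation is taken to act as $f(x) \mapsto f(x-\tau(x))$, the space $U_s$ is replaced by a hypothetical stand-in (a finite sum of cosines with frequencies at most $1/s$), $\tau$ is drawn independently at every sample point so that it has no regularity, and the helper names `make_signal` and `relative_deformation_error` are invented here.

```python
import numpy as np


def make_signal(s, n_terms=8):
    """Return f(t) = sum_k cos(w_k t) with all frequencies w_k <= 1/s.

    Hypothetical stand-in for a signal in U_s: it contains no
    oscillations finer than the scale s.
    """
    freqs = np.arange(1, n_terms + 1) / (n_terms * s)  # w_k = k / (K s)

    def f(t):
        t = np.asarray(t, dtype=float)
        # One row per frequency, summed over frequencies.
        return np.cos(freqs[:, None] * t[None, :]).sum(axis=0)

    return f


def relative_deformation_error(s, tau_max, n_points=4096, seed=0):
    """Discrete estimate of ||f(. - tau) - f|| / ||f|| in L^2.

    tau is sampled independently at each grid point, uniformly in
    [-tau_max, tau_max], so it is bounded but completely irregular.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 200.0 * s, n_points)
    f = make_signal(s)
    tau = rng.uniform(-tau_max, tau_max, size=n_points)
    return np.linalg.norm(f(x - tau) - f(x)) / np.linalg.norm(f(x))


if __name__ == "__main__":
    s = 1.0
    for ratio in (0.01, 0.1, 1.0, 10.0):
        err = relative_deformation_error(s, tau_max=ratio * s)
        print(f"||tau||_inf / s = {ratio:5.2f}  relative L2 error = {err:.4f}")
```

In this sketch the relative error stays small while `tau_max / s` is small and becomes of order one once `tau_max` exceeds the scale `s`. Since a Lipschitz map can enlarge an $L^2$ distance by at most its Lipschitz constant, the same quantity controls the change in the output of any Lipschitz network applied to the deformed signal.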
Related papers
- Conditional regression for the Nonlinear Single-Variable Model [4.565636963872865]
We consider a model $F(X):=f(\Pi_\gamma):\mathbb{R}^d\to[0,\mathrm{len}_\gamma]$ where $\Pi_\gamma: [0,\mathrm{len}_\gamma]\to\mathbb{R}^d$ and $f:[0,\mathrm{len}_\gamma]\to\mathbb{R}^1$.
We propose a nonparametric estimator, based on conditional regression, and show that it can achieve the one-dimensional optimal min-max rate.
arXiv Detail & Related papers (2024-11-14T18:53:51Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Generalization and Stability of Interpolating Neural Networks with
Minimal Width [37.908159361149835]
We investigate the generalization and optimization of shallow neural networks trained by gradient descent in the interpolating regime.
We prove training-loss minimization with $m=\Omega(\log^4 (n))$ neurons and $T\approx n$ iterations.
With $m=\Omega(\log^4 (n))$ neurons and $T\approx n$, we bound the test loss by $\tilde{O}(1/)$.
arXiv Detail & Related papers (2023-02-18T05:06:15Z) - Neural Networks Efficiently Learn Low-Dimensional Representations with
SGD [22.703825902761405]
We show that SGD-trained ReLU NNs can learn a single-index target of the form $y=f(\langle\boldsymbol{u},\boldsymbol{x}\rangle) + \epsilon$ by recovering the principal direction.
We also provide compression guarantees for NNs using the approximate low-rank structure produced by SGD.
arXiv Detail & Related papers (2022-09-29T15:29:10Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ of robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Large-time asymptotics in deep learning [0.0]
We consider the impact of the final time $T$ (which may indicate the depth of a corresponding ResNet) in training.
For the classical $L^2$-regularized empirical risk minimization problem, we show that the training error is at most of the order $\mathcal{O}\left(\frac{1}{T}\right)$.
In the setting of $ellp$--distance losses, we prove that both the training error and the optimal parameters are at most of the order $mathcalOleft(e-mu
arXiv Detail & Related papers (2020-08-06T07:33:17Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X,w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)