The mathematics of adversarial attacks in AI -- Why deep learning is
unstable despite the existence of stable neural networks
- URL: http://arxiv.org/abs/2109.06098v1
- Date: Mon, 13 Sep 2021 16:19:25 GMT
- Title: The mathematics of adversarial attacks in AI -- Why deep learning is
unstable despite the existence of stable neural networks
- Authors: Alexander Bastounis, Anders C Hansen, Verner Vlačić
- Abstract summary: We prove that any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate).
The key is that stable and accurate neural networks must have dimensions that vary with the input; in particular, variable dimensions are a necessary condition for stability.
Our result points towards the paradox that accurate and stable neural networks exist; however, modern algorithms do not compute them.
- Score: 69.33657875725747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The unprecedented success of deep learning (DL) makes it unchallenged when it
comes to classification problems. However, it is well established that the
current DL methodology produces universally unstable neural networks (NNs). The
instability problem has caused an enormous research effort -- with a vast
literature on so-called adversarial attacks -- yet there has been no solution
to the problem. Our paper addresses why there has been no solution to the
problem, as we prove the following mathematical paradox: any training procedure
based on training neural networks for classification problems with a fixed
architecture will yield neural networks that are either inaccurate or unstable
(if accurate) -- despite the provable existence of both accurate and stable
neural networks for the same classification problems. The key is that stable and accurate neural networks must have dimensions that vary with the input; in particular, variable dimensions are a necessary condition for stability.
Our result points towards the paradox that accurate and stable neural networks exist; however, modern algorithms do not compute them. This yields the
question: if the existence of neural networks with desirable properties can be
proven, can one also find algorithms that compute them? There are cases in
mathematics where provable existence implies computability, but will this be
the case for neural networks? The contrary is true, as we demonstrate how
neural networks can provably exist as approximate minimisers to standard
optimisation problems with standard cost functions; however, no randomised algorithm can compute them with probability better than 1/2.
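Schematically, the dichotomy in the abstract can be read as follows. This is an informal paraphrase, not the paper's precise theorem; the accuracy notion and the perturbation bound epsilon are placeholders:

```latex
% Informal schematic of the dichotomy described above; the accuracy notion
% and the perturbation bound \epsilon are placeholders, not the paper's constants.
\[
  \varphi \in \mathcal{NN}_{\mathbf{N}}\ (\text{fixed dimensions})
  \ \text{and}\ \varphi\ \text{accurate}
  \;\Longrightarrow\;
  \exists\, x,\ \exists\, e\ \text{with}\ \|e\|\le\epsilon
  \ \text{such that}\ \varphi(x+e)\neq\varphi(x),
\]
```

whereas stable and accurate networks do exist for the same problems once the dimensions may depend on the input.

A concrete feel for the instability phenomenon can be had with a toy experiment. The sketch below is an illustration under simple synthetic-data assumptions (all parameters are arbitrary choices), not the paper's construction: a fixed-dimension classifier fitted to near-perfect accuracy is flipped on almost every input by a coordinate-wise perturbation smaller than the data's own noise level.

```python
# A minimal numpy sketch (illustration only, not the paper's construction):
# train an accurate classifier of fixed dimensions on well-separated synthetic
# data, then apply a small sign-gradient (FGSM-style) perturbation and watch
# the accuracy collapse -- the instability the abstract describes.
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated classes in 100 dimensions: means +/-0.2 per coordinate,
# noise standard deviation 0.5.
d, n = 100, 200
X = np.vstack([rng.normal(+0.2, 0.5, (n, d)), rng.normal(-0.2, 0.5, (n, d))])
y = np.hstack([np.ones(n), -np.ones(n)])

# Fit a fixed-dimension model (logistic regression) by plain gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(2000):
    margins = y * (X @ w + b)
    g = -y / (1.0 + np.exp(margins))   # d(logistic loss)/d(margin)
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

print("clean accuracy:", np.mean(np.sign(X @ w + b) == y))

# Worst-case l_inf perturbation of size eps: each coordinate moves by less
# than the per-coordinate noise level, yet the score shifts by eps * ||w||_1,
# enough to push almost every point across the decision boundary.
eps = 0.3
X_adv = X - eps * np.sign(w) * y[:, None]
print("accuracy after perturbation:", np.mean(np.sign(X_adv @ w + b) == y))
```

The contrast between the two printed accuracies is the point the abstract makes: accuracy obtained with a fixed architecture offers no protection against small worst-case perturbations.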
Related papers
- Verified Neural Compressed Sensing [58.98637799432153]
We develop the first (to the best of our knowledge) provably correct neural networks for a precise computational task.
We show that for modest problem dimensions (up to 50), we can train neural networks that provably recover a sparse vector from linear and binarized linear measurements.
We show that the complexity of the network can be adapted to the problem difficulty and solve problems where traditional compressed sensing methods are not known to provably work.
arXiv Detail & Related papers (2024-05-07T12:20:12Z)
- Message Passing Variational Autoregressive Network for Solving Intractable Ising Models [6.261096199903392]
Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks.
Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables.
The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures.
arXiv Detail & Related papers (2024-04-09T11:27:07Z)
- The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider the classical distribution-agnostic framework and algorithms that minimise empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework of neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem, rendering the two classes linearly separable, with high probability (see the illustrative sketch after this list).
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Can stable and accurate neural networks be computed? -- On the barriers of deep learning and Smale's 18th problem [0.5801044612920815]
Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force.
DL suffers from a universal phenomenon: instability, despite universal approximation properties that often guarantee the existence of stable neural networks (NNs).
We show that there does not exist any algorithm, even randomised, that can train (or compute) such a NN.
We introduce Fast Iterative REstarted NETworks (FIRENETs), which we prove and numerically verify are stable.
arXiv Detail & Related papers (2021-01-20T19:04:17Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV plays as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
- Bidirectionally Self-Normalizing Neural Networks [46.20979546004718]
We provide a rigorous result that shows, under mild conditions, how the vanishing/exploding gradients problem disappears with high probability if the neural networks have sufficient width.
Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through a new class of activation functions.
arXiv Detail & Related papers (2020-06-22T12:07:29Z)
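To illustrate the claim summarised under "The Separation Capacity of Random Neural Networks" above, the following sketch uses toy data with arbitrary width and bias ranges; it is an illustration, not the paper's construction or quantitative bounds. Two concentric classes that no linear classifier can separate in the plane become linearly separable after one wide random ReLU layer with standard Gaussian weights and uniformly distributed biases.

```python
# Minimal numpy sketch (illustration only): a linear fit fails on raw inputs
# but separates the same classes after a random ReLU feature map with
# standard Gaussian weights and uniformly distributed biases.
import numpy as np

rng = np.random.default_rng(1)

# Two classes on concentric circles (radii 1 and 2) with mild radial noise.
n = 500
theta = rng.uniform(0.0, 2.0 * np.pi, 2 * n)
radius = np.hstack([np.ones(n), 2.0 * np.ones(n)]) + rng.normal(0.0, 0.05, 2 * n)
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
y = np.hstack([-np.ones(n), np.ones(n)])

def linear_fit_accuracy(features, labels):
    """Least-squares linear classifier (with bias) and its training accuracy."""
    A = np.column_stack([features, np.ones(len(labels))])
    coef, *_ = np.linalg.lstsq(A, labels, rcond=None)
    return np.mean(np.sign(A @ coef) == labels)

# Linear classifier directly on the 2-D inputs: near chance level.
print("raw inputs:          ", linear_fit_accuracy(X, y))

# One random ReLU layer: Gaussian weights, uniform biases (arbitrary width).
width = 300
W = rng.normal(0.0, 1.0, (2, width))
b = rng.uniform(-3.0, 3.0, width)
H = np.maximum(X @ W + b, 0.0)

# The same linear fit on the random features: typically (near-)perfect.
print("random ReLU features:", linear_fit_accuracy(H, y))
```

The contrast between the two printed accuracies is the point: the random feature map, not the linear fit, does the separating.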