Training ReLU networks to high uniform accuracy is intractable
- URL: http://arxiv.org/abs/2205.13531v1
- Date: Thu, 26 May 2022 17:50:55 GMT
- Title: Training ReLU networks to high uniform accuracy is intractable
- Authors: Julius Berner, Philipp Grohs, Felix Voigtlaender
- Abstract summary: We quantify the number of training samples needed for any conceivable training algorithm to guarantee a given uniform accuracy.
We conclude that the training of ReLU neural networks to high uniform accuracy is intractable.
- Score: 7.723983475351976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical learning theory provides bounds on the necessary number of
training samples needed to reach a prescribed accuracy in a learning problem
formulated over a given target class. This accuracy is typically measured in
terms of a generalization error, that is, an expected value of a given loss
function. However, for several applications -- for example in a
security-critical context or for problems in the computational sciences --
accuracy in this sense is not sufficient. In such cases, one would like to have
guarantees for high accuracy on every input value, that is, with respect to the
uniform norm. In this paper we precisely quantify the number of training
samples needed for any conceivable training algorithm to guarantee a given
uniform accuracy on any learning problem formulated over target classes
containing (or consisting of) ReLU neural networks of a prescribed
architecture. We prove that, under very general assumptions, the minimal number
of training samples for this task scales exponentially both in the depth and
the input dimension of the network architecture. As a corollary we conclude
that the training of ReLU neural networks to high uniform accuracy is
intractable. In a security-critical context this points to the fact that deep
learning based systems are prone to being fooled by a possible adversary. We
corroborate our theoretical findings by numerical results.
Related papers
- A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets.
We train a small parametric corrector network that adjusts stale cached target embeddings.
Our approach matches state-of-the-art results even when no target embedding updates are made during training.
arXiv Detail & Related papers (2024-09-03T13:29:13Z) - The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider classical distribution-agnostic framework and algorithms minimising empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z) - Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z) - Efficient Uncertainty Quantification and Reduction for
Over-Parameterized Neural Networks [23.7125322065694]
Uncertainty quantification (UQ) is important for reliability assessment and enhancement of machine learning models.
We create statistically guaranteed schemes to principally emphcharacterize, and emphremove, the uncertainty of over- parameterized neural networks.
In particular, our approach, based on what we call a procedural-noise-correcting (PNC) predictor, removes the procedural uncertainty by using only emphone auxiliary network that is trained on a suitably labeled dataset.
arXiv Detail & Related papers (2023-06-09T05:15:53Z) - Bridging Precision and Confidence: A Train-Time Loss for Calibrating
Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accurateness of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z) - Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - LCS: Learning Compressible Subspaces for Adaptive Network Compression at
Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z) - A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs [11.152761263415046]
This paper focuses on understanding how the generalization error scales with the amount of the training data for deep neural networks (DNNs)
We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures.
arXiv Detail & Related papers (2021-05-05T05:14:08Z) - Scalable Verification of Quantized Neural Networks (Technical Report) [14.04927063847749]
We show that bit-exact implementation of quantized neural networks with bit-vector specifications is PSPACE-hard.
We propose three techniques for making SMT-based verification of quantized neural networks more scalable.
arXiv Detail & Related papers (2020-12-15T10:05:37Z) - Confidence-Aware Learning for Deep Neural Networks [4.9812879456945]
We propose a method of training deep neural networks with a novel loss function, named Correctness Ranking Loss.
It regularizes class probabilities explicitly to be better confidence estimates in terms of ordinal ranking according to confidence.
It has almost the same computational costs for training as conventional deep classifiers and outputs reliable predictions by a single inference.
arXiv Detail & Related papers (2020-07-03T02:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.