Related papers: Training ReLU networks to high uniform accuracy is intractable

Training ReLU networks to high uniform accuracy is intractable

URL: http://arxiv.org/abs/2205.13531v1
Date: Thu, 26 May 2022 17:50:55 GMT
Title: Training ReLU networks to high uniform accuracy is intractable
Authors: Julius Berner, Philipp Grohs, Felix Voigtlaender
Abstract summary: We quantify the number of training samples needed for any conceivable training algorithm to guarantee a given uniform accuracy. We conclude that the training of ReLU neural networks to high uniform accuracy is intractable.
Score: 7.723983475351976
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Statistical learning theory provides bounds on the necessary number of training samples needed to reach a prescribed accuracy in a learning problem formulated over a given target class. This accuracy is typically measured in terms of a generalization error, that is, an expected value of a given loss function. However, for several applications -- for example in a security-critical context or for problems in the computational sciences -- accuracy in this sense is not sufficient. In such cases, one would like to have guarantees for high accuracy on every input value, that is, with respect to the uniform norm. In this paper we precisely quantify the number of training samples needed for any conceivable training algorithm to guarantee a given uniform accuracy on any learning problem formulated over target classes containing (or consisting of) ReLU neural networks of a prescribed architecture. We prove that, under very general assumptions, the minimal number of training samples for this task scales exponentially both in the depth and the input dimension of the network architecture. As a corollary we conclude that the training of ReLU neural networks to high uniform accuracy is intractable. In a security-critical context this points to the fact that deep learning based systems are prone to being fooled by a possible adversary. We corroborate our theoretical findings by numerical results.

Related papers

A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks [81.2624272756733]
In dense retrieval, deep encoders provide embeddings for both inputs and targets. We train a small parametric corrector network that adjusts stale cached target embeddings. Our approach matches state-of-the-art results even when no target embedding updates are made during training.
arXiv Detail & Related papers (2024-09-03T13:29:13Z)
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider classical distribution-agnostic framework and algorithms minimising empirical risks. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z)
Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on intuition that neural operator learning inherently induces an approximation error. We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z)
Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks [23.7125322065694]
Uncertainty quantification (UQ) is important for reliability assessment and enhancement of machine learning models. We create statistically guaranteed schemes to principally emphcharacterize, and emphremove, the uncertainty of over- parameterized neural networks. In particular, our approach, based on what we call a procedural-noise-correcting (PNC) predictor, removes the procedural uncertainty by using only emphone auxiliary network that is trained on a suitably labeled dataset.
arXiv Detail & Related papers (2023-06-09T05:15:53Z)
Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accurateness of predictions. Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism. We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function. We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models. We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity. Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs [11.152761263415046]
This paper focuses on understanding how the generalization error scales with the amount of the training data for deep neural networks (DNNs) We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures.
arXiv Detail & Related papers (2021-05-05T05:14:08Z)
Scalable Verification of Quantized Neural Networks (Technical Report) [14.04927063847749]
We show that bit-exact implementation of quantized neural networks with bit-vector specifications is PSPACE-hard. We propose three techniques for making SMT-based verification of quantized neural networks more scalable.
arXiv Detail & Related papers (2020-12-15T10:05:37Z)
Confidence-Aware Learning for Deep Neural Networks [4.9812879456945]
We propose a method of training deep neural networks with a novel loss function, named Correctness Ranking Loss. It regularizes class probabilities explicitly to be better confidence estimates in terms of ordinal ranking according to confidence. It has almost the same computational costs for training as conventional deep classifiers and outputs reliable predictions by a single inference.
arXiv Detail & Related papers (2020-07-03T02:00:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.