Approximation speed of quantized vs. unquantized ReLU neural networks
and beyond
- URL: http://arxiv.org/abs/2205.11874v1
- Date: Tue, 24 May 2022 07:48:12 GMT
- Title: Approximation speed of quantized vs. unquantized ReLU neural networks
and beyond
- Authors: Antoine Gonon (DANTE, ARIC), Nicolas Brisebarre (ARIC), Rémi Gribonval (DANTE), Elisa Riccietti (DANTE)
- Abstract summary: We consider general approximation families encompassing ReLU neural networks.
We use $\infty$-encodability to guarantee that ReLU networks can be uniformly quantized.
We also prove that ReLU networks share a common limitation with many other approximation families.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider general approximation families encompassing ReLU neural networks.
On the one hand, we introduce a new property, that we call
$\infty$-encodability, which lays a framework that we use (i) to guarantee that
ReLU networks can be uniformly quantized and still have approximation speeds
comparable to unquantized ones, and (ii) to prove that ReLU networks share a
common limitation with many other approximation families: the approximation
speed of a set C is bounded from above by an encoding complexity of C (a
complexity well-known for many C's). The property of $\infty$-encodability
allows us to unify and generalize known results in which it was implicitly
used. On the other hand, we give lower and upper bounds on the Lipschitz
constant of the mapping that associates the weights of a network to the
function they represent in $L^p$. It is expressed in terms of the width, the depth of
the network, and a bound on the weights' norm, and it is based on well-known
upper bounds on the Lipschitz constants of the functions represented by ReLU
networks. This allows us to recover known results, to establish new bounds on
covering numbers, and to characterize the accuracy of naive uniform
quantization of ReLU networks.
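As a purely illustrative companion to the last point of the abstract, the sketch below applies naive uniform quantization (rounding every weight and bias to a grid of a given step) to a small random ReLU network and estimates the $L^p$ deviation between the quantized and unquantized realizations; the architecture, the grid steps, the sampling of $[0,1]$ and the use of NumPy are assumptions made for the demo, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-hidden-layer ReLU network f_theta: [0, 1] -> R (widths are assumptions).
widths = [1, 16, 16, 1]
params = [(rng.standard_normal((m, n)) / np.sqrt(n), rng.standard_normal(m))
          for n, m in zip(widths[:-1], widths[1:])]

def realize(params, x):
    """Evaluate the ReLU network on a batch of scalar inputs x of shape (N,)."""
    h = x[:, None]
    for i, (W, b) in enumerate(params):
        h = h @ W.T + b
        if i < len(params) - 1:          # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h[:, 0]

def quantize(params, step):
    """Naive uniform quantization: round every weight and bias to the grid step * Z."""
    return [(np.round(W / step) * step, np.round(b / step) * step) for W, b in params]

# Empirical L^p distance between the unquantized and quantized realizations on [0, 1].
x = np.linspace(0.0, 1.0, 10_000)
p = 2
for step in (1e-1, 1e-2, 1e-3):
    diff = realize(params, x) - realize(quantize(params, step), x)
    print(f"step {step:.0e}: empirical L^{p} deviation ~ {np.mean(np.abs(diff) ** p) ** (1 / p):.2e}")
```

If the abstract's Lipschitz bound on the weights-to-function map is taken at face value, the deviation is at most the quantization step times a factor depending on the width, the depth and the bound on the weights' norm, so the printed errors should shrink roughly linearly with the step.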
Related papers
- Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression [4.297070083645049]
We develop tight (up to a multiplicative constant) lower and upper bounds on the covering numbers of fully-connected networks.
Thanks to the tightness of the bounds, a fundamental understanding of the impact of sparsity, quantization, bounded vs. unbounded weights, and network output truncation can be developed.
arXiv Detail & Related papers (2024-10-08T21:23:14Z) - Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces [0.0]
We consider the approximation of a large class of bounded functions, with minimal regularity assumptions, by ReLU neural networks.
We show that the approximation error can be bounded from above by a quantity proportional to the uniform norm of the target function.
arXiv Detail & Related papers (2024-05-10T14:31:58Z) - Three Quantization Regimes for ReLU Networks [3.823356975862005]
We establish the fundamental limits in the approximation of Lipschitz functions by deep ReLU neural networks with finite-precision weights.
In the proper-quantization regime, neural networks exhibit memory-optimality in the approximation of Lipschitz functions.
arXiv Detail & Related papers (2024-05-03T09:27:31Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Training Certifiably Robust Neural Networks with Efficient Local
Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z) - Analytical bounds on the local Lipschitz constants of ReLU networks [0.0]
We do so by deriving Lipschitz constants and bounds for ReLU, affine-ReLU, and max pooling functions.
Our method produces the largest known bounds on minimum adversarial perturbations for large networks such as AlexNet and VGG-16.
arXiv Detail & Related papers (2021-04-29T21:57:47Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural
Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks with quadratic widths in the sample size and linear in their depth at a time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Analytical bounds on the local Lipschitz constants of affine-ReLU
functions [0.0]
We mathematically determine upper bounds on the local Lipschitz constant of an affine-ReLU function.
We show how these bounds can be combined to determine a bound on an entire network (a generic product-of-norms sketch appears after this list).
We show several examples by applying our results to AlexNet, as well as several smaller networks based on the MNIST and CIFAR-10 datasets.
arXiv Detail & Related papers (2020-08-14T00:23:21Z) - Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
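The two entries above on analytical Lipschitz bounds describe how per-layer bounds are combined into a bound for an entire network. As a point of reference only, here is a minimal sketch of the classical (and generally loose) combination, namely the product of layer operator norms; the random weights and the choice of the spectral norm are assumptions for illustration, and this is not the tighter local bound developed in those papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative weight matrices (widths are assumptions chosen only for the demo).
widths = [8, 32, 32, 1]
weights = [rng.standard_normal((m, n)) / np.sqrt(n)
           for n, m in zip(widths[:-1], widths[1:])]

def product_lipschitz_bound(weights):
    """Crude global Lipschitz upper bound: product of layer spectral norms.

    Valid because ReLU is 1-Lipschitz and biases do not change Lipschitz
    constants; this is the baseline that local/analytical bounds improve on.
    """
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))

print("product-of-spectral-norms bound:", product_lipschitz_bound(weights))
```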