Expressive Power of ReLU and Step Networks under Floating-Point Operations
- URL: http://arxiv.org/abs/2401.15121v2
- Date: Tue, 16 Jul 2024 01:00:31 GMT
- Title: Expressive Power of ReLU and Step Networks under Floating-Point Operations
- Authors: Yeachan Park, Geonho Hwang, Wonyeol Lee, Sejun Park
- Abstract summary: We show that neural networks using a binary threshold unit or ReLU can memorize any finite set of input/output pairs.
We also show similar results on memorization and universal approximation when floating-point operations use finite bits for both significand and exponent.
- Score: 11.29958155597398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of the expressive power of neural networks has investigated the fundamental limits of neural networks. Most existing results assume real-valued inputs and parameters as well as exact operations during the evaluation of neural networks. However, neural networks are typically executed on computers that can only represent a tiny subset of the reals and apply inexact operations, i.e., most existing results do not apply to neural networks used in practice. In this work, we analyze the expressive power of neural networks under a more realistic setup: when we use floating-point numbers and operations as in practice. Our first set of results assumes floating-point operations where the significand of a float is represented by finite bits but its exponent can take any integer value. Under this setup, we show that neural networks using a binary threshold unit or ReLU can memorize any finite set of input/output pairs and can approximate any continuous function within an arbitrary error. In particular, the number of parameters in our constructions for universal approximation and memorization coincides with that in classical results assuming exact mathematical operations. We also show similar results on memorization and universal approximation when floating-point operations use finite bits for both significand and exponent; these results are applicable to many popular floating-point formats such as those defined in the IEEE 754 standard (e.g., 32-bit single-precision format) and bfloat16.
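As a concrete illustration of this setting (a sketch only, not the paper's construction), the snippet below builds a one-hidden-layer binary-threshold network that memorizes a finite set of scalar input/output pairs while performing every operation in IEEE 754 single precision; the specific construction and all names are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only (not the construction from the paper): a
# one-hidden-layer binary-threshold network that memorizes distinct
# scalar input/output pairs, with every operation done in float32.

def step(z):
    # Binary threshold unit: 1.0 if z >= 0, else 0.0.
    return (z >= 0).astype(np.float32)

def build_memorizer(xs, ys):
    xs = np.asarray(xs, dtype=np.float32)
    ys = np.asarray(ys, dtype=np.float32)
    order = np.argsort(xs)          # sort the pairs by input value
    xs, ys = xs[order], ys[order]
    biases = -xs                    # unit i fires iff x >= xs[i]
    # Telescoping output weights: the increments of all units that fire sum to ys[i].
    deltas = np.diff(ys, prepend=np.float32(0.0))

    def f(x):
        h = step(np.float32(x) + biases)   # float32 additions
        return np.float32(deltas @ h)      # float32 accumulation
    return f

f = build_memorizer([-2.0, 1.25, -0.5, 3.0], [1.0, 0.5, -1.0, 2.0])
print(float(f(1.25)))  # 0.5; in general, the rounding of this float32
                       # accumulation is exactly what a rigorous
                       # construction has to control.
```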
Related papers
- Predictions Based on Pixel Data: Insights from PDEs and Finite Differences [0.0]
This paper deals with approximation of time sequences where each observation is a matrix.
We show that with relatively small networks, we can represent exactly a class of numerical discretizations of PDEs based on the method of lines.
Our network architecture is inspired by those typically adopted in the approximation of time sequences.
arXiv Detail & Related papers (2023-05-01T08:54:45Z) - Memorization Capacity of Neural Networks with Conditional Computation [14.048989759890475]
We show that memorizing a collection of $n$ input-output relationships can be accomplished via a neural network with $O(\sqrt{n})$ neurons.
Using a conditional ReLU network, we show that the same task can be accomplished using only $O(\log n)$ operations per input.
arXiv Detail & Related papers (2023-03-20T16:33:17Z) - Provable Data Subset Selection For Efficient Neural Network Training [73.34254513162898]
We introduce the first algorithm to construct coresets for RBFNNs, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network.
We then perform empirical evaluations on function approximation and dataset subset selection on popular network architectures and data sets.
arXiv Detail & Related papers (2023-03-09T10:08:34Z) - Fixed-Point Code Synthesis For Neural Networks [0.0]
A new technique is introduced to tune the formats (precision) of already trained neural networks using fixed-point arithmetic.
The optimized network computes its output with fixed-point numbers while keeping the accuracy within a user-specified threshold (a generic fixed-point sketch appears after this list).
arXiv Detail & Related papers (2022-02-04T12:02:54Z) - Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z) - The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Deep Neural Network Training without Multiplications [0.0]
We show that ResNet can be trained using this operation with competitive classification accuracy.
This method will enable eliminating the multiplications in deep neural-network training and inference.
arXiv Detail & Related papers (2020-12-07T05:40:50Z) - NITI: Training Integer Neural Networks Using Integer-only Arithmetic [4.361357921751159]
We present NITI, an efficient deep neural network training framework that computes exclusively with integer arithmetic.
A proof-of-concept open-source software implementation of NITI that utilizes native 8-bit integer operations is presented.
NITI achieves negligible accuracy degradation on the MNIST and CIFAR10 datasets using 8-bit integer storage and computation.
arXiv Detail & Related papers (2020-09-28T07:41:36Z) - Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We find that the accuracy decline is due to activation quantization and therefore replace the conventional ReLU with a Bounded ReLU (a minimal quantization sketch appears after this list).
Our integer networks match the performance of the corresponding floating-point networks while using only 1/4 of the memory and running 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z) - Neural Arithmetic Units [84.65228064780744]
Neural networks can approximate complex functions, but they struggle to perform exact arithmetic operations over real numbers.
We present two new neural network components: the Neural Addition Unit (NAU), which can learn exact addition and subtraction, and the Neural Multiplication Unit (NMU), which can multiply subsets of a vector.
Our proposed units NAU and NMU, compared with previous neural units, converge more consistently, have fewer parameters, learn faster, can converge for larger hidden sizes, obtain sparse and meaningful weights, and can extrapolate to negative and small values.
arXiv Detail & Related papers (2020-01-14T19:35:04Z)
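For intuition about the units described in the last entry, here is a simplified, unofficial sketch of NAU- and NMU-style forward passes; the weights are hand-picked for the example rather than learned, and the multiplicative gating form used for the NMU is an assumption, not taken from the summary above.

```python
import numpy as np

# Simplified, unofficial sketch of forward passes in the spirit of the
# NAU and NMU described above; weight values are hand-picked for the
# example rather than learned with a regularizer.

def nau_forward(W, x):
    # Neural Addition Unit: a plain linear map whose weights are meant
    # to end up close to {-1, 0, 1}, giving exact adds/subtracts.
    return W @ x

def nmu_forward(W, x):
    # Neural Multiplication Unit: each output multiplies a gated subset
    # of the inputs, z_j = prod_i (W[j, i] * x[i] + 1 - W[j, i]),
    # with W[j, i] in [0, 1] (1 selects x[i], 0 ignores it).
    return np.prod(W * x + (1.0 - W), axis=1)

x = np.array([2.0, 3.0, 5.0])
W_add = np.array([[1.0, 1.0, -1.0]])   # computes x0 + x1 - x2
W_mul = np.array([[1.0, 1.0, 0.0]])    # computes x0 * x1
print(nau_forward(W_add, x))           # [0.]
print(nmu_forward(W_mul, x))           # [6.]
```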
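The fixed-point entry above is easiest to see with a generic example (a sketch of the general idea, not that paper's synthesis procedure): store values as integers scaled by a power of two and evaluate a layer with integer multiply-accumulates only; the format and layer here are assumptions for illustration.

```python
import numpy as np

# Generic fixed-point illustration: values are stored as integers scaled
# by 2**FRAC_BITS, and a dense layer is evaluated with integer
# multiply-accumulate only.

FRAC_BITS = 8                       # assumed Q*.8 format for the example
SCALE = 1 << FRAC_BITS

def to_fixed(a):
    return np.round(np.asarray(a) * SCALE).astype(np.int64)

def from_fixed(q):
    return q.astype(np.float64) / SCALE

def dense_fixed(Wq, bq, xq):
    # The integer matmul produces values scaled by SCALE**2; shift back once.
    acc = Wq @ xq                   # int64 multiply-accumulate
    return (acc >> FRAC_BITS) + bq  # rescale the products, then add the bias

W = np.array([[0.5, -0.25], [1.0, 0.75]])
b = np.array([0.1, -0.2])
x = np.array([0.3, 0.6])

y_fixed = from_fixed(dense_fixed(to_fixed(W), to_fixed(b), to_fixed(x)))
print(y_fixed, W @ x + b)           # agree up to the chosen precision
```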
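Finally, a minimal sketch of the Bounded ReLU plus activation-quantization idea from the integer-arithmetic-only CNN entry (the bound and bit width are assumptions for illustration): clipping the activation range makes the uniform quantization step fixed and known.

```python
import numpy as np

# Minimal sketch: clipping activations to [0, BOUND] makes the uniform
# 8-bit quantization step a fixed, known size (bound and bit width are
# assumptions for this example, not taken from the paper).

BOUND = 6.0                      # assumed clipping bound (ReLU6-style)
LEVELS = 255                     # unsigned 8-bit activation levels
STEP = BOUND / LEVELS            # fixed quantization step

def bounded_relu(x):
    return np.clip(x, 0.0, BOUND)

def quantize_activation(x):
    # Map clipped activations to uint8 codes on a uniform grid.
    return np.round(bounded_relu(x) / STEP).astype(np.uint8)

def dequantize_activation(q):
    return q.astype(np.float64) * STEP

x = np.array([-1.0, 0.5, 3.2, 9.0])
q = quantize_activation(x)
print(q, dequantize_activation(q))   # 9.0 is clipped to 6.0; the rest round to the grid
```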
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.