Neural Power Units
- URL: http://arxiv.org/abs/2006.01681v4
- Date: Thu, 17 Dec 2020 14:40:50 GMT
- Title: Neural Power Units
- Authors: Niklas Heim, Tomáš Pevný, Václav Šmídl
- Abstract summary: We introduce the Neural Power Unit (NPU) that operates on the full domain of real numbers and is capable of learning arbitrary power functions in a single layer.
We show that the NPUs outperform their competitors in terms of accuracy and sparsity on artificial arithmetic datasets.
- Score: 1.7188280334580197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional Neural Networks can approximate simple arithmetic operations,
but fail to generalize beyond the range of numbers that were seen during
training. Neural Arithmetic Units aim to overcome this difficulty, but current
arithmetic units are either limited to operate on positive numbers or can only
represent a subset of arithmetic operations. We introduce the Neural Power Unit
(NPU) that operates on the full domain of real numbers and is capable of
learning arbitrary power functions in a single layer. The NPU thus fixes the
shortcomings of existing arithmetic units and extends their expressivity. We
achieve this by using complex arithmetic without requiring a conversion of the
network to complex numbers. A simplification of the unit to the RealNPU yields
a highly transparent model. We show that the NPUs outperform their competitors
in terms of accuracy and sparsity on artificial arithmetic datasets, and that
the RealNPU can discover the governing equations of a dynamical system only
from data.
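The mechanism described in the abstract can be sketched compactly. The NumPy snippet below is a minimal illustration, assuming the complex-logarithm formulation (each output is the real part of exp(W · log x) for a complex weight matrix W) together with a simple relevance gate g; the exact parameterization, initialization, and regularization used in the paper are not reproduced here.

```python
import numpy as np

def npu_forward(x, W_re, W_im, g):
    """Sketch of a Neural Power Unit forward pass (illustrative, not the
    paper's exact parameterization).

    x    : (n,)   real-valued inputs (may be negative)
    W_re : (m, n) real part of the complex weight matrix
    W_im : (m, n) imaginary part of the complex weight matrix
    g    : (n,)   relevance gate in [0, 1]; gated-off inputs act like 1
    """
    r = g * np.abs(x) + (1.0 - g)      # interpolate between |x| (relevant) and 1 (irrelevant)
    k = g * np.pi * (x < 0)            # imaginary part of log(x): pi for relevant negative inputs
    log_r = np.log(r)
    # Real part of exp((W_re + i*W_im) @ (log_r + i*k)).
    return np.exp(W_re @ log_r - W_im @ k) * np.cos(W_im @ log_r + W_re @ k)

def real_npu_forward(x, W, g):
    """RealNPU: the simplification with W_im = 0, so W can be read directly
    as the exponents of a product of powers."""
    r = g * np.abs(x) + (1.0 - g)
    k = g * np.pi * (x < 0)
    return np.exp(W @ np.log(r)) * np.cos(W @ k)

# Example: with W = [[2, -1]] and both inputs relevant, the single output
# is x0**2 / x1 (signs of negative inputs are handled by the cosine term).
x = np.array([3.0, 2.0])
W = np.array([[2.0, -1.0]])
g = np.ones(2)
print(real_npu_forward(x, W, g))   # ~ [4.5] == 3**2 / 2
```

Because the RealNPU weights are literally exponents, a trained weight matrix can be read off as an equation, which is what makes the model transparent enough for equation discovery.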
Related papers
- Reverse That Number! Decoding Order Matters in Arithmetic Learning [49.5504492920404]
Our work introduces a novel strategy that reevaluates the digit order by prioritizing output from the least significant digit.
Compared to the previous state-of-the-art (SOTA) method, our findings reveal an overall improvement in accuracy while requiring only a third of the tokens typically used during training.
arXiv Detail & Related papers (2024-03-09T09:04:53Z)
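As a concrete illustration of the least-significant-digit-first idea summarized above, the plain-Python helpers below (hypothetical names, not the paper's code) show how arithmetic targets can be serialized in reversed digit order so that a left-to-right decoder emits the carry-dependent low digits first.

```python
def to_reversed_digits(n: int) -> str:
    """Serialize an integer least-significant-digit first, e.g. 345 -> '543'."""
    return str(n)[::-1]

def from_reversed_digits(s: str) -> int:
    """Invert the encoding when reading a model's output back."""
    return int(s[::-1])

# A training target for "17 + 25" would be emitted as '24' (42 reversed),
# so the first generated token already reflects the carry from the units digit.
assert to_reversed_digits(17 + 25) == "24"
assert from_reversed_digits("24") == 42
```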
- Expressive Power of ReLU and Step Networks under Floating-Point Operations [11.29958155597398]
We show that neural networks using a binary threshold unit or ReLU can memorize any finite input/output pairs.
We also show similar results on memorization and universal approximation when floating-point operations use finite bits for both significand and exponent.
arXiv Detail & Related papers (2024-01-26T05:59:40Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
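For context on what a package like snnTorch simulates, here is a generic leaky integrate-and-fire update in NumPy. This is only the textbook formulation; it is not snnTorch's API and says nothing about the IPU-specific optimizations the paper describes.

```python
import numpy as np

def lif_step(current, mem, beta=0.9, threshold=1.0):
    """One leaky integrate-and-fire step: decay the membrane potential,
    add the input current, spike where the threshold is crossed, then
    reset by subtracting the threshold."""
    mem = beta * mem + current
    spk = (mem >= threshold).astype(float)
    mem = mem - spk * threshold
    return spk, mem

mem = np.zeros(3)
for t in range(5):
    spk, mem = lif_step(np.array([0.3, 0.6, 1.2]), mem)
    print(t, spk, np.round(mem, 2))
```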
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Learning Division with Neural Arithmetic Logic Modules [2.019622939313173]
We show that robustly learning division in a systematic manner remains a challenge even at the simplest level of dividing two numbers.
We propose two novel approaches for division, which we call the Neural Reciprocal Unit (NRU) and the Neural Multiplicative Reciprocal Unit (NMRU).
arXiv Detail & Related papers (2021-10-11T11:56:57Z)
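In the power-unit view of the main paper, division is simply a product of powers with a -1 exponent (a / b = a^1 * b^(-1)), which is the connection between reciprocal units and the NPU above. The snippet below only spells out that identity with a made-up helper; the NRU and NMRU formulations themselves are defined in the paper.

```python
import numpy as np

def power_product(x, exponents):
    """Illustration only: a fixed product of powers over the input magnitudes,
    the kind of mapping a power unit has to learn in order to represent division."""
    return np.exp(exponents @ np.log(np.abs(x)))

# a / b as a**1 * b**(-1)
print(power_product(np.array([6.0, 4.0]), np.array([1.0, -1.0])))   # 1.5
```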
- Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z)
- PAC-learning gains of Turing machines over circuits and neural networks [1.4502611532302039]
We study the potential gains in sample efficiency that the principle of minimum description length can bring.
We use Turing machines to represent universal models and circuits.
We highlight close relationships between classical open problems in Circuit Complexity and the tightness of these gains.
arXiv Detail & Related papers (2021-03-23T17:03:10Z)
- Self-Organized Operational Neural Networks with Generative Neurons [87.32169414230822]
ONNs are heterogeneous networks with a generalized neuron model that can encapsulate any set of non-linear operators.
We propose Self-organized ONNs (Self-ONNs) with generative neurons that have the ability to adapt (optimize) the nodal operator of each connection.
arXiv Detail & Related papers (2020-04-24T14:37:56Z)
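One common way to realize such an adaptive nodal operator is a truncated power series with learnable per-connection coefficients; the sketch below assumes that form purely for illustration, and the actual generative-neuron definition should be taken from the paper.

```python
import numpy as np

def generative_neuron(x, w):
    """Assumed nodal operator: a learnable truncated power series per connection,
    psi(x_i) = sum_q w[i, q] * x_i**q for q = 1..Q, summed over the inputs.

    x : (n,)    inputs to the neuron
    w : (n, Q)  per-connection polynomial coefficients
    """
    Q = w.shape[1]
    powers = np.stack([x ** q for q in range(1, Q + 1)], axis=1)  # (n, Q)
    return np.sum(w * powers)

# With Q = 1 this collapses to an ordinary weighted sum (a standard neuron);
# higher Q lets each connection apply its own non-linear transformation.
x = np.array([0.5, -1.0])
w = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.2, 0.0]])
print(generative_neuron(x, w))
```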
- iNALU: Improved Neural Arithmetic Logic Unit [2.331160520377439]
The recently proposed Neural Arithmetic Logic Unit (NALU) is a novel neural architecture in which the units of the network explicitly represent mathematical relationships in order to learn operations such as summation, subtraction, or multiplication.
We show that our model solves stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence.
arXiv Detail & Related papers (2020-03-17T10:37:22Z)
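For reference, the baseline that iNALU revises is the original NALU, which gates between an additive path and a multiplicative path computed in log space. The sketch below follows that widely cited original formulation (Trask et al., 2018), not the iNALU modifications described in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """Original NALU forward pass (the baseline that iNALU revises).

    W_hat, M_hat : (m, n) parameters giving W = tanh(W_hat) * sigmoid(M_hat),
                   which pushes weights toward {-1, 0, 1}
    G            : (m, n) gate parameters choosing add/sub vs. mul/div per output
    """
    W = np.tanh(W_hat) * sigmoid(M_hat)
    a = W @ x                                  # additive path: +, -
    m = np.exp(W @ np.log(np.abs(x) + eps))    # multiplicative path: *, /
    g = sigmoid(G @ x)                         # learned gate between the paths
    return g * a + (1.0 - g) * m

# Example: large parameter values saturate tanh/sigmoid toward {-1, 0, 1},
# and a saturated gate selects the additive path, giving x0 + x1.
x = np.array([2.0, 3.0])
big = 10.0
print(nalu_forward(x, W_hat=np.full((1, 2), big), M_hat=np.full((1, 2), big),
                   G=np.full((1, 2), big)))    # ~ [5.0] == 2 + 3
```

The loss of sign information in log(|x|) and the instability of the gate are exactly the shortcomings that both iNALU and the NPU above set out to fix.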
- Neural Arithmetic Units [84.65228064780744]
Neural networks can approximate complex functions, but they struggle to perform exact arithmetic operations over real numbers.
We present two new neural network components: the Neural Addition Unit (NAU), which can learn exact addition and subtraction, and the Neural Multiplication Unit (NMU), which can multiply subsets of a vector.
Our proposed units NAU and NMU, compared with previous neural units, converge more consistently, have fewer parameters, learn faster, can converge for larger hidden sizes, obtain sparse and meaningful weights, and can extrapolate to negative and small values.
arXiv Detail & Related papers (2020-01-14T19:35:04Z)
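The NAU and NMU summarized above have compact forms: a constrained linear layer for addition and subtraction, and a per-output product over gated inputs for multiplication. The sketch below reflects that description; the weight constraints and regularization that drive the learned weights toward sparse, discrete values are omitted.

```python
import numpy as np

def nau_forward(x, W):
    """Neural Addition Unit: a linear layer whose weights are constrained and
    regularized toward {-1, 0, 1}, so it learns exact addition and subtraction."""
    return W @ x

def nmu_forward(x, W):
    """Neural Multiplication Unit: each output multiplies a learned subset of
    the inputs; W[j, i] in [0, 1] interpolates between 'ignore' (1) and 'use' (x_i)."""
    return np.prod(W * x + 1.0 - W, axis=1)

x = np.array([2.0, 3.0, 4.0])
print(nau_forward(x, np.array([[1.0, -1.0, 0.0]])))   # [-1.] == 2 - 3
print(nmu_forward(x, np.array([[1.0, 1.0, 0.0]])))    # [ 6.] == 2 * 3
```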
This list is automatically generated from the titles and abstracts of the papers on this site.