Improving the Robustness of Neural Multiplication Units with Reversible
Stochasticity
- URL: http://arxiv.org/abs/2211.05624v1
- Date: Thu, 10 Nov 2022 14:56:37 GMT
- Title: Improving the Robustness of Neural Multiplication Units with Reversible
Stochasticity
- Authors: Bhumika Mistry, Katayoun Farrahi, Jonathon Hare
- Abstract summary: Multilayer Perceptrons struggle to learn certain simple arithmetic tasks.
Specialist neural NMU (sNMU) is proposed to apply reversibleity, encouraging avoidance of such optima.
- Score: 2.4278445972594525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilayer Perceptrons struggle to learn certain simple arithmetic tasks.
Specialist neural modules for arithmetic can outperform classical architectures
with gains in extrapolation, interpretability and convergence speeds, but are
highly sensitive to the training range. In this paper, we show that Neural
Multiplication Units (NMUs) are unable to reliably learn tasks as simple as
multiplying two inputs when given different training ranges. Causes of failure
are linked to inductive and input biases which encourage convergence to
solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is
proposed to apply reversible stochasticity, encouraging avoidance of such
optima whilst converging to the true solution. Empirically, we show that
stochasticity provides improved robustness with the potential to improve
learned representations of upstream networks for numerical and image tasks.
Related papers
- Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK)
We show that the CK performance is only marginally worse than that of the NTK and, in certain cases, is shown to be superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
arXiv Detail & Related papers (2023-10-28T06:41:47Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Single-model uncertainty quantification in neural network potentials
does not consistently outperform model ensembles [0.7499722271664145]
Neural networks (NNs) often assign high confidence to their predictions, even for points far out-of-distribution.
Uncertainty quantification (UQ) is a challenge when they are employed to model interatomic potentials in materials systems.
Differentiable UQ techniques can find new informative data and drive active learning loops for robust potentials.
arXiv Detail & Related papers (2023-05-02T19:41:17Z) - Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One [60.5818387068983]
Graph neural networks (GNN) suffer from severe inefficiency.
We propose to decouple a multi-layer GNN as multiple simple modules for more efficient training.
We show that the proposed framework is highly efficient with reasonable performance.
arXiv Detail & Related papers (2023-04-20T07:21:32Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs)
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Recognizing and Verifying Mathematical Equations using Multiplicative
Differential Neural Units [86.9207811656179]
We show that memory-augmented neural networks (NNs) can achieve higher-order, memory-augmented extrapolation, stable performance, and faster convergence.
Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.
arXiv Detail & Related papers (2021-04-07T03:50:11Z) - Neural Non-Rigid Tracking [26.41847163649205]
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker.
We employ a convolutional neural network to predict dense correspondences and their confidences.
Compared to state-of-the-art approaches, our algorithm shows improved reconstruction performance.
arXiv Detail & Related papers (2020-06-23T18:00:39Z) - Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.