Relational Weight Priors in Neural Networks for Abstract Pattern
Learning and Language Modelling
- URL: http://arxiv.org/abs/2103.06198v1
- Date: Wed, 10 Mar 2021 17:21:16 GMT
- Title: Relational Weight Priors in Neural Networks for Abstract Pattern
Learning and Language Modelling
- Authors: Radha Kopparti and Tillman Weyde
- Abstract summary: Abstract patterns are the best known examples of a hard problem for neural networks in terms of generalisation to unseen data.
It has been argued that these low-level problems demonstrate the inability of neural networks to learn systematically.
We propose Embedded Relation Based Patterns (ERBP) as a novel way to create a relational inductive bias that encourages learning equality and distance-based relations for abstract patterns.
- Score: 6.980076213134383
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks have become the dominant approach in natural language
processing (NLP). However, in recent years, it has become apparent that there
are shortcomings in systematicity that limit the performance and data
efficiency of deep learning in NLP. These shortcomings can be clearly shown in
lower-level artificial tasks, mostly on synthetic data. Abstract patterns are
the best known examples of a hard problem for neural networks in terms of
generalisation to unseen data. They are defined by relations between items,
such as equality, rather than their values. It has been argued that these
low-level problems demonstrate the inability of neural networks to learn
systematically. In this study, we propose Embedded Relation Based Patterns
(ERBP) as a novel way to create a relational inductive bias that encourages
learning equality and distance-based relations for abstract patterns. ERBP is
based on Relation Based Patterns (RBP), but modelled as a Bayesian prior on
network weights and implemented as a regularisation term in otherwise standard
network learning. ERBP is easy to integrate into standard neural networks
and does not affect their learning capacity. In our experiments, ERBP priors
lead to almost perfect generalisation when learning abstract patterns from
synthetic noise-free sequences. ERBP also improves natural language models on
the word and character level and pitch prediction in melodies with RNN, GRU and
LSTM networks. We also find improvements in the more complex tasks of learning
graph edit distance and compositional sentence entailment. ERBP
consistently improves over RBP and over standard networks, showing that it
enables abstract pattern learning which contributes to performance in natural
language tasks.
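The abstract describes ERBP only at a high level, as a Bayesian prior on network weights implemented as a regularisation term. Below is a minimal illustrative sketch of that idea, assuming PyTorch and a Gaussian prior centred on a fixed comparison pattern over the first layer's weights (in the spirit of RBP's difference units); the names erbp_penalty and target_pattern and the weighting lam are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an ERBP-style weight prior as an L2 regularisation term.
# Assumptions (not from the abstract): PyTorch, a Gaussian prior centred on a
# fixed "comparison" weight pattern for the first layer; names are illustrative.
import torch
import torch.nn as nn

emb_dim = 16       # embedding size of each of the two compared items
hidden_dim = 32

# The first layer sees two concatenated item embeddings; the prior encourages a
# subset of hidden units to compute element-wise differences (x_i - y_i),
# i.e. weights close to +1 on one item's dimension and -1 on the other's.
first = nn.Linear(2 * emb_dim, hidden_dim)

target_pattern = torch.zeros(hidden_dim, 2 * emb_dim)
for i in range(emb_dim):                   # one "difference" unit per dimension
    target_pattern[i, i] = 1.0             # +1 on x_i
    target_pattern[i, emb_dim + i] = -1.0  # -1 on y_i

def erbp_penalty(layer: nn.Linear, target: torch.Tensor, lam: float) -> torch.Tensor:
    """Negative log of a Gaussian prior on the weights, up to a constant:
    lam * ||W - W_target||^2, added to the task loss as a regulariser."""
    return lam * ((layer.weight - target) ** 2).sum()

# Usage inside a standard training step (task_loss from any criterion):
# loss = task_loss + erbp_penalty(first, target_pattern, lam=1e-2)
# loss.backward(); optimizer.step()
```

Formulated as a soft penalty, such a prior does not constrain capacity: it only biases part of the weights towards a comparison structure, and the network can move away from it where the data demand it.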
Related papers
- Predictive Coding Networks and Inference Learning: Tutorial and Survey [0.7510165488300368]
Predictive coding networks (PCNs) are based on the neuroscientific framework of predictive coding.
Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm.
As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling.
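As a rough illustration of inference learning (IL) in contrast to backpropagation, the toy example below first relaxes latent activities to minimise a prediction-error energy, then applies a local weight update. It is a minimal sketch under simplifying assumptions (a single linear generative layer, a Gaussian prior on the latent); step sizes and variable names are illustrative, not taken from the survey.

```python
# Hedged sketch of one inference-learning step in a toy predictive coding model.
import torch

torch.manual_seed(0)
x = torch.randn(8)            # observed (bottom-layer) activity
W = torch.randn(8, 4) * 0.1   # generative weights: latent -> prediction of x
z = torch.zeros(4)            # latent (top-layer) activity, inferred per input

def energy(x, z, W):
    # Quantity minimised by both phases: squared prediction error plus a
    # simple Gaussian prior on the latent state.
    err = x - W @ z
    return 0.5 * (err @ err) + 0.5 * (z @ z)

# Inference phase: relax the latent state on a fixed input (IL's inner loop).
for _ in range(50):
    err = x - W @ z
    z = z + 0.1 * (W.T @ err - z)      # gradient descent on energy w.r.t. z

# Learning phase: local, Hebbian-like weight update using the settled errors.
err = x - W @ z
W = W + 0.05 * torch.outer(err, z)     # gradient descent on energy w.r.t. W
```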
arXiv Detail & Related papers (2024-07-04T18:39:20Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FF-NSL)
FF-NSL integrates state-of-the-art inductive logic programming (ILP) systems based on the Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set [0.0]
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology (PH).
Using simplicial complex representations of neural networks, we study how the PH diagram distance evolves during the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z)
- How much complexity does an RNN architecture need to learn syntax-sensitive dependencies? [9.248882589228089]
Long short-term memory (LSTM) networks are capable of encapsulating long-range dependencies.
Simple recurrent networks (SRNs) have generally been less successful at capturing long-range dependencies.
We propose a new architecture, the Decay RNN, which incorporates the decaying nature of neuronal activations.
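The exact Decay RNN formulation is given in the paper, not in this summary; the sketch below only illustrates the general idea of a recurrent cell whose hidden activation decays over time, with the retention factor alpha and the class name DecayCell as illustrative assumptions.

```python
# Hedged sketch: a simple recurrent cell with exponential decay on the hidden
# activation. This is NOT the paper's exact Decay RNN; it only illustrates the
# idea of decaying neuronal activations.
import torch
import torch.nn as nn

class DecayCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int, alpha: float = 0.8):
        super().__init__()
        self.alpha = alpha                 # retention factor of the old state
        self.in2h = nn.Linear(input_size, hidden_size)
        self.h2h = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # The old activation decays; the new drive is blended in.
        return self.alpha * h_prev + (1 - self.alpha) * torch.tanh(
            self.in2h(x_t) + self.h2h(h_prev)
        )

# Usage: iterate over a sequence of shape (T, batch, input_size).
# cell = DecayCell(10, 20); h = torch.zeros(1, 20)
# for x_t in x_seq: h = cell(x_t, h)
```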
arXiv Detail & Related papers (2020-05-17T09:13:28Z)
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
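The NAM description above maps naturally to code: one small network per input feature, whose scalar outputs are summed. The following is a minimal sketch; layer sizes and the class name NAM are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of a Neural Additive Model: one small MLP per input feature,
# whose scalar outputs are summed (with a bias) to form the prediction.
import torch
import torch.nn as nn

class NAM(nn.Module):
    def __init__(self, num_features: int, hidden: int = 32):
        super().__init__()
        # One feature network per input dimension, each mapping a scalar to a scalar.
        self.feature_nets = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(num_features)
        ])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features). The model is interpretable because each
        # feature's contribution f_i(x_i) can be inspected in isolation.
        contributions = [f(x[:, i : i + 1]) for i, f in enumerate(self.feature_nets)]
        return torch.cat(contributions, dim=1).sum(dim=1, keepdim=True) + self.bias

# y_hat = NAM(num_features=8)(torch.randn(4, 8))  # shape (4, 1)
```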
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.