Learning Reasoning Strategies in End-to-End Differentiable Proving
- URL: http://arxiv.org/abs/2007.06477v3
- Date: Mon, 24 Aug 2020 16:17:34 GMT
- Title: Learning Reasoning Strategies in End-to-End Differentiable Proving
- Authors: Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp, Edward Grefenstette, Tim Rocktäschel
- Abstract summary: Conditional Theorem Provers learn an optimal rule selection strategy via gradient-based optimisation.
We show that Conditional Theorem Provers are scalable and yield state-of-the-art results on the CLUTRR dataset.
- Score: 50.9791149533921
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attempts to render deep learning models interpretable, data-efficient, and
robust have seen some success through hybridisation with rule-based systems,
for example, in Neural Theorem Provers (NTPs). These neuro-symbolic models can
induce interpretable rules and learn representations from data via
back-propagation, while providing logical explanations for their predictions.
However, they are restricted by their computational complexity, as they need to
consider all possible proof paths for explaining a goal, thus rendering them
unfit for large-scale applications. We present Conditional Theorem Provers
(CTPs), an extension to NTPs that learns an optimal rule selection strategy via
gradient-based optimisation. We show that CTPs are scalable and yield
state-of-the-art results on the CLUTRR dataset, which tests systematic
generalisation of neural models by learning to reason over smaller graphs and
evaluating on larger ones. Finally, CTPs show better link prediction results on
standard benchmarks in comparison with other neuro-symbolic models, while
being explainable. All source code and datasets are available online, at
https://github.com/uclnlp/ctp.
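The abstract above describes CTPs as learning a goal-conditioned rule selection strategy by gradient descent, rather than enumerating every possible proof path as NTPs do. Below is a minimal, illustrative sketch of that idea in PyTorch; it is not the authors' implementation (see https://github.com/uclnlp/ctp for that), and the module and variable names (GoalConditionedRuleSelector, embed_dim, n_body_atoms) are hypothetical.

```python
# Minimal sketch (not the authors' code) of goal-conditioned rule selection:
# a small neural module maps the embedding of the current goal to the
# predicate embeddings of the rule(s) to try next, and is trained end-to-end
# by gradient descent instead of unifying against every rule in the KB.

import torch
import torch.nn as nn


class GoalConditionedRuleSelector(nn.Module):
    """Maps a goal embedding [rel, subj, obj] to embeddings for a generated
    rule of the form head(X, Z) :- body1(X, Y), body2(Y, Z)."""

    def __init__(self, embed_dim: int, n_body_atoms: int = 2):
        super().__init__()
        # One linear map per generated predicate embedding (head + body atoms).
        self.head_proj = nn.Linear(3 * embed_dim, embed_dim)
        self.body_projs = nn.ModuleList(
            nn.Linear(3 * embed_dim, embed_dim) for _ in range(n_body_atoms)
        )

    def forward(self, goal_rel, goal_subj, goal_obj):
        # Condition the rule on the goal instead of scoring all KB rules.
        goal = torch.cat([goal_rel, goal_subj, goal_obj], dim=-1)
        head = self.head_proj(goal)
        body = [proj(goal) for proj in self.body_projs]
        return head, body


if __name__ == "__main__":
    dim = 32
    selector = GoalConditionedRuleSelector(embed_dim=dim)
    rel, subj, obj = (torch.randn(1, dim) for _ in range(3))
    head, body = selector(rel, subj, obj)
    print(head.shape, [b.shape for b in body])
```

In the paper's terms, such a select module replaces exhaustive unification against all rules with a small set of generated rules conditioned on the goal, which is what allows the prover to scale beyond the computational limits of NTPs.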
Related papers
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
Back-propagation (BP) has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - From Spectral Graph Convolutions to Large Scale Graph Convolutional Networks [0.0]
Graph Convolutional Networks (GCNs) have been shown to be a powerful concept that has been successfully applied to a large variety of tasks.
We study the theory that paved the way to the definition of GCN, including related parts of classical graph theory.
arXiv Detail & Related papers (2022-07-12T16:57:08Z) - Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) aims to find a small subset of the input graph's features that guides the model's prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Tackling Oversmoothing of GNNs with Contrastive Learning [35.88575306925201]
Graph neural networks (GNNs) combine the relational structure of graph data with representation learning capability.
Oversmoothing makes the final representations of nodes indiscriminative, thus deteriorating the node classification and link prediction performance.
We propose the Topology-guided Graph Contrastive Layer, named TGCL, the first de-oversmoothing method that maintains all three metrics discussed in the paper.
arXiv Detail & Related papers (2021-10-26T15:56:16Z) - Modeling Item Response Theory with Stochastic Variational Inference [8.369065078321215]
We introduce a variational Bayesian inference algorithm for Item Response Theory (IRT).
Applying this method to five large-scale item response datasets yields higher log likelihoods and higher accuracy in imputing missing data.
The algorithm implementation is open-source, and easily usable.
arXiv Detail & Related papers (2021-08-26T05:00:27Z) - Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Implicit Graph Neural Networks [46.0589136729616]
We propose a graph learning framework called Implicit Graph Neural Networks (IGNN).
IGNNs consistently capture long-range dependencies and outperform state-of-the-art GNN models.
arXiv Detail & Related papers (2020-09-14T06:04:55Z)