Does entity abstraction help generative Transformers reason?
- URL: http://arxiv.org/abs/2201.01787v1
- Date: Wed, 5 Jan 2022 19:00:53 GMT
- Title: Does entity abstraction help generative Transformers reason?
- Authors: Nicolas Gontier, Siva Reddy, Christopher Pal
- Abstract summary: We study the utility of incorporating entity type abstractions into pre-trained Transformers.
We test these methods on four NLP tasks requiring different forms of logical reasoning.
- Score: 8.159805544989359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (LMs) often struggle to reason logically or
generalize in a compositional fashion. Recent work suggests that incorporating
external entity knowledge can improve LMs' abilities to reason and generalize.
However, the effect of explicitly providing entity abstraction remains unclear,
especially with recent studies suggesting that pre-trained LMs already encode
some of that knowledge in their parameters. We study the utility of
incorporating entity type abstractions into pre-trained Transformers and test
these methods on four NLP tasks requiring different forms of logical reasoning:
(1) compositional language understanding with text-based relational reasoning
(CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question
answering (HotpotQA), and (4) conversational question answering (CoQA). We
propose and empirically explore three ways to add such abstraction: (i) as
additional input embeddings, (ii) as a separate sequence to encode, and (iii)
as an auxiliary prediction task for the model. Overall, our analysis
demonstrates that models with abstract entity knowledge perform better than
those without it. However, our experiments also show that the benefits strongly
depend on the technique used and the task at hand. The best abstraction-aware
models achieved overall accuracies of 88.8% and 91.8%, compared to 62.3% and
89.8% for the baseline model, on CLUTRR and ProofWriter respectively. In
addition, abstraction-aware models showed improved compositional generalization
in both interpolation and extrapolation settings. However, for HotpotQA and
CoQA, we find that F1 scores improve by only 0.5% on average. Our results
suggest that the benefit of explicit abstraction is significant in formally
defined logical reasoning settings requiring many reasoning hops, but less so
for NLP tasks with a less formal logical structure.
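To make the three abstraction methods concrete, below is a minimal PyTorch sketch of option (i): entity types as additional input embeddings summed with token embeddings before the Transformer. It is an illustration under assumed names and sizes (the type vocabulary, dimensions, and the `TypeAwareEmbedding` module are hypothetical), not the paper's exact implementation.

```python
# Sketch of method (i): entity-type abstraction as extra input embeddings.
# All names and sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class TypeAwareEmbedding(nn.Module):
    """Sums token embeddings with embeddings of per-token entity-type tags
    (e.g. PERSON, LOCATION, O for non-entities) before the Transformer."""

    def __init__(self, vocab_size=32000, num_types=10, d_model=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.typ = nn.Embedding(num_types, d_model)  # one row per entity type

    def forward(self, token_ids, type_ids):
        # token_ids, type_ids: (batch, seq_len); type tags would come from an
        # external tagger (or gold annotations) aligned with the tokenization.
        return self.tok(token_ids) + self.typ(type_ids)

# Usage: feed the summed embeddings into any standard Transformer encoder.
emb = TypeAwareEmbedding()
tokens = torch.randint(0, 32000, (2, 16))
types = torch.randint(0, 10, (2, 16))
hidden = emb(tokens, types)  # (2, 16, 512), ready for the encoder layers
```

Options (ii) and (iii) would instead encode the type tags as a separate input sequence or predict them as an auxiliary training loss, respectively.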
Related papers
- LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z) - Mind Reasoning Manners: Enhancing Type Perception for Generalized
Zero-shot Logical Reasoning over Text [12.988062333041398]
For problem 1, we propose a new benchmark for generalized zero-shot logical reasoning, named ZsLR.
For problem 2, a type-aware model TaCo is proposed to improve the type perception in the global representation.
arXiv Detail & Related papers (2023-01-08T05:24:34Z) - APOLLO: A Simple Approach for Adaptive Pretraining of Language Models
for Logical Reasoning [73.3035118224719]
We propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities.
APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
arXiv Detail & Related papers (2022-12-19T07:40:02Z) - The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans, namely those that are logically consistent with the input, usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z) - Modeling Multi-Granularity Hierarchical Features for Relation Extraction [26.852869800344813]
We propose a novel method to extract multi-granularity features based solely on the original input sentences.
We show that effective structured features can be attained even without external knowledge.
arXiv Detail & Related papers (2022-04-09T09:44:05Z) - Zero-shot Commonsense Question Answering with Cloze Translation and
Consistency Optimization [20.14487209460865]
We investigate four translation methods that can translate natural questions into cloze-style sentences.
We show that our methods are complementary to a knowledge base improved model, and combining them can lead to state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2022-01-01T07:12:49Z) - A Simple Approach to Case-Based Reasoning in Knowledge Bases [56.661396189466664]
We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires no training and is reminiscent of case-based reasoning in classical artificial intelligence (AI).
Consider the task of finding a target entity given a source entity and a binary relation.
Our non-parametric approach derives crisp logical rules for each query by finding multiple graph path patterns that connect similar source entities through the given relation.
arXiv Detail & Related papers (2020-06-25T06:28:09Z) - Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason
Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)