Does entity abstraction help generative Transformers reason?
- URL: http://arxiv.org/abs/2201.01787v1
- Date: Wed, 5 Jan 2022 19:00:53 GMT
- Title: Does entity abstraction help generative Transformers reason?
- Authors: Nicolas Gontier, Siva Reddy, Christopher Pal
- Abstract summary: We study the utility of incorporating entity type abstractions into pre-trained Transformers.
We test these methods on four NLP tasks requiring different forms of logical reasoning.
- Score: 8.159805544989359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models (LMs) often struggle to reason logically or
generalize in a compositional fashion. Recent work suggests that incorporating
external entity knowledge can improve LMs' abilities to reason and generalize.
However, the effect of explicitly providing entity abstraction remains unclear,
especially with recent studies suggesting that pre-trained LMs already encode
some of that knowledge in their parameters. We study the utility of
incorporating entity type abstractions into pre-trained Transformers and test
these methods on four NLP tasks requiring different forms of logical reasoning:
(1) compositional language understanding with text-based relational reasoning
(CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question
answering (HotpotQA), and (4) conversational question answering (CoQA). We
propose and empirically explore three ways to add such abstraction: (i) as
additional input embeddings, (ii) as a separate sequence to encode, and (iii)
as an auxiliary prediction task for the model. Overall, our analysis
demonstrates that models with abstract entity knowledge perform better than
those without it. However, our experiments also show that the benefits strongly
depend on the technique used and the task at hand. The best abstraction-aware
models achieved overall accuracies of 88.8% and 91.8%, compared to 62.3% and
89.8% for the baseline model, on CLUTRR and ProofWriter respectively. In
addition, abstraction-aware models showed improved compositional generalization
in both interpolation and extrapolation settings. However, for HotpotQA and
CoQA, we find that F1 scores improve by only 0.5% on average. Our results
suggest that the benefit of explicit abstraction is significant in formally
defined logical reasoning settings requiring many reasoning hops, but less so
for NLP tasks with a less formal logical structure.
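To make the three abstraction methods concrete, below is a minimal PyTorch sketch of option (i): entity types as additional input embeddings summed with token embeddings before the Transformer. It is an illustration under assumed names and sizes (the type vocabulary, dimensions, and the `TypeAwareEmbedding` module are hypothetical), not the paper's exact implementation.

```python
# Sketch of method (i): entity-type abstraction as extra input embeddings.
# All names and sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class TypeAwareEmbedding(nn.Module):
    """Sums token embeddings with embeddings of per-token entity-type tags
    (e.g. PERSON, LOCATION, O for non-entities) before the Transformer."""

    def __init__(self, vocab_size=32000, num_types=10, d_model=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.typ = nn.Embedding(num_types, d_model)  # one row per entity type

    def forward(self, token_ids, type_ids):
        # token_ids, type_ids: (batch, seq_len); type tags would come from an
        # external tagger (or gold annotations) aligned with the tokenization.
        return self.tok(token_ids) + self.typ(type_ids)

# Usage: feed the summed embeddings into any standard Transformer encoder.
emb = TypeAwareEmbedding()
tokens = torch.randint(0, 32000, (2, 16))
types = torch.randint(0, 10, (2, 16))
hidden = emb(tokens, types)  # (2, 16, 512), ready for the encoder layers
```

Options (ii) and (iii) would instead encode the type tags as a separate input sequence or predict them as an auxiliary training loss, respectively.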
Related papers
- LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulate such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z) - Mind Reasoning Manners: Enhancing Type Perception for Generalized
Zero-shot Logical Reasoning over Text [12.988062333041398]
For problem 1, we propose a new benchmark for generalized zero-shot logical reasoning, named ZsLR.
For problem 2, a type-aware model TaCo is proposed to improve the type perception in the global representation.
arXiv Detail & Related papers (2023-01-08T05:24:34Z) - APOLLO: A Simple Approach for Adaptive Pretraining of Language Models
for Logical Reasoning [73.3035118224719]
We propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities.
APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
arXiv Detail & Related papers (2022-12-19T07:40:02Z) - The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans, namely those that are logically consistent with the input, usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z) - Modeling Multi-Granularity Hierarchical Features for Relation Extraction [26.852869800344813]
We propose a novel method to extract multi-granularity features based solely on the original input sentences.
We show that effective structured features can be attained even without external knowledge.
arXiv Detail & Related papers (2022-04-09T09:44:05Z) - Zero-shot Commonsense Question Answering with Cloze Translation and
Consistency Optimization [20.14487209460865]
We investigate four translation methods that can translate natural questions into cloze-style sentences.
We show that our methods are complementary to a knowledge base improved model, and combining them can lead to state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2022-01-01T07:12:49Z) - A Simple Approach to Case-Based Reasoning in Knowledge Bases [56.661396189466664]
We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires no training and is reminiscent of case-based reasoning in classical artificial intelligence (AI).
Consider the task of finding a target entity given a source entity and a binary relation.
Our non-parametric approach derives crisp logical rules for each query by finding multiple graph path patterns that connect similar source entities through the given relation.
arXiv Detail & Related papers (2020-06-25T06:28:09Z) - Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason
Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)