Transformers as Soft Reasoners over Language
- URL: http://arxiv.org/abs/2002.05867v2
- Date: Tue, 5 May 2020 17:33:38 GMT
- Title: Transformers as Soft Reasoners over Language
- Authors: Peter Clark, Oyvind Tafjord, Kyle Richardson
- Abstract summary: This paper investigates a problem where the facts and rules are provided as natural language sentences, thus bypassing a formal representation.
We train transformers to reason (or emulate reasoning) over these sentences using synthetically generated data.
Our models, which we call RuleTakers, provide the first empirical demonstration that this kind of soft reasoning over language is learnable.
- Score: 33.291806251021185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beginning with McCarthy's Advice Taker (1959), AI has pursued the goal of
providing a system with explicit, general knowledge and having the system
reason over that knowledge. However, expressing the knowledge in a formal
(logical or probabilistic) representation has been a major obstacle to this
research. This paper investigates a modern approach to this problem where the
facts and rules are provided as natural language sentences, thus bypassing a
formal representation. We train transformers to reason (or emulate reasoning)
over these sentences using synthetically generated data. Our models, which we
call RuleTakers, provide the first empirical demonstration that this kind of
soft reasoning over language is learnable, can achieve high (99%) accuracy, and
generalizes to test data requiring substantially deeper chaining than seen
during training (95%+ scores). We also demonstrate that the models transfer
well to two hand-authored rulebases, and to rulebases paraphrased into more
natural language. These findings are significant, as they suggest a new role for
transformers, namely as limited "soft theorem provers" operating over explicit
theories in language. This in turn suggests new possibilities for
explainability, correctability, and counterfactual reasoning in
question-answering.
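As a concrete illustration of the setup described above, the minimal sketch below packs a toy theory (facts and rules in English) and a question into a single sequence for a binary true/false transformer classifier. It assumes the Hugging Face `transformers` library; the "roberta-base" checkpoint, the example theory, and the label convention are illustrative choices, and the fine-tuning on synthetic theories that the paper relies on is omitted.

```python
# Minimal sketch of a RuleTaker-style (theory, question) -> true/false setup.
# Assumes the `transformers` and `torch` packages; "roberta-base" is an
# illustrative checkpoint, not the paper's trained model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

theory = (
    "Alan is blue. Alan is rough. "
    "If someone is blue then they are green. "
    "If someone is green and rough then they are kind."
)
question = "Alan is kind."  # requires two-step chaining over the rules

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2  # assumed labels: 0 = false, 1 = true
)

# The theory and question are encoded as an ordinary sentence pair; all
# "reasoning" must happen inside the transformer, with no logical forms.
inputs = tokenizer(theory, question, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(f"P(true) = {probs[0, 1].item():.3f}")
```

Before fine-tuning on synthetically generated theories, the printed probability is of course meaningless; the point is only the input format: no parser or logic engine sits between the English statements and the answer.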
Related papers
- A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets [8.846643533783205]
This work introduces an early concept for a novel pipeline that can be used in text classification tasks.
It comprises two models: a classifier for labelling the text and an explanation generator which provides the explanation.
Experiments are centred around the tasks of sentiment analysis and offensive language identification in Greek tweets.
arXiv Detail & Related papers (2024-10-14T08:41:31Z)
- Implicit Chain of Thought Reasoning via Knowledge Distillation [58.80851216530288]
Instead of explicitly producing the chain of thought reasoning steps, we use the language model's internal hidden states to perform implicit reasoning.
We find that this approach enables solving tasks previously not solvable without explicit chain-of-thought, at a speed comparable to no chain-of-thought.
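As a rough illustration of reading off a model's internal hidden states rather than its generated reasoning steps, the hedged sketch below shows where those per-layer states live in a standard causal LM. The "gpt2" checkpoint and the arithmetic prompt are arbitrary illustrative choices, and the paper's actual distillation objective is not reproduced here.

```python
# Illustrative only: extracting the per-layer hidden states that implicit
# chain-of-thought approaches supervise, instead of generated text steps.
# Assumes `transformers` and `torch`; "gpt2" is an arbitrary small causal LM.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("23 * 17 =", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors, each of shape
# [batch, seq_len, hidden_size]; a distillation loss would target these
# vectors so the answer is computed "vertically" across layers rather than
# by emitting intermediate reasoning tokens.
print(len(out.hidden_states), tuple(out.hidden_states[-1].shape))
```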
arXiv Detail & Related papers (2023-11-02T17:59:49Z)
- Learning Language Representations with Logical Inductive Bias [19.842271716111153]
We explore a new logical inductive bias for better language representation learning.
We develop a novel neural architecture named FOLNet to encode this new inductive bias.
We find that the self-attention module in transformers can be expressed as a composition of two of our neural logic operators.
arXiv Detail & Related papers (2023-02-19T02:21:32Z)
- Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
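To make the fact-to-rule direction of this task concrete, here is a small hedged sketch of how such an instance might be framed as a text-completion prompt; the facts and the target rule are invented for illustration and are not drawn from DEER.

```python
# Hypothetical framing of rule induction as a text-completion prompt: given
# several natural language facts, the model should generate a general rule.
# The facts below are illustrative; they are not examples from DEER.
facts = [
    "Robins have wings and can fly.",
    "Sparrows have wings and can fly.",
    "Crows have wings and can fly.",
]

prompt = (
    "Induce a general rule from the following facts.\n"
    "Facts:\n"
    + "\n".join(f"- {fact}" for fact in facts)
    + "\nRule:"
)
print(prompt)
# Any pretrained causal LM can complete this prompt; a plausible target rule
# would be: "If an animal has wings, then it can fly."
```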
arXiv Detail & Related papers (2022-12-21T11:12:14Z)
- AbductionRules: Training Transformers to Explain Unexpected Inputs [2.2630663834223763]
We present AbductionRules, a group of datasets designed to train and test generalisable abduction over natural-language knowledge bases.
We discuss the viability of this approach to abductive reasoning and ways in which it may be improved in future work.
arXiv Detail & Related papers (2022-03-23T04:18:30Z)
- Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
- Neural Unification for Logic Reasoning over Natural Language [0.28675177318965034]
Automated Theorem Proving deals with the development of computer programs that can show that some conjectures (queries) are a logical consequence of a set of axioms (facts and rules).
Recent approaches have proposed transformer-based architectures for deriving conjectures given axioms expressed in natural language (English).
In this work we propose a new architecture, the Neural Unifier, which achieves state-of-the-art results in terms of generalisation.
arXiv Detail & Related papers (2021-09-17T10:48:39Z)
- Fact-driven Logical Reasoning for Machine Reading Comprehension [82.58857437343974]
We are motivated to cover both commonsense and temporary knowledge clues hierarchically.
Specifically, we propose a general formalism of knowledge units by extracting backbone constituents of the sentence.
We then construct a supergraph on top of the fact units, benefiting from both sentence-level interactions (relations among fact groups) and entity-level interactions.
arXiv Detail & Related papers (2021-05-21T13:11:13Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.