Deductive Additivity for Planning of Natural Language Proofs
- URL: http://arxiv.org/abs/2307.02472v2
- Date: Thu, 6 Jul 2023 02:16:33 GMT
- Title: Deductive Additivity for Planning of Natural Language Proofs
- Authors: Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett
- Abstract summary: We investigate whether efficient planning is possible via embedding spaces compatible with deductive reasoning.
Our findings suggest that while standard embedding methods frequently embed conclusions near the sums of their premises, they fall short of being effective heuristics and lack the ability to model certain categories of reasoning.
- Score: 43.93269297653265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current natural language systems designed for multi-step claim validation
typically operate in two phases: retrieve a set of relevant premise statements
using heuristics (planning), then generate novel conclusions from those
statements using a large language model (deduction). The planning step often
requires expensive Transformer operations and does not scale to arbitrary
numbers of premise statements. In this paper, we investigate whether an
efficient planning heuristic is possible via embedding spaces compatible with
deductive reasoning. Specifically, we evaluate whether embedding spaces exhibit
a property we call deductive additivity: the sum of premise statement
embeddings should be close to embeddings of conclusions based on those
premises. We explore multiple sources of off-the-shelf dense embeddings in
addition to fine-tuned embeddings from GPT3 and sparse embeddings from BM25. We
study embedding models both intrinsically, evaluating whether the property of
deductive additivity holds, and extrinsically, using them to assist planning in
natural language proof generation. Lastly, we create a dataset, Single-Step
Reasoning Contrast (SSRC), to further probe performance on various reasoning
types. Our findings suggest that while standard embedding methods frequently
embed conclusions near the sums of their premises, they fall short of being
effective heuristics and lack the ability to model certain categories of
reasoning.
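The deductive additivity property described above can be checked numerically: embed the premises, sum their vectors, and measure how close the result lands to the conclusion's embedding. A minimal sketch follows, using fabricated synthetic vectors rather than a real encoder (the paper uses off-the-shelf and fine-tuned embeddings such as GPT3-based ones); the synthetic conclusion is built to approximately satisfy additivity so the check is self-contained.

```python
# Illustrative check of deductive additivity on synthetic vectors.
# Real systems would obtain these embeddings from a trained sentence
# encoder; here the vectors are fabricated so the script is runnable.
import numpy as np

rng = np.random.default_rng(0)
dim = 64

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

premise_a = rng.normal(size=dim)
premise_b = rng.normal(size=dim)
# Build a conclusion as the premise sum plus small noise, mimicking an
# embedding space in which deductive additivity approximately holds.
conclusion = premise_a + premise_b + 0.1 * rng.normal(size=dim)
distractor = rng.normal(size=dim)  # an unrelated statement

combined = premise_a + premise_b
print(cosine(combined, conclusion))  # high similarity: additivity holds
print(cosine(combined, distractor))  # near zero for the unrelated vector
```

In an intrinsic evaluation like the paper's, the gap between these two similarities is what makes the summed embedding usable as a retrieval heuristic.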
Related papers
- CASA: Causality-driven Argument Sufficiency Assessment [79.13496878681309]
We propose CASA, a zero-shot causality-driven argument sufficiency assessment framework.
PS measures how likely it is that introducing the premise event would lead to the conclusion when both the premise and conclusion events are absent.
Experiments on two logical fallacy detection datasets demonstrate that CASA accurately identifies insufficient arguments.
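Assuming PS here denotes the probability of sufficiency from Pearl's causal framework (the acronym is not expanded in this summary), its standard definition matches the description above:

$$\mathrm{PS} = P(Y_{X=1} = 1 \mid X = 0, Y = 0)$$

i.e. the probability that intervening to introduce the premise event ($X=1$) would bring about the conclusion ($Y=1$), given that both premise and conclusion are currently absent.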
arXiv Detail & Related papers (2024-01-10T16:21:18Z) - Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement [92.61557711360652]
Language models (LMs) often fall short on inductive reasoning, despite achieving impressive success on research benchmarks.
We conduct a systematic study of the inductive reasoning capabilities of LMs through iterative hypothesis refinement.
We reveal several discrepancies between the inductive reasoning processes of LMs and humans, shedding light on both the potentials and limitations of using LMs in inductive reasoning tasks.
arXiv Detail & Related papers (2023-10-12T17:51:10Z) - Hypothesis Search: Inductive Reasoning with Language Models [39.03846394586811]
Recent work evaluates large language models on inductive reasoning tasks by directly prompting them, i.e., "in-context learning".
This works well for straightforward inductive tasks but performs poorly on complex tasks such as the Abstraction and Reasoning Corpus (ARC).
In this work, we propose to improve the inductive reasoning ability of LLMs by generating explicit hypotheses at multiple levels of abstraction.
arXiv Detail & Related papers (2023-09-11T17:56:57Z) - A Semantic Approach to Decidability in Epistemic Planning (Extended Version) [72.77805489645604]
We use a novel semantic approach to achieve decidability.
Specifically, we augment the logic of knowledge S5$_n$ with an interaction axiom called (knowledge) commutativity.
We prove that our framework admits a finitary non-fixpoint characterization of common knowledge, which is of independent interest.
arXiv Detail & Related papers (2023-07-28T11:26:26Z) - Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z) - Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations [118.0818807474809]
Abductive reasoning aims to find plausible explanations for an event.
Existing approaches for abductive reasoning in natural language processing often rely on manually generated annotations for supervision.
This work proposes an approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context.
arXiv Detail & Related papers (2023-05-24T01:35:10Z) - Natural Language Deduction with Incomplete Information [43.93269297653265]
We propose a new system that can handle the underspecified setting where not all premises are stated at the outset.
By using a natural language generation model to abductively infer a premise given another premise and a conclusion, we can impute missing pieces of evidence needed for the conclusion to be true.
arXiv Detail & Related papers (2022-11-01T17:27:55Z) - Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought [10.524051272257614]
Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought prompts.
We present a new synthetic question-answering dataset called PrOntoQA, where each example is generated as a synthetic world model.
This allows us to parse the generated chain-of-thought into symbolic proofs for formal analysis.
arXiv Detail & Related papers (2022-10-03T21:34:32Z) - Natural Language Deduction through Search over Statement Compositions [43.93269297653265]
We propose a system for natural language deduction that decomposes the task into separate steps coordinated by best-first search.
Our experiments demonstrate that the proposed system can better distinguish verifiable hypotheses from unverifiable ones.
arXiv Detail & Related papers (2022-01-16T12:05:48Z)
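The search-based planning idea in the entry above can be sketched concretely: score candidate premise pairs by how close their summed embedding lands to the goal embedding, and expand the best-scoring pair first. This is a hedged illustration, not the cited system's implementation; `plan_next_step` and the toy 3-dimensional embeddings are hypothetical stand-ins for a trained encoder and an LLM deduction step.

```python
# Sketch of best-first selection of a deduction step, scoring premise
# pairs by the additive-embedding heuristic (sum of premises vs. goal).
import heapq

import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def plan_next_step(premise_embs, goal_emb):
    """Return the premise pair whose summed embedding best matches the goal,
    along with its similarity score."""
    heap = []
    for i in range(len(premise_embs)):
        for j in range(i + 1, len(premise_embs)):
            score = cosine(premise_embs[i] + premise_embs[j], goal_emb)
            # Negate the score: heapq is a min-heap, and we want the
            # highest-similarity pair to pop first.
            heapq.heappush(heap, (-score, (i, j)))
    neg_score, pair = heapq.heappop(heap)
    return pair, -neg_score

# Toy usage: premises 0 and 1 jointly point toward the goal direction.
embs = [np.array([1.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]
goal = np.array([1.0, 1.0, 0.1])
pair, score = plan_next_step(embs, goal)
print(pair)  # the pair whose sum best matches the goal
```

In a full prover, the selected pair would be handed to a generation model to produce a new conclusion, which is then embedded and added back to the premise pool for the next search iteration.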
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.