Entailer: Answering Questions with Faithful and Truthful Chains of
Reasoning
- URL: http://arxiv.org/abs/2210.12217v1
- Date: Fri, 21 Oct 2022 19:51:56 GMT
- Title: Entailer: Answering Questions with Faithful and Truthful Chains of
Reasoning
- Authors: Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Abstract summary: We show how a question-answering system can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning.
Our approach is to combine a trained backward-chaining model, capable of generating a set of premises entailing an answer hypothesis, with a verifier that checks that the model itself believes those premises.
To our knowledge, this is the first system to generate multistep chains that are both faithful (the answer follows from the reasoning) and truthful (the chain reflects the system's own internal beliefs).
- Score: 26.715242799194908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our goal is a question-answering (QA) system that can show how its answers
are implied by its own internal beliefs via a systematic chain of reasoning.
Such a capability would allow better understanding of why a model produced the
answer it did. Our approach is to recursively combine a trained
backward-chaining model, capable of generating a set of premises entailing an
answer hypothesis, with a verifier that checks that the model itself believes
those premises (and the entailment itself) through self-querying. To our
knowledge, this is the first system to generate multistep chains that are both
faithful (the answer follows from the reasoning) and truthful (the chain
reflects the system's own internal beliefs). In evaluation using two different
datasets, users judge that a majority (70%+) of generated chains clearly show
how an answer follows from a set of facts - substantially better than a
high-performance baseline - while preserving answer accuracy. By materializing
model beliefs that systematically support an answer, new opportunities arise
for understanding the model's system of belief, and diagnosing and correcting
its misunderstandings when an answer is wrong.
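The recursive procedure described above (backward-chain from an answer hypothesis to candidate premises, then verify by self-querying that the model believes both the premises and the entailment step) can be summarized in a short sketch. The code below is a hypothetical illustration only; `generate_premises`, `believes`, and `believes_entailment` are placeholder names standing in for the backward-chaining model and verifier, not the released Entailer interface.

```python
# Minimal sketch of an Entailer-style recursive prove loop.
# `model` is assumed to expose three operations (placeholder names, not the
# paper's released interface):
#   generate_premises(h)        -> list of premise statements meant to entail h
#   believes(h)                 -> model's confidence that statement h is true
#   believes_entailment(ps, h)  -> model's confidence that premises ps entail h

def prove(hypothesis, model, max_depth=2, threshold=0.5):
    """Return a tree of facts supporting `hypothesis`, or None if the model
    cannot back it with premises it also believes."""
    # Base case: the model directly believes the hypothesis.
    if model.believes(hypothesis) >= threshold:
        return {"fact": hypothesis, "support": []}
    if max_depth == 0:
        return None

    # Backward chaining: generate premises that would entail the hypothesis.
    premises = model.generate_premises(hypothesis)

    # Verification 1: the model must believe the entailment step itself.
    if model.believes_entailment(premises, hypothesis) < threshold:
        return None

    # Verification 2: each premise must itself be believed, possibly via a
    # further (recursive) layer of reasoning.
    support = []
    for premise in premises:
        sub = prove(premise, model, max_depth - 1, threshold)
        if sub is None:
            return None
        support.append(sub)

    return {"fact": hypothesis, "support": support}
```

In the paper's setting an answer candidate is expressed as a declarative hypothesis, and the tree returned for the winning hypothesis is the materialized chain of beliefs that a user can then inspect and, when the answer is wrong, diagnose and correct.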
Related papers
- Chain-of-Probe: Examining the Necessity and Accuracy of CoT Step-by-Step [81.50681925980135]
We propose a method to probe changes in the model's mind during its reasoning.
By analyzing patterns in mind change, we examine the correctness of the model's reasoning.
Our validation reveals that many responses, although correct in their final answer, contain errors in their reasoning process.
arXiv Detail & Related papers (2024-06-23T15:50:22Z)
- Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions [93.40614719648386]
Large language models (LLMs) are capable of answering knowledge-intensive complex questions with chain-of-thought (CoT) reasoning.
Recent works turn to retrieving external knowledge to augment CoT reasoning.
We propose a novel approach: Probabilistic Tree-of-thought Reasoning (ProbTree).
arXiv Detail & Related papers (2023-11-23T12:52:37Z)
- Language Models with Rationality [57.37201135072838]
While large language models (LLMs) are proficient at question-answering (QA), it is not always clear how (or even if) an answer follows from their latent "beliefs".
arXiv Detail & Related papers (2023-05-23T17:04:25Z)
- Answering Questions by Meta-Reasoning over Multiple Chains of Thought [53.55653437903948]
We introduce Multi-Chain Reasoning (MCR), an approach which prompts large language models to meta-reason over multiple chains of thought.
MCR examines different reasoning chains, mixes information between them and selects the most relevant facts in generating an explanation and predicting the answer.
arXiv Detail & Related papers (2023-04-25T17:27:37Z)
- ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness [67.49087159888298]
ReCEval is a framework that evaluates reasoning chains via two key properties: correctness and informativeness.
We show that ReCEval effectively identifies various error types and yields notable improvements compared to prior methods.
arXiv Detail & Related papers (2023-04-21T02:19:06Z)
- Measuring and Narrowing the Compositionality Gap in Language Models [116.5228850227024]
We measure how often models can correctly answer all sub-problems but not generate the overall solution.
We present a new method, self-ask, that further improves on chain of thought; a minimal prompting sketch of self-ask appears after this list.
arXiv Detail & Related papers (2022-10-07T06:50:23Z)
- Towards Teachable Reasoning Systems [29.59387051046722]
We develop a teachable reasoning system for question-answering (QA).
Our approach is three-fold: First, generated chains of reasoning show how answers are implied by the system's own internal beliefs.
Second, users can interact with the explanations to identify erroneous model beliefs and provide corrections.
Third, we augment the model with a dynamic memory of such corrections.
arXiv Detail & Related papers (2022-04-27T17:15:07Z)
- Think about it! Improving defeasible reasoning by first modeling the question scenario [35.6110036360506]
Defeasible reasoning is the mode of reasoning where conclusions can be overturned by taking into account new evidence.
We ask whether neural models can similarly benefit from envisioning the question scenario before answering a defeasible query.
Our system, CURIOUS, achieves a new state-of-the-art on three different defeasible reasoning datasets.
arXiv Detail & Related papers (2021-10-24T04:13:52Z)
- Robustifying Multi-hop QA through Pseudo-Evidentiality Training [28.584236042324896]
We study the bias problem of multi-hop question answering models: answering correctly without correct reasoning.
We propose a new approach to learn evidentiality, deciding whether an answer prediction is supported by correct evidence.
arXiv Detail & Related papers (2021-07-07T14:15:14Z)
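As referenced in the self-ask entry above ("Measuring and Narrowing the Compositionality Gap in Language Models"), the sketch below shows the general shape of self-ask prompting: the model poses and answers its own follow-up questions before committing to a final answer. The few-shot wording is paraphrased, and `ask_llm` / `answer_subquestion` are hypothetical placeholders for a completion call and a sub-question answerer, not an interface from that paper.

```python
# Minimal sketch of self-ask style prompting. `ask_llm(prompt, stop)` and
# `answer_subquestion(q)` are hypothetical stand-ins; the paper also combines
# this decomposition with a search engine for the intermediate answers.

FEW_SHOT = """Question: Who was president of the United States when superconductivity was discovered?
Are follow up questions needed here: Yes.
Follow up: When was superconductivity discovered?
Intermediate answer: Superconductivity was discovered in 1911.
Follow up: Who was president of the United States in 1911?
Intermediate answer: William Howard Taft was president of the United States in 1911.
So the final answer is: William Howard Taft.

Question: {question}
Are follow up questions needed here:"""


def self_ask(question, ask_llm, answer_subquestion, max_steps=5):
    """Decompose a compositional question into sub-questions before answering."""
    context = FEW_SHOT.format(question=question)
    for _ in range(max_steps):
        # The model either poses the next follow-up question or gives the
        # final answer.
        step = ask_llm(context, stop=["Intermediate answer:", "\nQuestion:"])
        context += step
        if "So the final answer is:" in step:
            return step.split("So the final answer is:")[-1].strip()
        # Answer the follow-up question externally and feed it back in.
        follow_up = step.split("Follow up:")[-1].strip()
        context += "Intermediate answer: " + answer_subquestion(follow_up) + "\n"
    return None
```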