Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
- URL: http://arxiv.org/abs/2305.03268v1
- Date: Fri, 5 May 2023 03:49:14 GMT
- Title: Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
- Authors: Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Abstract summary: Large language models (LLMs) have become the norm in NLP, demonstrating good performance in generation and reasoning tasks.
One of their most serious shortcomings is a lack of factual correctness.
Generating non-factual text not only lowers performance but also undermines the trust in and validity of their applications.
- Score: 26.7264686036634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As large language models (LLMs) have become the norm in NLP,
demonstrating good performance in generation and reasoning tasks, one of their
most serious shortcomings is a lack of factual correctness. Generating
non-factual text not only lowers performance but also undermines the trust in
and validity of their applications. Chain-of-Thought (CoT) prompting improves
trust and model performance on complex reasoning tasks by generating
interpretable reasoning chains, but it still suffers from factuality concerns
in knowledge-intensive tasks. In this paper, we propose the Verify-and-Edit
framework for CoT prompting, which seeks to increase prediction factuality by
post-editing reasoning chains according to external knowledge. Built on top of
GPT-3, our framework leads to accuracy improvements on multiple open-domain
question-answering tasks.
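The abstract states the idea only at a high level: post-edit CoT rationales with external knowledge when the model seems uncertain. Below is a minimal Python sketch of such a loop; `generate` (an LLM completion call), `retrieve` (an external knowledge lookup), the prompt wording, and the 0.6 consistency threshold are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

def sample_chains(question, generate, n=5):
    """Sample n chain-of-thought rationales and final answers for one question."""
    chains = []
    for _ in range(n):
        text = generate(f"Q: {question}\nA: Let's think step by step.")
        # Assumes the model ends its rationale with "So the answer is <answer>."
        rationale, _, answer = text.rpartition("So the answer is")
        chains.append((rationale.strip(), answer.strip(" .")))
    return chains

def verify_and_edit(question, generate, retrieve, n=5, threshold=0.6):
    chains = sample_chains(question, generate, n)
    votes = Counter(answer for _, answer in chains)
    top_answer, count = votes.most_common(1)[0]

    # High self-consistency: keep the majority answer without editing.
    if count / n >= threshold:
        return top_answer

    # Low consistency: verify each rationale sentence against retrieved knowledge.
    rationale = chains[0][0]
    edited = []
    for sentence in rationale.split(". "):
        if not sentence.strip():
            continue
        query = generate(f"Write a question that verifies this statement: {sentence}")
        evidence = retrieve(query)  # e.g. Wikipedia or web-search passages
        fixed = generate(
            f"Evidence: {evidence}\n"
            f"Rewrite the statement so that it agrees with the evidence: {sentence}"
        )
        edited.append(fixed.strip())

    # Re-derive the answer from the edited, knowledge-grounded rationale.
    new_rationale = " ".join(edited)
    return generate(
        f"Q: {question}\nRationale: {new_rationale}\nSo the answer is"
    ).strip(" .")
```

With `generate` bound to any text-completion API and `retrieve` to a search backend, calling `verify_and_edit(question, generate, retrieve)` returns either the self-consistent majority answer or one re-derived from the edited rationale.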
Related papers
- ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification [53.80183105328448]
Refine via Intrinsic Self-Verification (ReVISE) is an efficient framework that enables LLMs to self-correct their outputs through self-verification.
Our experiments on various reasoning tasks demonstrate that ReVISE achieves efficient self-correction and significantly improves reasoning performance.
arXiv Detail & Related papers (2025-02-20T13:50:02Z) - STRIVE: Structured Reasoning for Self-Improvement in Claim Verification [21.00145637520767]
We propose STRIVE: Structured Reasoning for Self-Improved Verification.
Our method introduces a structured reasoning design with Claim Decomposition, Entity Analysis, and Evidence Grounding Verification.
It is then applied to generate reasoning chains for all training examples, selecting only those that are correct and structurally sound for subsequent self-improvement training.
arXiv Detail & Related papers (2025-02-17T16:07:07Z) - Aligning Large Language Models for Faithful Integrity Against Opposing Argument [71.33552795870544]
Large Language Models (LLMs) have demonstrated impressive capabilities in complex reasoning tasks.
However, they can be easily misled by unfaithful arguments during conversations, even when their original statements are correct.
We propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation.
arXiv Detail & Related papers (2025-01-02T16:38:21Z) - Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation [38.80878966092216]
Recent Retrieval Augmented Generation (RAG) aims to enhance Large Language Models (LLMs) by grounding their outputs in retrieved external knowledge.
We propose the chain-of-verification (CoV-RAG) to enhance the external retrieval correctness and internal generation consistency.
arXiv Detail & Related papers (2024-10-08T08:34:54Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs [50.259001311894295]
We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
arXiv Detail & Related papers (2024-07-06T07:19:30Z) - Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence.
LLMs leverage chain-of-thought (CoT) reasoning techniques, which prompt them to formulate intermediate steps en route to deriving an answer.
Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z) - Ladder-of-Thought: Using Knowledge as Steps to Elevate Stance Detection [73.31406286956535]
We introduce the Ladder-of-Thought (LoT) for the stance detection task.
LoT directs the small LMs to assimilate high-quality external knowledge, refining the intermediate rationales produced.
Our empirical evaluations underscore LoT's efficacy, marking a 16% improvement over GPT-3.5 and a 10% enhancement compared to GPT-3.5 with CoT on the stance detection task.
arXiv Detail & Related papers (2023-08-31T14:31:48Z) - Question Decomposition Improves the Faithfulness of Model-Generated Reasoning [23.34325378824462]
It is difficult to verify the correctness and safety of the behavior of large language models (LLMs).
One approach is to prompt LLMs to externalize their reasoning, by having them generate step-by-step reasoning as they answer a question.
This approach relies on the stated reasoning faithfully reflecting the model's actual reasoning, which is not always the case.
Decomposition-based methods achieve strong performance on question-answering tasks, sometimes approaching that of CoT.
arXiv Detail & Related papers (2023-07-17T00:54:10Z) - Boosting Language Models Reasoning with Chain-of-Knowledge Prompting [18.326858925174605]
Chain-of-Knowledge (CoK) prompting aims at eliciting explicit pieces of knowledge evidence in the form of structured triples; a rough illustration of such a prompt is sketched after this list.
Benefiting from CoK, we additionally introduce an F2-Verification method to estimate the reliability of the reasoning chains.
Extensive experiments demonstrate that our method can further improve the performance of commonsense, factual, symbolic, and arithmetic reasoning tasks.
arXiv Detail & Related papers (2023-06-10T12:42:36Z)
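As a rough illustration of the triple-style prompting described in the Chain-of-Knowledge entry above, the following sketch prepends a few-shot exemplar that asks the model to list (subject, relation, object) evidence triples before answering. The exemplar question, triples, and wording are invented for illustration and are not taken from the paper.

```python
# Hedged illustration of a Chain-of-Knowledge style prompt: the exemplar asks the
# model to emit (subject, relation, object) evidence triples before answering.
COK_EXEMPLAR = """\
Question: Is the Mona Lisa displayed in France?
Evidence triples:
(Mona Lisa, created by, Leonardo da Vinci)
(Mona Lisa, housed in, Louvre Museum)
(Louvre Museum, located in, Paris)
Answer: yes
"""

def build_cok_prompt(question: str) -> str:
    """Prepend the exemplar so the model emits evidence triples and then an answer."""
    return f"{COK_EXEMPLAR}\nQuestion: {question}\nEvidence triples:"
```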
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.