Large Language Models have Intrinsic Self-Correction Ability
- URL: http://arxiv.org/abs/2406.15673v1
- Date: Fri, 21 Jun 2024 22:29:40 GMT
- Title: Large Language Models have Intrinsic Self-Correction Ability
- Authors: Dancheng Liu, Amir Nassereldine, Ziming Yang, Chenhui Xu, Yuting Hu, Jiajie Li, Utkarsh Kumar, Changjae Lee, Jinjun Xiong,
- Abstract summary: Large language models suffer from hallucinations that will cause performance degradation.
One promising solution to improve the LLMs' performance is to ask LLMs to revise their answer after generation.
In intrinsic self-correction is considered a promising direction because it does not utilize external knowledge.
- Score: 16.831123666582755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have attracted significant attention for their remarkable abilities in various natural language processing tasks, but they suffer from hallucinations that will cause performance degradation. One promising solution to improve the LLMs' performance is to ask LLMs to revise their answer after generation, a technique known as self-correction. Among the two types of self-correction, intrinsic self-correction is considered a promising direction because it does not utilize external knowledge. However, recent works doubt the validity of LLM's ability to conduct intrinsic self-correction. In this paper, we present a novel perspective on the intrinsic self-correction capabilities of LLMs through theoretical analyses and empirical experiments. In addition, we identify two critical factors for successful self-correction: zero temperature and fair prompts. Leveraging these factors, we demonstrate that intrinsic self-correction ability is exhibited across multiple existing LLMs. Our findings offer insights into the fundamental theories underlying the self-correction behavior of LLMs and remark on the importance of unbiased prompts and zero temperature settings in harnessing their full potential.
Related papers
- Is Moral Self-correction An Innate Capability of Large Language Models? A Mechanistic Analysis to Self-correction [7.077348519490594]
We aim to answer two fundamental questions for moral self-correction.
We examine how different self-correction components interact to intervene the embedded morality within hidden states.
We propose a validation framework, self-distinguish, that requires effective self-correction.
arXiv Detail & Related papers (2024-10-27T16:52:21Z) - Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks [43.96835245022083]
Self-correction that instructs models to refine their outputs presents a promising solution to this issue.
This study investigates the self-correction capabilities of Vision-Language Models during both inference and fine-tuning stages.
arXiv Detail & Related papers (2024-10-05T06:28:54Z) - A Theoretical Understanding of Self-Correction through In-context Alignment [51.622068973630796]
Large language models (LLMs) are capable of improving their abilities purely by self-correction.
We show that when LLMs give relatively accurate self-examinations as rewards, they are capable of refining responses in an in-context way.
Inspired by these findings, we also illustrate applications of self-correction, such as defending against LLM jailbreaks.
arXiv Detail & Related papers (2024-05-28T22:33:02Z) - Small Language Models Need Strong Verifiers to Self-Correct Reasoning [69.94251699982388]
Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs)
This work explores whether small (= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs.
arXiv Detail & Related papers (2024-04-26T03:41:28Z) - Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models [23.42725642076256]
Large Language Models (LLMs) have catalyzed an increasing interest in their self-correction capabilities.
This paper presents a comprehensive investigation into the intrinsic self-correction of LLMs.
We develop an "If-or-Else" (IoE) prompting framework, designed to guide LLMs in assessing their own "confidence"
arXiv Detail & Related papers (2024-02-19T21:38:02Z) - Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation [71.91287418249688]
Large language models (LLMs) often struggle with factual inaccuracies, even when they hold relevant knowledge.
We leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality.
We show that the proposed self-alignment approach substantially enhances factual accuracy over Llama family models across three key knowledge-intensive tasks.
arXiv Detail & Related papers (2024-02-14T15:52:42Z) - The ART of LLM Refinement: Ask, Refine, and Trust [85.75059530612882]
We propose a reasoning with refinement objective called ART: Ask, Refine, and Trust.
It asks necessary questions to decide when an LLM should refine its output.
It achieves a performance gain of +5 points over self-refinement baselines.
arXiv Detail & Related papers (2023-11-14T07:26:32Z) - Large Language Models Cannot Self-Correct Reasoning Yet [78.16697476530994]
Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities.
Concerns persist regarding the accuracy and appropriateness of their generated content.
A contemporary methodology, self-correction, has been proposed as a remedy to these issues.
arXiv Detail & Related papers (2023-10-03T04:56:12Z) - Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.