Fugu-MT Paper Translation (Abstract): RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought

Paper overview: RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought

arxiv url: http://arxiv.org/abs/2305.11499v1
Date: Fri, 19 May 2023 08:02:52 GMT
Status: Translation complete
Last updated in system: 2023-05-22 15:35:04.446028
Title: RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought
Title (reference translation): RCOT: Detecting and Suppressing Inconsistency in Reasoning by Reversing the Chain of Thought
Authors: Tianci Xue, Ziqi Wang, Zhenhailong Wang, Chi Han, Pengfei Yu, Heng Ji
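Abstract summary: Large language models (LLMs) have achieved promising performance on arithmetic reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting. Existing methods use coarse-grained feedback to improve factual consistency. RCoT (Reversing Chain-of-Thought) is a method that automatically detects and rectifies factual inconsistency, improving the reasoning abilities of LLMs.
Reference score (independently computed attention score): 46.016590978657995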
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have achieved promising performance on arithmetic reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting. However, LLMs face challenges in maintaining factual consistency during reasoning, exhibiting tendencies to condition overlooking, question misinterpretation, and condition hallucination over given problems. Existing methods use coarse-grained feedback (e.g., whether the answer is correct) to improve factual consistency. In this work, we propose RCoT (Reversing Chain-of-Thought), a novel method to improve LLMs' reasoning abilities by automatically detecting and rectifying factual inconsistency in LLMs' generated solutions. To detect factual inconsistency, RCoT first asks LLMs to reconstruct the problem based on generated solutions. Then fine-grained comparisons between the original problem and the reconstructed problem expose the factual inconsistency in the original solutions. To rectify the solution, RCoT formulates detected factual inconsistency into fine-grained feedback to guide LLMs in revising solutions. Experimental results demonstrate consistent improvements of RCoT over standard CoT across seven arithmetic datasets. Moreover, we find that manually written fine-grained feedback can dramatically improve LLMs' reasoning abilities (e.g., ChatGPT reaches 94.6% accuracy on GSM8K), encouraging the community to further explore the fine-grained feedback generation methods.
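Abstract (reference translation): Large language models (LLMs) have achieved promising performance on arithmetic reasoning tasks by adopting step-by-step chain-of-thought (CoT) prompting. However, LLMs struggle to maintain factual consistency during reasoning: they tend to overlook conditions, misinterpret questions, and hallucinate conditions for the given problem. Existing methods use coarse-grained feedback (for example, whether the answer is correct) to improve factual consistency. This work proposes RCoT (Reversing Chain-of-Thought), a new method that improves the reasoning abilities of LLMs by automatically detecting and rectifying factual inconsistency in LLM-generated solutions. To detect factual inconsistency, RCoT first asks the LLM to reconstruct the problem from the generated solution. A fine-grained comparison between the original problem and the reconstructed problem then exposes the factual inconsistency in the original solution. To rectify the solution, RCoT turns the detected factual inconsistency into fine-grained feedback that guides the LLM in revising the solution. Experiments show consistent improvements over standard CoT across seven arithmetic datasets. Furthermore, manually written fine-grained feedback dramatically improves the reasoning abilities of LLMs (for example, ChatGPT reaches 94.6% accuracy on GSM8K), which encourages further exploration of fine-grained feedback generation methods.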
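
The detect-and-rectify loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable, the prompt wording, the `split_conditions` helper, and the `max_rounds` parameter are all hypothetical, and only overlooked conditions are checked here (the abstract also names question misinterpretation and condition hallucination).

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def split_conditions(problem: str) -> List[str]:
    """Naively split a problem into sentence-level conditions.

    Assumption: the abstract only says the comparison is fine-grained;
    sentence splitting is a stand-in and breaks on decimals such as "3.5".
    """
    return [s.strip() for s in problem.replace("?", ".").split(".") if s.strip()]


def rcot(problem: str, llm: LLM, max_rounds: int = 1) -> str:
    """Sketch of reconstruct -> compare -> feedback -> revise."""
    solution = llm(f"Solve step by step:\n{problem}")
    for _ in range(max_rounds):
        # 1. Reconstruct the problem from the generated solution alone.
        reconstructed = llm(f"Reconstruct the problem from this solution:\n{solution}")

        # 2. Fine-grained comparison: ask about each original condition in turn.
        missing = []
        for cond in split_conditions(problem):
            reply = llm(
                "Is this condition stated in the text? Answer yes or no.\n"
                f"Condition: {cond}\nText: {reconstructed}"
            )
            if reply.strip().lower().startswith("no"):
                missing.append(cond)

        if not missing:
            break  # no inconsistency detected, keep the current solution

        # 3. Turn the detected inconsistencies into feedback and ask for a revision.
        feedback = "\n".join(f"- You overlooked: {c}" for c in missing)
        solution = llm(
            f"Problem:\n{problem}\nYour solution:\n{solution}\n"
            f"Feedback:\n{feedback}\nRevise the solution."
        )
    return solution
```

To try it, pass any function that sends a prompt to a model and returns its text reply as `llm`; a scripted stub is enough to exercise the control flow without a model.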

Related papers