Recursive Chain-of-Feedback Prevents Performance Degradation from
Redundant Prompting
- URL: http://arxiv.org/abs/2402.02648v2
- Date: Fri, 1 Mar 2024 10:46:01 GMT
- Title: Recursive Chain-of-Feedback Prevents Performance Degradation from
Redundant Prompting
- Authors: Jinwoo Ahn, Kyuseung Shin
- Abstract summary: This paper studies such repetitive behavior and its effect by defining a novel setting, Chain-of-Feedback (CoF).
To alleviate these troubles, we propose a novel method, Recursive Chain-of-Feedback (R-CoF).
- Score: 0.4662017507844857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) frequently struggle with complex reasoning
tasks, failing to construct logically sound steps towards the solution. In
response to this behavior, users often try prompting the LLMs repeatedly in
hopes of reaching a better response. This paper studies such repetitive
behavior and its effect by defining a novel setting, Chain-of-Feedback (CoF).
The setting takes questions that require multi-step reasoning as input. After
each response, we repeatedly prompt the model with meaningless feedback (e.g.,
'make another attempt'), requesting additional trials. Surprisingly, our preliminary results
show that repeated meaningless feedback gradually decreases the quality of the
responses, eventually leading to a larger deviation from the intended outcome.
To alleviate these troubles, we propose a novel method, Recursive
Chain-of-Feedback (R-CoF). Following the logic of recursion in computer
science, R-CoF recursively revises the initially incorrect response by breaking
down each incorrect reasoning step into smaller individual problems. Our
preliminary results show that the majority of questions that LLMs initially fail
to answer correctly can be resolved using R-CoF, without any sample data
outlining the logical process.
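
To make the abstract's two settings concrete, the short Python sketch below mocks up a Chain-of-Feedback loop (repeated meaningless feedback) and a recursive R-CoF revision procedure. It is a minimal illustration only: the callbacks query_llm and verify_step, the prompt wording, and the max_depth cutoff are hypothetical placeholders, not the paper's implementation.

# Minimal sketch of Chain-of-Feedback (CoF) and Recursive Chain-of-Feedback
# (R-CoF), following the abstract above. query_llm and verify_step are
# hypothetical placeholders standing in for an LLM call and a correctness
# check; they are not from the paper.

from typing import Callable, List

LLM = Callable[[str], str]  # prompt -> response


def chain_of_feedback(query_llm: LLM, question: str, n_trials: int) -> List[str]:
    """CoF setting: after the first answer, keep sending meaningless
    feedback ('make another attempt') and collect every retry. The
    abstract reports that answer quality tends to degrade across trials."""
    responses = [query_llm(question)]
    for _ in range(n_trials):
        followup = (
            f"Question: {question}\n"
            f"Your previous answer: {responses[-1]}\n"
            "Make another attempt."  # meaningless feedback
        )
        responses.append(query_llm(followup))
    return responses


def recursive_cof(query_llm: LLM,
                  verify_step: Callable[[str], bool],
                  question: str,
                  max_depth: int = 3) -> str:
    """R-CoF sketch: ask for step-by-step reasoning, locate an incorrect
    step, re-pose that step as a smaller standalone problem, solve it
    recursively, then ask the model to revise the original solution with
    the corrected step substituted in."""
    answer = query_llm(f"Solve step by step:\n{question}")
    if max_depth == 0:
        return answer
    steps = [s for s in answer.split("\n") if s.strip()]
    for i, step in enumerate(steps):
        if not verify_step(step):
            # Treat the faulty step as a smaller sub-problem and recurse on it.
            sub_question = f"Solve this single step correctly:\n{step}"
            fixed_step = recursive_cof(query_llm, verify_step,
                                       sub_question, max_depth - 1)
            revision_prompt = (
                f"Original question:\n{question}\n\n"
                f"Your earlier solution:\n{answer}\n\n"
                f"Step {i + 1} was wrong. A corrected version is:\n{fixed_step}\n\n"
                "Revise the full solution using this corrected step."
            )
            return query_llm(revision_prompt)
    return answer  # no incorrect step found

The essential idea mirrored here is the one the abstract states: an incorrect reasoning step is isolated, re-posed as a smaller problem, solved recursively, and substituted back into a revised solution. How the faulty step is actually identified is left abstract behind verify_step, since the abstract does not specify the mechanism.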
Related papers
- Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think [51.0691253204425]
We analyze intermediate reasoning steps, termed subthoughts, asking among other questions whether the final answer reliably represents the model's optimal conclusion.
Our approach involves segmenting a reasoning trace into sequential subthoughts based on linguistic cues.
We find that aggregating these answers by selecting the most frequent one (the mode) often yields significantly higher accuracy compared to relying solely on the answer derived from the original complete trace.
arXiv Detail & Related papers (2025-04-29T12:39:07Z)
- Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering [78.89231943329885]
One of the most widely used tasks to evaluate Large Language Models (LLMs) is Multiple-Choice Question Answering (MCQA)
In this work, we shed light on the inconsistencies of MCQA evaluation strategies, which can lead to inaccurate and misleading model comparisons.
arXiv Detail & Related papers (2025-03-19T08:45:03Z)
- Toward Adaptive Reasoning in Large Language Models with Thought Rollback [33.714789952452094]
This paper proposes a new reasoning framework, called Thought Rollback (TR)
TR allows large language models (LLMs) to adaptively build a thought structure while maintaining effective reasoning toward problem-solving under "hallucinations".
arXiv Detail & Related papers (2024-12-27T16:02:34Z)
- Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback [8.601283886845664]
Reinforcement learning from human feedback (RLHF) aligns Large language models (LLMs) with human intentions and values.
Despite its effectiveness and popularity, RLHF is prone to biased local optimization.
We propose a novel sequence-to-sequence (seq2seq) reward modeling method.
arXiv Detail & Related papers (2024-08-30T16:14:35Z)
- FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering [26.398873686905063]
Large Language Models (LLMs) with chain-of-thought (CoT) prompting have demonstrated impressive abilities on simple natural language inference tasks.
We propose a prompting method, Finite State Machine (FSM), to enhance the reasoning capabilities of LLMs for complex tasks.
arXiv Detail & Related papers (2024-07-03T10:01:01Z)
- Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models [84.15513004135576]
Current research enhances the reasoning performance of Large Language Models (LLMs) by sampling multiple reasoning chains and ensembling based on the answer frequency.
This approach fails in scenarios where the correct answers are in the minority.
We introduce AoR, a hierarchical reasoning aggregation framework that selects answers based on the evaluation of reasoning chains.
arXiv Detail & Related papers (2024-05-21T17:12:19Z)
- Re-Ex: Revising after Explanation Reduces the Factual Errors in LLM Responses [9.956253757863145]
We propose Re-Ex, a method for post-editing responses generated by large language models (LLMs).
Re-Ex introduces a novel reasoning step dubbed the factual error explanation step.
In addition to the explanation step, Re-Ex also incorporates new prompting techniques to reduce the token count and inference time required for the response revision process.
arXiv Detail & Related papers (2024-02-27T00:22:18Z)
- Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves [57.974103113675795]
We present a method named 'Rephrase and Respond' (RaR), which allows Large Language Models to rephrase and expand questions posed by humans.
RaR serves as a simple yet effective prompting method for improving performance.
We show that RaR is complementary to the popular Chain-of-Thought (CoT) methods, both theoretically and empirically.
arXiv Detail & Related papers (2023-11-07T18:43:34Z)
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs)
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
- RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought [56.558892336235914]
Reversing Chain-of-Thought (RCoT) is a novel method to improve large language models' reasoning abilities.
RCoT automatically detects and rectifies factual inconsistencies in generated solutions.
We show that manually written fine-grained feedback can dramatically improve LLMs' reasoning abilities.
arXiv Detail & Related papers (2023-05-19T08:02:52Z)
- Answering Questions by Meta-Reasoning over Multiple Chains of Thought [53.55653437903948]
We introduce Multi-Chain Reasoning (MCR), an approach which prompts large language models to meta-reason over multiple chains of thought.
MCR examines different reasoning chains, mixes information between them and selects the most relevant facts in generating an explanation and predicting the answer.
arXiv Detail & Related papers (2023-04-25T17:27:37Z)
- Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations.
We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.