Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought
- URL: http://arxiv.org/abs/2402.06918v2
- Date: Wed, 26 Jun 2024 05:47:52 GMT
- Title: Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought
- Authors: Zhen-Yu Zhang, Siwei Han, Huaxiu Yao, Gang Niu, Masashi Sugiyama,
- Abstract summary: Chain-of-thoughts (CoT) methods were proposed to guide large language models to reason step-by-step, enabling problem solving from simple to complex.
The evaluation from the large language model (LLMs) is typically noisy and unreliable, potentially misleading the generation process in selecting promising intermediate thoughts.
In this paper, motivated by Vapnik's principle, we use pairwise-comparison evaluation instead of point-wise scoring to search for promising intermediate thoughts.
- Score: 70.30423016640749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To improve the ability of the large language model (LLMs) to tackle complex reasoning problems, chain-of-thoughts (CoT) methods were proposed to guide LLMs to reason step-by-step, enabling problem solving from simple to complex. State-of-the-art methods for generating such a chain involve interactive collaboration, where the learner generates candidate intermediate thoughts, evaluated by the LLM, guiding the generation of subsequent thoughts. However, a widespread yet understudied problem is that the evaluation from the LLM is typically noisy and unreliable, potentially misleading the generation process in selecting promising intermediate thoughts. In this paper, motivated by Vapnik's principle, we use pairwise-comparison evaluation instead of point-wise scoring to search for promising intermediate thoughts with the noisy feedback from the LLM. In each round, we randomly pair intermediate thoughts and directly prompt the LLM to select the more promising one from each pair, allowing us to identify the most promising thoughts through an iterative process. To further alleviate the noise in the comparison, we incorporate techniques from ensemble learning and dueling bandits, proposing two variants of the algorithm. Experiments on three real-world tasks demonstrate the effectiveness of our proposed algorithm and verify the rationale of the pairwise comparison mechanism.
Related papers
- Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding [1.3479499607624648]
Speculative decoding addresses bottleneck by introducing a two-stage framework: drafting and verification.
A smaller, efficient model generates a preliminary draft, which is then refined by a larger, more sophisticated model.
This paper provides a comprehensive survey of speculative decoding methods, categorizing them into draft-centric and model-centric approaches.
arXiv Detail & Related papers (2024-11-20T09:46:30Z) - LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Iteration of Thought: Leveraging Inner Dialogue for Autonomous Large Language Model Reasoning [0.0]
Iterative human engagement is a common and effective means of leveraging the advanced language processing power of large language models (LLMs)
We propose the Iteration of Thought (IoT) framework for enhancing LLM responses by generating "thought"-provoking prompts.
Unlike static or semi-static approaches, IoT adapts its reasoning path dynamically, based on evolving context.
arXiv Detail & Related papers (2024-09-19T09:44:17Z) - Sentiment Analysis through LLM Negotiations [58.67939611291001]
A standard paradigm for sentiment analysis is to rely on a singular LLM and makes the decision in a single round.
This paper introduces a multi-LLM negotiation framework for sentiment analysis.
arXiv Detail & Related papers (2023-11-03T12:35:29Z) - R$^3$ Prompting: Review, Rephrase and Resolve for Chain-of-Thought
Reasoning in Large Language Models under Noisy Context [12.475979274233458]
We propose a novel prompting method, namely R$3$ prompting, for Chain-of-Thought (CoT) reasoning under noisy context.
Our experiments show that R$3$ prompting significantly outperforms existing CoT prompting methods on five reasoning tasks under noisy context.
arXiv Detail & Related papers (2023-10-25T10:34:02Z) - Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts [65.15322403136238]
We propose XoT, an integrated problem solving framework by prompting LLMs with diverse reasoning thoughts.
For each question, XoT always begins with selecting the most suitable method then executes each method iteratively.
Within each iteration, XoT actively checks the validity of the generated answer and incorporates the feedback from external executors.
arXiv Detail & Related papers (2023-10-23T07:02:20Z) - Eliminating Reasoning via Inferring with Planning: A New Framework to
Guide LLMs' Non-linear Thinking [40.22335733384235]
Chain-of-Thought(CoT) prompting and its variants explore equipping large language models with high-level reasoning abilities.
We propose textbfInferential textbfExclusion textbfPrompting (IEP), a novel prompting that combines the principles of elimination and inference.
arXiv Detail & Related papers (2023-10-18T21:42:16Z) - Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs [80.74263278847063]
The integration of retrieved passages and large language models (LLMs) has significantly contributed to improving open-domain question answering.
This paper investigates different methods of combining retrieved passages with LLMs to enhance answer generation.
arXiv Detail & Related papers (2023-08-24T05:26:54Z) - Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models [17.059322033670124]
We propose a novel strategy that propels Large Language Models through algorithmic reasoning pathways.
Our results suggest that instructing an LLM using an algorithm can lead to performance surpassing that of the algorithm itself.
arXiv Detail & Related papers (2023-08-20T22:36:23Z) - Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution.
Our framework encourages divergent thinking in LLMs which would be helpful for tasks that require deep levels of contemplation.
arXiv Detail & Related papers (2023-05-30T15:25:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.