Nash CoT: Multi-Path Inference with Preference Equilibrium
- URL: http://arxiv.org/abs/2407.07099v1
- Date: Tue, 18 Jun 2024 07:46:13 GMT
- Title: Nash CoT: Multi-Path Inference with Preference Equilibrium
- Authors: Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang,
- Abstract summary: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs)
We conceptualize Symbolic language decoding as a preference consensus game, constructing a bi-player gaming system within each local path, and introduce Nash Chain-of-Thought (Nash CoT)
We achieve comparable or improved performance compared to self-consistency while using fewer inference paths on various inference tasks, including Arabic reasoning, Commonsense Question answering, and inference.
- Score: 40.50811042423615
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing out as a concise yet competitive approach. While self-consistency has indeed led to the improvements in LLM inference, the use of multi-path inference also escalates deployment costs. Therefore, maintaining the performance benefits of self-consistency inherited from multi-path inference while reducing the inference costs holds significant value. In this research, we conceptualize language decoding as a preference consensus game, constructing a bi-player gaming system within each local path, and introduce Nash Chain-of-Thought (Nash CoT). Specifically, for a given question, we leverage LLM to autonomously select the contextually relevant template and generate outputs guided by this template, aiming to reach Nash Equilibrium alongside normal generation in each path. This approach allows us to achieve comparable or improved performance compared to self-consistency while using fewer inference paths on various inference tasks, including Arabic reasoning, Commonsense Question answering, and Symbolic inference.
Related papers
- Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models [63.36637269634553]
We present a novel method of further improving performance by requiring models to compare multiple reasoning chains.
We find that instruction tuning on DCoT datasets boosts the performance of even smaller, and therefore more accessible, language models.
arXiv Detail & Related papers (2024-07-03T15:01:18Z) - Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs [37.147529569445396]
Tree-of-thought (ToT) method employs tree-searching to extensively explore the reasoning space and find better reasoning paths that CoT decoding might overlook.
Fine-tuning language models (LLMs) leveraging the search tree constructed by ToT allows CoT to achieve similar or better performance.
This is achieved through Chain of Preference Optimization (CPO), where LLMs are fine-tuned to align each step of the CoT reasoning paths with those of ToT.
arXiv Detail & Related papers (2024-06-13T14:07:02Z) - Universal Self-Consistency for Large Language Model Generation [72.6761480346095]
Self-consistency with chain-of-thought prompting (CoT) has demonstrated remarkable performance gains on challenging tasks.
We propose Universal Self-Consistency (USC), which leverages large language models (LLMs) to select the most consistent answer.
arXiv Detail & Related papers (2023-11-29T02:07:09Z) - Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context
Reasoning with Language Models [58.41943058963672]
We propose a new inference framework called Recursion of Thought (RoT)
RoT introduces several special tokens that the models can output to trigger context-related operations.
Experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems.
arXiv Detail & Related papers (2023-06-12T06:34:16Z) - Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations.
We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z) - Multimodal Chain-of-Thought Reasoning in Language Models [94.70184390935661]
We propose Multimodal-CoT that incorporates language (text) and vision (images) modalities into a two-stage framework.
Experimental results on ScienceQA and A-OKVQA benchmark datasets show the effectiveness of our proposed approach.
arXiv Detail & Related papers (2023-02-02T07:51:19Z) - ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (system
description) [0.4893345190925177]
We describe an implementation of gradient boosting and neural guidance of saturation-style automated theorem provers.
For the gradient-boosting guidance, we manually create abstracted features by considering arity-based encodings of formulas.
For the neural guidance, we use symbol-independent graph neural networks (GNNs) and their embedding of the terms and clauses.
arXiv Detail & Related papers (2020-02-13T09:44:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.