Related papers: Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

URL: http://arxiv.org/abs/2212.10001v2
Date: Thu, 1 Jun 2023 05:38:00 GMT
Title: Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
Authors: Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun
Abstract summary: Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs) We show that CoT reasoning is possible even with invalid demonstrations.
Score: 82.84696222087396
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs). CoT explicitly encourages the LLM to generate intermediate rationales for solving a problem, by providing a series of reasoning steps in the demonstrations. Despite its success, there is still little understanding of what makes CoT prompting effective and which aspects of the demonstrated reasoning steps contribute to its performance. In this paper, we show that CoT reasoning is possible even with invalid demonstrations - prompting with invalid reasoning steps can achieve over 80-90% of the performance obtained using CoT under various metrics, while still generating coherent lines of reasoning during inference. Further experiments show that other aspects of the rationales, such as being relevant to the query and correctly ordering the reasoning steps, are much more important for effective CoT reasoning. Overall, these findings both deepen our understanding of CoT prompting, and open up new questions regarding LLMs' capability to learn to reason in context.

Related papers

The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning [39.613595533503144]
Chain-of-Thought (CoT) prompting has been widely recognized for its ability to enhance reasoning capabilities in large language models. We show that CoT consistently underperforms direct answering across varying model scales and benchmark complexities. Our analysis uncovers a fundamental explicit-implicit duality driving CoT's performance in pattern-based ICL.
arXiv Detail & Related papers (2025-04-07T13:51:06Z)
Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering [59.34894142132706]
Existing work finds that the capability of long CoT reasoning can be efficiently elicited by tuning on only a few examples. This motivates us to investigate whether long CoT reasoning is a general capability for LLMs. We propose GLoRE, a novel representation engineering method to unleash the general long CoT reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-03-14T11:30:37Z)
Rethinking Thinking Tokens: Understanding Why They Underperform in Practice [6.102559098873098]
Thinking Tokens (TT) have been proposed as an unsupervised method to facilitate reasoning in language models. We show that TTs marginally improves performance and consistently underperforms compared to Chain-of-Thought (CoT) reasoning.
arXiv Detail & Related papers (2024-11-18T08:34:38Z)
Markov Chain of Thought for Efficient Mathematical Reasoning [10.678633785012691]
Chain of Thought (CoT) of multi-step benefits from the logical structure of the reasoning steps and task-specific actions. We conceptualize the standard multi-step CoT as a novel Markov Chain of Thought (MCoT)
arXiv Detail & Related papers (2024-10-23T07:53:29Z)
A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning [48.51969964676017]
Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models. We propose a Read-and-Control approach for controlling the accuracy of CoT.
arXiv Detail & Related papers (2024-06-18T04:07:13Z)
The Impact of Reasoning Step Length on Large Language Models [40.546685248243534]
Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models. We investigate the correlation between the effectiveness of CoT and the length of reasoning steps in prompts.
arXiv Detail & Related papers (2024-01-10T04:37:38Z)
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence. LLMs leverage the intriguing chain-of-thought (CoT) reasoning techniques, obliging them to formulate intermediate steps en route to deriving an answer. Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z)
Towards Better Chain-of-Thought Prompting Strategies: A Survey [60.75420407216108]
Chain-of-Thought (CoT) shows its impressive strength when used as a prompting strategy for large language models (LLM) Recent years, the prominent effect of CoT prompting has attracted emerging research. This survey could provide an overall reference on related research.
arXiv Detail & Related papers (2023-10-08T01:16:55Z)
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations. We introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting), an iterative bootstrapping approach for selecting exemplars and generating reasoning chains.
arXiv Detail & Related papers (2023-04-23T13:54:39Z)
Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts. This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.