Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango
- URL: http://arxiv.org/abs/2209.07686v1
- Date: Fri, 16 Sep 2022 02:54:00 GMT
- Title: Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango
- Authors: Aman Madaan and Amir Yazdanbakhsh
- Abstract summary: This work initiates the preliminary steps towards a deeper understanding of reasoning mechanisms in large language models.
Our work centers around querying the model while controlling for all but one of the components in a prompt: symbols, patterns, and text.
We posit that text imbues patterns with commonsense knowledge and meaning.
- Score: 11.344587937052697
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning is a key pillar of human cognition and intelligence. In the past
decade, we witnessed dramatic gains in natural language processing and
unprecedented scaling of large language models. Recent work has characterized
the capability of few-shot prompting techniques such as chain of thought to
emulate human reasoning in large language models. This hallmark feature of
few-shot prompting, combined with ever scaling language models, opened a vista
of possibilities to solve various tasks, such as math word problems, code
completion, and commonsense reasoning. Chain of thought (CoT) prompting further
pushes the performance of models in a few-shot setup, by supplying intermediate
steps and urging the model to follow the same process. Despite its compelling
performance, the genesis of reasoning capability in these models is less
explored. This work initiates the preliminary steps towards a deeper
understanding of reasoning mechanisms in large language models. Our work
centers around querying the model while controlling for all but one of the
components in a prompt: symbols, patterns, and text. We then analyze the
performance divergence across the queries. Our results suggest the presence of
factual patterns in a prompt is not necessary for the success of CoT.
Nonetheless, we empirically show that relying solely on patterns is also
insufficient for high quality results. We posit that text imbues patterns with
commonsense knowledge and meaning. Our exhaustive empirical analysis provides
qualitative examples of the symbiotic relationship between text and patterns.
Such a systematic understanding of CoT enables us to devise a concise chain of
thought, dubbed CCoT, in which text and patterns are pruned to retain only their
key roles while delivering an on-par or slightly higher task solve rate.
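The controlled-prompting setup is concrete enough to sketch. The Python snippet below is a hypothetical illustration, not the authors' released code; the exemplar, function names, and regular expressions are assumptions. Starting from one standard chain-of-thought exemplar, it builds counterfactual variants in which exactly one component (symbols, patterns, or text) is altered, so that differences in downstream solve rate can be attributed to that component.

```python
# Hypothetical sketch of component-wise counterfactual prompting over
# symbols, patterns, and text; not the authors' released code.
import random
import re

# One standard chain-of-thought exemplar (illustrative math word problem).
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11."
)


def randomize_symbols(exemplar: str, seed: int = 0) -> str:
    """Counterfactual over symbols: replace every number with a random one,
    leaving the text and the step-by-step pattern intact."""
    rng = random.Random(seed)
    return re.sub(r"\d+", lambda _m: str(rng.randint(10, 99)), exemplar)


def keep_patterns_only(exemplar: str) -> str:
    """Counterfactual over text: keep only the equation-like pattern,
    dropping the natural-language glue that carries commonsense meaning."""
    equations = re.findall(r"\d+(?:\s*[+\-*/=]\s*\d+)+", exemplar)
    return " ; ".join(equations)


def drop_patterns(exemplar: str) -> str:
    """Counterfactual over patterns: keep the question and the final answer,
    removing the intermediate reasoning steps."""
    question, answer = exemplar.split("\nA:", maxsplit=1)
    final_sentence = answer.strip().split(". ")[-1]
    return f"{question}\nA: {final_sentence}"


if __name__ == "__main__":
    variants = {
        "original": COT_EXEMPLAR,
        "symbols randomized": randomize_symbols(COT_EXEMPLAR),
        "text removed (patterns only)": keep_patterns_only(COT_EXEMPLAR),
        "patterns removed (no steps)": drop_patterns(COT_EXEMPLAR),
    }
    for name, prompt in variants.items():
        print(f"--- {name} ---\n{prompt}\n")
```

In such an experiment, each variant would replace the exemplar in the few-shot prompt and the model's solve rate on held-out questions would be compared across variants; the pruned variants also hint at how a concise CCoT-style prompt could retain only the load-bearing text and pattern.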
Related papers
- How Interpretable are Reasoning Explanations from Prompting Large Language Models? [34.4659592398593]
We present a comprehensive and multifaceted evaluation of interpretability, examining not only faithfulness but also robustness and utility across commonsense reasoning benchmarks.
In addition, we introduce a simple interpretability alignment technique termed Self-Entailment-Alignment Chain-of-thought, that yields more than 70% improvements across multiple dimensions of interpretability.
arXiv Detail & Related papers (2024-02-19T06:11:28Z)
- Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence.
LLMs leverage chain-of-thought (CoT) reasoning techniques, which require them to formulate intermediate steps en route to deriving an answer.
Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z)
- Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models.
arXiv Detail & Related papers (2023-10-03T00:57:26Z)
- Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models [74.40196814292426]
We propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph.
GoT captures the non-sequential nature of human thinking and allows for a more realistic modeling of thought processes.
We evaluate GoT's performance on a text-only reasoning task and a multimodal reasoning task.
arXiv Detail & Related papers (2023-05-26T02:15:09Z)
- Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data.
Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z)
- Chain of Thought Prompt Tuning in Vision Language Models [29.85907584680661]
We propose a novel chain of thought prompt tuning for vision-language modeling.
We are the first to successfully adapt chain-of-thought prompting to combine visual and textual embeddings.
arXiv Detail & Related papers (2023-04-16T23:59:25Z)
- Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work focused on modeling the structures of equations, and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph.
arXiv Detail & Related papers (2022-11-29T18:52:06Z)
- Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought.
Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z)
- Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems [14.144577791030853]
We investigate how a neural network understands patterns only from semantics.
We propose a contrastive learning approach, where the neural network perceives the divergence of patterns.
Our method greatly improves the performance in monolingual and multilingual settings.
arXiv Detail & Related papers (2021-10-16T04:03:47Z)