Related papers: Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs

Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs

URL: http://arxiv.org/abs/2305.11461v7
Date: Sun, 8 Sep 2024 06:41:12 GMT
Title: Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs
Authors: Ioktong Lei, Zhidong Deng,
Abstract summary: This paper proposes a novel hint of thought (HoT) prompting with explain-ability and zero-shot generalization. It is decomposed into three steps: explainable sub-questions, logical reasoning, and answering. Experiments show that our HoT prompting has a significant advantage on the zero-shot reasoning task compared to existing zero-shot CoT.
Score: 5.996787847938559
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompting becomes an increasingly important research topic for better utilization of LLMs. Although simple prompting performs well on single-step questions, it cannot permanently activate the correct knowledge path for multi-step reasoning tasks. The chain of thought (CoT), which often contains zero-shot CoT and few-shot CoT, is a recently developed prompting method that can explain the reasoning process to the LLM and outperforms simple prompting in three challenging reasoning tasks, including arithmetic, symbolic, and commonsense reasoning. Inspired by zero-shot CoT, and further extending the zero-shot ability, this paper proposes a novel hint of thought (HoT) prompting with explain-ability and zero-shot generalization. It is decomposed into three steps: explainable sub-questions, logical reasoning, and answering. Such three steps are sequentially ordered in step-by-step hints, which can be easily adjusted and explained to different tasks. Finally, experimental results demonstrate that our HoT prompting has a significant advantage on the zero-shot reasoning task compared to existing zero-shot CoT. We did zero-shot experiments on math tasks like GSM8K, ADDSUB, AQUA, SVAMP, and commonsense tasks such as StrategyQA. In particular, the accuracy of the proposed HoT prompting is improved with GSM8K from 40.50% to 70.65%, with AQUA from 31.9% to 46.4%, with SVAMP from 63.7% to 76.9%, and with ADDSUB from 74.7% to 87.34%, respectively, which even defeats the competitive PoT approach on GSM8k, AQUA, and SVAMP.

Related papers

VeriThinker: Learning to Verify Makes Reasoning Model Efficient [52.74493506816969]
Large Reasoning Models excel at complex tasks using Chain-of-Thought (CoT) reasoning.<n>Their tendency to overthinking leads to unnecessarily lengthy reasoning chains.<n>We introduce VeriThinker, a novel approach for CoT compression.
arXiv Detail & Related papers (2025-05-23T14:17:56Z)
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs [103.0226977561914]
We propose a comprehensive framework for advancing step-by-step visual reasoning in large language models. We introduce a visual reasoning benchmark specifically designed to evaluate multi-step reasoning tasks. Second, we propose a novel metric that assesses visual reasoning quality at the granularity of individual steps. Third, we present a new multimodal visual reasoning model, named LlamaV-o1, trained using a multi-step curriculum learning approach.
arXiv Detail & Related papers (2025-01-10T18:59:51Z)
Instance-adaptive Zero-shot Chain-of-Thought Prompting [32.700073951068575]
Zero-shot Chain-of-Thought (CoT) prompting emerges as a simple and effective strategy for enhancing the performance of large language models (LLMs) in real-world reasoning tasks. This work introduces an instance-adaptive prompting algorithm as an alternative zero-shot CoT reasoning scheme by adaptively differentiating good and bad prompts.
arXiv Detail & Related papers (2024-09-30T16:00:34Z)
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems [50.76385564061713]
Chain-of-Thought (CoT) prompting has enhanced the performance of Large Language Models (LLMs) across various reasoning tasks. CoT usually suffers from three pitfalls: semantic misunderstanding errors, calculation errors, and step-missing errors. We propose Deeply Understanding the Problems (DUP) to improve the LLMs' math problem-solving ability by addressing semantic misunderstanding errors.
arXiv Detail & Related papers (2024-04-23T12:16:05Z)
Large Language Models are Contrastive Reasoners [8.427805316635318]
We show how contrastive prompting significantly improves the ability of large language models to perform complex reasoning. Experiments on various large language models show that zero-shot contrastive prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. Our method not only surpasses zero-shot CoT and few-shot CoT in most arithmetic and commonsense reasoning tasks but also can seamlessly integrate with existing prompting methods.
arXiv Detail & Related papers (2024-03-13T03:15:05Z)
Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge [85.17343729885003]
We introduce Hint-before-Solving Prompting (HSP), which guides the model to generate hints for solving the problem. HSP can effectively improve the accuracy of reasoning tasks. We build the HSPMATH dataset based on HSP and fine-tuned Llemma-7B, reaching 64.3 accuracy.
arXiv Detail & Related papers (2024-02-22T05:58:03Z)
In-Context Principle Learning from Mistakes [75.66979331850364]
Incontext learning (ICL) has been the standard method of adapting LLMs to downstream tasks, by learning from a few input-output examples. We revisit this paradigm, by learning more from the few given input-output examples.
arXiv Detail & Related papers (2024-02-08T04:42:29Z)
Evidence to Generate (E2G): A Single-agent Two-step Prompting for Context Grounded and Retrieval Augmented Reasoning [3.117335706912261]
We introduce Evidence to Generate (E2G), a novel single-agent, two-step prompting framework. Instead of unverified reasoning claims, E2G focuses exclusively on the thought sequences explicitly mentioned in the context. tool achieves remarkable results robustly across a wide range of knowledge-intensive reasoning and generation tasks.
arXiv Detail & Related papers (2024-01-11T09:49:15Z)
Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models [73.4425450752596]
Chain-of-thought (CoT) prompting has impressively unlocked the reasoning potential of large language models (LLMs) Yet, the standard CoT is less effective in problems demanding multiple reasoning steps. We propose RESPROMPT, a new prompting strategy that advances multi-step reasoning in LLMs.
arXiv Detail & Related papers (2023-10-07T08:56:28Z)
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models [23.805926737723603]
A few manually crafted step-by-step reasoning demonstrations can be used to generate reasoning steps for large language models (LLMs) Zero-shot-CoTs prompts the target problem statement with "Let's think step by step" as an input prompt to LLMs. We show that our proposed zero-shot prompting consistently outperforms Zero-shot-CoT across all datasets by a large margin.
arXiv Detail & Related papers (2023-05-06T16:34:37Z)
Faithful Chain-of-Thought Reasoning [51.21714389639417]
Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of reasoning tasks. We propose Faithful CoT, a reasoning framework involving two stages: Translation and Problem Solving. This guarantees that the reasoning chain provides a faithful explanation of the final answer.
arXiv Detail & Related papers (2023-01-31T03:04:26Z)
Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters [82.84696222087396]
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs) We show that CoT reasoning is possible even with invalid demonstrations.
arXiv Detail & Related papers (2022-12-20T05:20:54Z)
Large Language Models are Zero-Shot Reasoners [28.6899375595088]
Chain of thought (CoT) prompting is a technique for eliciting complex multi-step reasoning through step-by-step answer examples. We show that LLMs are decent zero-shot reasoners by simply adding Let's think step by step'' before each answer. Experimental results demonstrate that our Zero-shot-CoT, using the same single prompt template, significantly outperforms zero-shot LLM performances.
arXiv Detail & Related papers (2022-05-24T09:22:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.