Automatic Prompt Augmentation and Selection with Chain-of-Thought from
Labeled Data
- URL: http://arxiv.org/abs/2302.12822v3
- Date: Tue, 27 Feb 2024 14:49:43 GMT
- Title: Automatic Prompt Augmentation and Selection with Chain-of-Thought from
Labeled Data
- Authors: KaShun Shum, Shizhe Diao, Tong Zhang
- Abstract summary: Chain-of-thought (CoT) advances the reasoning abilities of large language models (LLMs)
Most CoT studies rely on carefully designed human-annotated rational chains to prompt LLMs.
This paper proposes a new strategy that can bypass human engineering of CoT.
- Score: 20.68548644283721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Chain-of-thought (CoT) advances the reasoning abilities of large language
models (LLMs) and achieves superior performance in complex reasoning tasks.
However, most CoT studies rely on carefully designed human-annotated rational
chains to prompt LLMs, posing challenges for real-world applications where
labeled data is available without rational chains. This paper proposes a new
strategy, Automate-CoT (Automatic Prompt Augmentation and Selection with
Chain-of-Thought), that can bypass human engineering of CoT by automatically
augmenting rational chains from a small labeled dataset, and then pruning
low-quality chains to construct a candidate pool of machine-generated rationale
chains based on the labels. Finally, it selects the optimal combination of
several rationale chains from the pool for CoT prompting by employing a
variance-reduced policy gradient strategy to estimate the significance of each
example. Automate-CoT enables a quick adaptation of the CoT technique to
different tasks. Experimental results demonstrate the effectiveness of our
method, where competitive results are achieved on arithmetic reasoning (+2.7%),
commonsense reasoning (+3.4%), symbolic reasoning (+3.2%), and non-reasoning
tasks (+2.5%). The code is available at
https://github.com/SHUMKASHUN/Automate-CoT.
Related papers
- Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models [9.282278040339138]
Chain-of-Thought (CoT) technique has proven effective in improving the performance of large language models (LLMs) on complex reasoning tasks.<n>We make a preliminary observation that the monotonicity of token probability distributions may be correlated with the gains achieved through CoT reasoning.<n>We propose two indicators based on the token probability distribution to assess CoT effectiveness across different tasks.
arXiv Detail & Related papers (2025-06-06T11:53:27Z) - Continuous Chain of Thought Enables Parallel Exploration and Reasoning [38.59659461841282]
Current language models generate chain-of-thought traces by autoregressively sampling tokens from a finite vocabulary.<n>Our work examines the benefits of continuously-valued tokens (CoT2) through logical reasoning tasks.<n>We show that CoT2 allows the model to track multiple traces in parallel and quantify its benefits for inference efficiency.
arXiv Detail & Related papers (2025-05-29T16:58:28Z) - Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling.<n>We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z) - AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning [30.265984245328124]
Chain-of-Thought prompting indiscriminately generates lengthy reasoning steps for all queries.<n>AdaCoT (Adaptive Chain-of-Thought) is a novel framework enabling LLMs to adaptively decide when to invoke CoT.<n>A key technical contribution is Selective Loss Masking (SLM), designed to counteract decision boundary collapse.
arXiv Detail & Related papers (2025-05-17T08:27:00Z) - START: Self-taught Reasoner with Tools [51.38785489790888]
We introduce START (Self-Taught Reasoner with Tools), a tool-integrated long Chain-of-thought (CoT) reasoning LLM.
START is capable of performing complex computations, self-checking, exploring diverse methods, and self-ging.
It significantly outperforms the base QwQ-32B and achieves performance comparable to the state-of-the-art open-weight model R1-Distill-Qwen-32B.
arXiv Detail & Related papers (2025-03-06T17:11:51Z) - Scalable Best-of-N Selection for Large Language Models via Self-Certainty [65.31658824274894]
Best-of-N selection is a key technique for improving the reasoning performance of Large Language Models.
We propose self-certainty, a novel and efficient metric to estimate response quality without requiring external reward models.
Our findings establish self-certainty as a practical and efficient way for improving LLM reasoning capabilities.
arXiv Detail & Related papers (2025-02-25T19:08:07Z) - CoT-Valve: Length-Compressible Chain-of-Thought Tuning [50.196317781229496]
We introduce a new tuning and inference strategy named CoT-Valve, designed to allow models to generate reasoning chains of varying lengths.
We show that CoT-Valve successfully enables controllability and compressibility of the chain and shows better performance than the prompt-based control.
arXiv Detail & Related papers (2025-02-13T18:52:36Z) - CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning [16.502640216082547]
We propose the Clustered Distance-Weighted Chain of Thought (CDW-CoT) method.
It dynamically constructs prompts tailored to the characteristics of each data instance.
It consistently outperforms traditional CoT methods across six datasets.
arXiv Detail & Related papers (2025-01-21T15:51:07Z) - Cascaded Self-Evaluation Augmented Training for Lightweight Multimodal LLMs [14.763433457556136]
Multimodal Large Language Models (EMLLMs) can improve performance through Chain-of-Thought (CoT) reasoning.
They have poor self-evaluation capabilities during the CoT reasoning process.
This is due to their tendency to simplify the reasoning process and the degradation of self-evaluation ability during downstream task fine-tuning.
arXiv Detail & Related papers (2025-01-10T02:28:04Z) - To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning [55.52872152909785]
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs)
We show that CoT gives strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks.
arXiv Detail & Related papers (2024-09-18T17:55:00Z) - Self-Harmonized Chain of Thought [8.540320749424172]
Chain-of-thought (CoT) prompting has demonstrated the capacity of large language models to perform complex reasoning through intermediate steps.
We propose ECHO, a novel method that unifies diverse solution paths into a consistent and effective reasoning pattern.
arXiv Detail & Related papers (2024-09-06T06:57:04Z) - Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs [63.36637269634553]
We introduce a novel approach where LLMs are fine-tuned to generate a sequence of Diverse Chains of Thought (DCoT) within a single inference step.<n>We show that fine-tuning on DCoT improves performance over the CoT baseline across model families and scales.<n>Our work is also significant because both quantitative analyses and manual evaluations reveal the observed gains stem from the models' ability to refine an initial reasoning chain.
arXiv Detail & Related papers (2024-07-03T15:01:18Z) - Break the Chain: Large Language Models Can be Shortcut Reasoners [18.047917626825548]
Chain-of-Thought (CoT) reasoning utilize complex modules but are hampered by high token consumption, limited applicability, and challenges in thinking.
This paper conducts a critical evaluation of CoT prompting, extending beyond arithmetic to include complex logical and commonsense reasoning tasks.
We propose the integration of human-likes and shortcuts into language models (LMs) through "break the chain" strategies.
arXiv Detail & Related papers (2024-06-04T14:02:53Z) - ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs)
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z) - Deep Reinforcement Learning for Modelling Protein Complexes [29.64786472108047]
We show that an acyclic undirected connected graph can be used to predict the structure of multi-chain protein complexes.
We propose GAPN, a Generative Adversarial Policy Network powered by domain-specific rewards and adversarial loss through policy gradient.
arXiv Detail & Related papers (2024-03-11T12:33:33Z) - AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z) - DCR: Divide-and-Conquer Reasoning for Multi-choice Question Answering with LLMs [9.561022942046279]
We propose Divide and Conquer Reasoning (DCR) to enhance the reasoning capability of large language models (LLMs)
We first categorize questions into two subsets based on confidence score ($mathcalCS$), which is estimated by statistical frequency of generated answers.
In particular, we first categorize questions into two subsets based on confidence score ($mathcalCS$), which is estimated by statistical frequency of generated answers.
arXiv Detail & Related papers (2024-01-10T14:38:46Z) - LINC: A Neurosymbolic Approach for Logical Reasoning by Combining
Language Models with First-Order Logic Provers [60.009969929857704]
Logical reasoning is an important task for artificial intelligence with potential impacts on science, mathematics, and society.
In this work, we reformulating such tasks as modular neurosymbolic programming, which we call LINC.
We observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate.
arXiv Detail & Related papers (2023-10-23T17:58:40Z) - Faithful Chain-of-Thought Reasoning [51.21714389639417]
Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of reasoning tasks.
We propose Faithful CoT, a reasoning framework involving two stages: Translation and Problem Solving.
This guarantees that the reasoning chain provides a faithful explanation of the final answer.
arXiv Detail & Related papers (2023-01-31T03:04:26Z) - Learning Autoencoders with Relational Regularization [89.53065887608088]
A new framework is proposed for learning autoencoders of data distributions.
We minimize the discrepancy between the model and target distributions, with a emphrelational regularization
We implement the framework with two scalable algorithms, making it applicable for both probabilistic and deterministic autoencoders.
arXiv Detail & Related papers (2020-02-07T17:27:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.