Related papers: Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

URL: http://arxiv.org/abs/2406.04271v2
Date: Mon, 14 Oct 2024 07:12:27 GMT
Title: Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Authors: Ling Yang, Zhaochen Yu, Tianjun Zhang, Shiyi Cao, Minkai Xu, Wentao Zhang, Joseph E. Gonzalez, Bin Cui,
Abstract summary: Buffer of Thoughts (BoT) is a novel and versatile thought-augmented reasoning approach. We propose meta-buffer to store a series of informative high-level thoughts. For each problem, we retrieve a relevant thought-template and adaptively instantiate it with specific reasoning structures.
Score: 65.48185395952788
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce Buffer of Thoughts (BoT), a novel and versatile thought-augmented reasoning approach for enhancing accuracy, efficiency and robustness of large language models (LLMs). Specifically, we propose meta-buffer to store a series of informative high-level thoughts, namely thought-template, distilled from the problem-solving processes across various tasks. Then for each problem, we retrieve a relevant thought-template and adaptively instantiate it with specific reasoning structures to conduct efficient reasoning. To guarantee the scalability and stability, we further propose buffer-manager to dynamically update the meta-buffer, thus enhancing the capacity of meta-buffer as more tasks are solved. We conduct extensive experiments on 10 challenging reasoning-intensive tasks, and achieve significant performance improvements over previous SOTA methods: 11% on Game of 24, 20% on Geometric Shapes and 51% on Checkmate-in-One. Further analysis demonstrate the superior generalization ability and model robustness of our BoT, while requiring only 12% of the cost of multi-query prompting methods (e.g., tree/graph of thoughts) on average. Notably, we find that our Llama3-8B+BoT has the potential to surpass Llama3-70B model. Our project is available at: https://github.com/YangLing0818/buffer-of-thought-llm

Related papers

Logit Arithmetic Elicits Long Reasoning Capabilities Without Training [14.015546463427732]
Large reasoning models (LRMs) can do complex reasoning via long chain-of-thought (CoT) involving cognitive strategies such as backtracking and self-correction.<n>Recent studies suggest that some models inherently possess these long reasoning abilities, which may be unlocked via extra training.<n>We propose a decoding-time approach, ThinkLogit, to tune a target large LM for long reasoning using a substantially smaller model as guider.
arXiv Detail & Related papers (2025-07-17T03:31:36Z)
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings [64.36404136352287]
A*-Thought is an efficient tree search-based unified framework designed to identify and isolate the most essential thoughts.<n>It formulates the reasoning process of LRMs as a search tree, where each node represents a reasoning span in the giant reasoning space.<n>It can improve the performance of QwQ-32B by 2.39$times$ with low-budget and reduce the length of the output token by nearly 50% with high-budget.
arXiv Detail & Related papers (2025-05-30T12:58:34Z)
Adaptive Thinking via Mode Policy Optimization for Social Language Agents [75.3092060637826]
We propose a framework to improve the adaptive thinking ability of language agents in dynamic social interactions.<n>Our framework advances existing research in three key aspects: (1) Multi-granular thinking mode design, (2) Context-aware mode switching across social interaction, and (3) Token-efficient reasoning via depth-adaptive processing.
arXiv Detail & Related papers (2025-05-04T15:39:58Z)
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts [51.35594569020857]
Experimental results demonstrate that MetaScale consistently outperforms standard inference approaches. METASCALE scales more effectively with increasing sampling budgets and produces more structured, expert-level responses.
arXiv Detail & Related papers (2025-03-17T17:59:54Z)
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning [113.49074603075032]
Recent studies have shown that making a model spend more time thinking through longer Chain of Thoughts (CoTs) enables it to gain significant improvements in complex reasoning tasks. We explore whether scaling with longer CoTs can indeed impair the reasoning performance of Large Language Models (LLMs) in certain domains.
arXiv Detail & Related papers (2025-02-25T10:48:05Z)
BoT: Breaking Long Thought Processes of o1-like Large Language Models through Backdoor Attack [36.89710026479849]
We introduce a novel attack scenario targeting the long thought processes of o1-like models. We propose BoT, which can selectively break intrinsic reasoning mechanisms through backdoor attacks. Experiments on open-source o1-like models, including recent DeepSeek-R1, demonstrate that BoT nearly achieves high attack success rates.
arXiv Detail & Related papers (2025-02-16T10:45:56Z)
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach [70.44265766483633]
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically.
arXiv Detail & Related papers (2025-02-07T18:55:02Z)
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths [69.39559168050923]
We introduce Reasoning Paths Optimization (RPO), which enables learning to reason and explore from diverse paths. Our approach encourages favorable branches at each reasoning step while penalizing unfavorable ones, enhancing the model's overall problem-solving performance. We focus on multi-step reasoning tasks, such as math word problems and science-based exam questions.
arXiv Detail & Related papers (2024-10-07T06:37:25Z)
Can Github issues be solved with Tree Of Thoughts? [0.0]
This research introduces the application of the Tree of Thoughts (ToT) language model reasoning framework for enhancing the decision-making and problem-solving abilities of LLMs. We experimentally deploy ToT in tackling a Github issue contained within an instance of the SWE-bench.
arXiv Detail & Related papers (2024-05-20T11:05:56Z)
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models [102.72940700598055]
In reasoning tasks, even a minor error can cascade into inaccurate results. We develop a method that avoids introducing external resources, relying instead on perturbations to the input. Our training approach randomly masks certain tokens within the chain of thought, a technique we found to be particularly effective for reasoning tasks.
arXiv Detail & Related papers (2024-03-04T16:21:54Z)
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models [73.79091519226026]
Uncertainty of Thoughts (UoT) is an algorithm to augment large language models with the ability to actively seek information by asking effective questions. In experiments on medical diagnosis, troubleshooting, and the 20 Questions game, UoT achieves an average performance improvement of 38.1% in the rate of successful task completion.
arXiv Detail & Related papers (2024-02-05T18:28:44Z)
Large-Batch, Iteration-Efficient Neural Bayesian Design Optimization [37.339567743948955]
We present a novel Bayesian optimization framework specifically tailored to address the limitations of BO. Our key contribution is a highly scalable, sample-based acquisition function that performs a non-dominated sorting of objectives. We show that our acquisition function in combination with different Bayesian neural network surrogates is effective in data-intensive environments with a minimal number of iterations.
arXiv Detail & Related papers (2023-06-01T19:10:57Z)
Tree of Thoughts: Deliberate Problem Solving with Large Language Models [52.31950122881687]
We introduce a new framework for language model inference, Tree of Thoughts (ToT) ToT generalizes over the popular Chain of Thought approach to prompting language models. Our experiments show that ToT significantly enhances language models' problem-solving abilities.
arXiv Detail & Related papers (2023-05-17T23:16:17Z)
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them [108.54545521369688]
We focus on a suite of 23 challenging BIG-Bench tasks which we call BIG-Bench Hard (BBH) We find that applying chain-of-thought (CoT) prompting to BBH tasks enables PaLM to surpass the average human-rater performance on 10 of the 23 tasks, and Codex to surpass the average human-rater performance on 17 of the 23 tasks.
arXiv Detail & Related papers (2022-10-17T17:08:26Z)
MBORE: Multi-objective Bayesian Optimisation by Density-Ratio Estimation [0.01652719262940403]
optimisation problems often have multiple conflicting objectives that can be computationally and/or financially expensive. Mono-surrogate Bayesian optimisation (BO) is a popular model-based approach for optimising such black-box functions. We extend previous work on BO by density-ratio estimation (BORE) to the multi-objective setting.
arXiv Detail & Related papers (2022-03-31T09:27:59Z)
Scalable Combinatorial Bayesian Optimization with Tractable Statistical models [44.25245545568633]
We study the problem of optimizing blackbox functions over Relaxation spaces (e.g., sets, sequences, trees, and graphs) Based on recent advances in submodular relaxation, we study an approach as Parametrized Submodular (PSR) towards the goal of improving the scalability and accuracy of solving AFO problems for BOCS model. Experiments on diverse benchmark problems show significant improvements with PSR for BOCS model.
arXiv Detail & Related papers (2020-08-18T22:56:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.