Related papers: Evolving Deeper LLM Thinking

Evolving Deeper LLM Thinking

URL: http://arxiv.org/abs/2501.09891v1
Date: Fri, 17 Jan 2025 00:41:44 GMT
Title: Evolving Deeper LLM Thinking
Authors: Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, Dave Marwood, Shumeet Baluja, Dale Schuurmans, Xinyun Chen,
Abstract summary: The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses.<n> Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks.
Score: 61.61227021098086
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We explore an evolutionary search strategy for scaling inference time compute in Large Language Models. The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses. The proposed approach avoids the need to formalize the underlying inference problem whenever a solution evaluator is available. Controlling for inference cost, we find that Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks. In the TravelPlanner and Natural Plan benchmarks, Mind Evolution solves more than 98% of the problem instances using Gemini 1.5 Pro without the use of a formal solver.

Related papers

Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models. DisCIPL uses a Planner model to generate a task-specific inference program. Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z)
MetaScale: Test-Time Scaling with Evolving Meta-Thoughts [51.35594569020857]
Experimental results demonstrate that MetaScale consistently outperforms standard inference approaches. METASCALE scales more effectively with increasing sampling budgets and produces more structured, expert-level responses.
arXiv Detail & Related papers (2025-03-17T17:59:54Z)
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models [31.556646366268286]
Large Language Models increasingly rely on prolonged reasoning chains to solve complex tasks. This trial-and-error approach often leads to high computational overhead and error propagation. We introduce Meta-Reasoner, a framework that dynamically optimize inference-time reasoning.
arXiv Detail & Related papers (2025-02-27T09:40:13Z)
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs [76.43407125275202]
o1-like models can emulate human-like long-time thinking during inference.<n>This paper presents the first comprehensive study on the prevalent issue of overthinking in these models.<n>We propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy.
arXiv Detail & Related papers (2024-12-30T18:55:12Z)
Recursive Introspection: Teaching Language Model Agents How to Self-Improve [30.086494067593268]
We develop RISE: Recursive IntroSpEction, an approach for fine-tuning large language models. Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on math reasoning tasks.
arXiv Detail & Related papers (2024-07-25T17:35:59Z)
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.<n>We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.<n>Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
Leveraging automatic strategy discovery to teach people how to select better projects [0.9821874476902969]
The decisions of individuals and organizations are often suboptimal because normative decision strategies are too demanding in the real world. Recent work suggests that some errors can be prevented by leveraging artificial intelligence to discover and teach prescriptive decision strategies. This article is the first to extend this approach to a real-world decision problem, namely project selection.
arXiv Detail & Related papers (2024-06-06T13:51:44Z)
Discovering Evolution Strategies via Meta-Black-Box Optimization [23.956974467496345]
We propose to discover effective update rules for evolution strategies via meta-learning. Our approach employs a search strategy parametrized by a self-attention-based architecture. We show that it is possible to self-referentially train an evolution strategy from scratch, with the learned update rule used to drive the outer meta-learning loop.
arXiv Detail & Related papers (2022-11-21T08:48:46Z)
Socio-cognitive Optimization of Time-delay Control Problems using Evolutionary Metaheuristics [89.24951036534168]
Metaheuristics are universal optimization algorithms which should be used for solving difficult problems, unsolvable by classic approaches. In this paper we aim at constructing novel socio-cognitive metaheuristic based on castes, and apply several versions of this algorithm to optimization of time-delay system model.
arXiv Detail & Related papers (2022-10-23T22:21:10Z)
Runtime Analysis of Competitive co-Evolutionary Algorithms for Maximin Optimisation of a Bilinear Function [1.3053649021965603]
Co-evolutionary algorithms have a wide range of applications, such as in hardware design, evolution of strategies for board games, and patching software bugs. It is an open challenge to develop a theory that can predict when co-evolutionary algorithms find solutions efficiently and reliable. This paper provides a first step in developing runtime analysis for population-based competitive co-evolutionary algorithms.
arXiv Detail & Related papers (2022-06-30T12:35:36Z)
Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK) Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework. We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.