Evolving Deeper LLM Thinking
- URL: http://arxiv.org/abs/2501.09891v1
- Date: Fri, 17 Jan 2025 00:41:44 GMT
- Title: Evolving Deeper LLM Thinking
- Authors: Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, Dave Marwood, Shumeet Baluja, Dale Schuurmans, Xinyun Chen,
- Abstract summary: The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses.
Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks.
- Score: 61.61227021098086
- License:
- Abstract: We explore an evolutionary search strategy for scaling inference time compute in Large Language Models. The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses. The proposed approach avoids the need to formalize the underlying inference problem whenever a solution evaluator is available. Controlling for inference cost, we find that Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks. In the TravelPlanner and Natural Plan benchmarks, Mind Evolution solves more than 98% of the problem instances using Gemini 1.5 Pro without the use of a formal solver.
Related papers
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs [76.43407125275202]
o1-like models can emulate human-like long-time thinking during inference.
This paper presents the first comprehensive study on the prevalent issue of overthinking in these models.
We propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy.
arXiv Detail & Related papers (2024-12-30T18:55:12Z) - Recursive Introspection: Teaching Language Model Agents How to Self-Improve [30.086494067593268]
We develop RISE: Recursive IntroSpEction, an approach for fine-tuning large language models.
Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on math reasoning tasks.
arXiv Detail & Related papers (2024-07-25T17:35:59Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Leveraging automatic strategy discovery to teach people how to select better projects [0.9821874476902969]
The decisions of individuals and organizations are often suboptimal because normative decision strategies are too demanding in the real world.
Recent work suggests that some errors can be prevented by leveraging artificial intelligence to discover and teach prescriptive decision strategies.
This article is the first to extend this approach to a real-world decision problem, namely project selection.
arXiv Detail & Related papers (2024-06-06T13:51:44Z) - Discovering Evolution Strategies via Meta-Black-Box Optimization [23.956974467496345]
We propose to discover effective update rules for evolution strategies via meta-learning.
Our approach employs a search strategy parametrized by a self-attention-based architecture.
We show that it is possible to self-referentially train an evolution strategy from scratch, with the learned update rule used to drive the outer meta-learning loop.
arXiv Detail & Related papers (2022-11-21T08:48:46Z) - Socio-cognitive Optimization of Time-delay Control Problems using
Evolutionary Metaheuristics [89.24951036534168]
Metaheuristics are universal optimization algorithms which should be used for solving difficult problems, unsolvable by classic approaches.
In this paper we aim at constructing novel socio-cognitive metaheuristic based on castes, and apply several versions of this algorithm to optimization of time-delay system model.
arXiv Detail & Related papers (2022-10-23T22:21:10Z) - Runtime Analysis of Competitive co-Evolutionary Algorithms for Maximin Optimisation of a Bilinear Function [1.3053649021965603]
Co-evolutionary algorithms have a wide range of applications, such as in hardware design, evolution of strategies for board games, and patching software bugs.
It is an open challenge to develop a theory that can predict when co-evolutionary algorithms find solutions efficiently and reliable.
This paper provides a first step in developing runtime analysis for population-based competitive co-evolutionary algorithms.
arXiv Detail & Related papers (2022-06-30T12:35:36Z) - Learning MDPs from Features: Predict-Then-Optimize for Sequential
Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK)
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.