Self-Steering Language Models
- URL: http://arxiv.org/abs/2504.07081v1
- Date: Wed, 09 Apr 2025 17:54:22 GMT
- Title: Self-Steering Language Models
- Authors: Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, Jacob Andreas,
- Abstract summary: DisCIPL is a method for "self-steering" language models.<n>DisCIPL uses a Planner model to generate a task-specific inference program.<n>Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
- Score: 113.96916935955842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While test-time reasoning enables language models to tackle complex tasks, searching or planning in natural language can be slow, costly, and error-prone. But even when LMs struggle to emulate the precise reasoning steps needed to solve a problem, they often excel at describing its abstract structure--both how to verify solutions and how to search for them. This paper introduces DisCIPL, a method for "self-steering" LMs where a Planner model generates a task-specific inference program that is executed by a population of Follower models. Our approach equips LMs with the ability to write recursive search procedures that guide LM inference, enabling new forms of verifiable and efficient reasoning. When instantiated with a small Follower (e.g., Llama-3.2-1B), DisCIPL matches (and sometimes outperforms) much larger models, including GPT-4o and o1, on challenging constrained generation tasks. In decoupling planning from execution, our work opens up a design space of highly-parallelized Monte Carlo inference strategies that outperform standard best-of-N sampling, require no finetuning, and can be implemented automatically by existing LMs.
Related papers
- Inference-Time Intervention in Large Language Models for Reliable Requirement Verification [2.3759432635713895]
Inference-time intervention techniques provide a promising alternative to fine-tuning.<n>We demonstrate how interventions enable fine-grained control for automating the usually time-intensive requirement verification process.<n>Our method achieves robust and reliable outputs, significantly improving over both a baseline model and a fine-tuning approach.
arXiv Detail & Related papers (2025-03-18T10:49:36Z) - Complex LLM Planning via Automated Heuristics Discovery [48.07520536415374]
We consider enhancing large language models (LLMs) for complex planning tasks.
We propose automated inferences discovery (AutoHD), a novel approach that enables LLMs to explicitly generate functions to guide-time search.
Our proposed method requires no additional model training or finetuning--and the explicit definition of functions generated by the LLMs provides interpretability and insights into the reasoning process.
arXiv Detail & Related papers (2025-02-26T16:52:31Z) - Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making.
Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance.
We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z) - Instruction Position Matters in Sequence Generation with Large Language
Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z) - Coupling Large Language Models with Logic Programming for Robust and
General Reasoning from Text [5.532477732693001]
We show that a large language model can serve as a highly effective few-shot semantically.
It can convert natural language sentences into a logical form that serves as input for answer set programs.
We demonstrate that this method achieves state-of-the-art performance on several benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN.
arXiv Detail & Related papers (2023-07-15T03:29:59Z) - Large Language Models as General Pattern Machines [64.75501424160748]
We show that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences.
Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary.
In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics.
arXiv Detail & Related papers (2023-07-10T17:32:13Z) - Sequential Monte Carlo Steering of Large Language Models using
Probabilistic Programs [46.721838623748816]
We propose a new inference-time approach to enforcing syntactic and semantic constraints on the outputs of large language models.
Key idea is to specify language generation tasks as posterior inference problems in a class of discrete probabilistic sequence models.
For a computational cost similar to that of beam search, SMC can steer LLMs to solve diverse tasks.
arXiv Detail & Related papers (2023-06-05T17:55:05Z) - Language Models Implement Simple Word2Vec-style Vector Arithmetic [32.2976613483151]
A primary criticism towards language models (LMs) is their inscrutability.
This paper presents evidence that, despite their size and complexity, LMs sometimes exploit a simple vector arithmetic style mechanism to solve some relational tasks.
arXiv Detail & Related papers (2023-05-25T15:04:01Z) - SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs)
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.