Stream of Search (SoS): Learning to Search in Language
- URL: http://arxiv.org/abs/2404.03683v1
- Date: Mon, 1 Apr 2024 06:50:52 GMT
- Title: Stream of Search (SoS): Learning to Search in Language
- Authors: Kanishk Gandhi, Denise Lee, Gabriel Grand, Muxin Liu, Winson Cheng, Archit Sharma, Noah D. Goodman
- Abstract summary: We show how language models can be taught to search by representing the process of search in language as a flattened string.
We propose a unified language for search that captures an array of different symbolic search strategies.
Our results indicate that language models can learn to solve problems via search, self-improve to flexibly use different search strategies, and potentially discover new ones.
- Score: 29.841835308845948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models are rarely shown fruitful mistakes during training. As a result, they struggle to look beyond the next token, suffering from snowballing errors and failing to predict the consequences of their actions several steps ahead. In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string -- a stream of search (SoS). We propose a unified language for search that captures an array of different symbolic search strategies. We demonstrate our approach using the simple yet difficult game of Countdown, where the goal is to combine input numbers with arithmetic operations to reach a target number. We pretrain a transformer-based language model from scratch on a dataset of streams of search generated by heuristic solvers. We find that SoS pretraining increases search accuracy by 25% over models trained to predict only the optimal search trajectory. We further finetune this model with two policy improvement methods: Advantage-Induced Policy Alignment (APA) and Self-Taught Reasoner (STaR). The finetuned SoS models solve 36% of previously unsolved problems, including problems that cannot be solved by any of the heuristic solvers. Our results indicate that language models can learn to solve problems via search, self-improve to flexibly use different search strategies, and potentially discover new ones.
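The Countdown task described in the abstract can be sketched with a small depth-first solver whose exploration trace, including dead ends and backtracking, is flattened into one string. This is an illustrative toy version of an SoS trajectory, not the authors' implementation; the operator set and the trace format here are assumptions:

```python
from itertools import combinations

def countdown_sos(numbers, target):
    """Depth-first search for the Countdown game, logging every visited
    state and attempted operation so the trace can be flattened into a
    single string (a toy "stream of search")."""
    trace = []

    def dfs(nums):
        trace.append(f"state: {sorted(nums)}")
        if target in nums:
            trace.append(f"goal reached: {target}")
            return True
        for a, b in combinations(nums, 2):
            rest = list(nums)
            rest.remove(a)
            rest.remove(b)
            # Assumed operator set: sum, product, absolute difference,
            # and exact integer division when it divides evenly.
            candidates = {a + b, a * b, abs(a - b)}
            if b != 0 and a % b == 0:
                candidates.add(a // b)
            if a != 0 and b % a == 0:
                candidates.add(b // a)
            for c in candidates:
                trace.append(f"try: {a}, {b} -> {c}")
                if dfs(rest + [c]):
                    return True
                trace.append("backtrack")
        return False

    solved = dfs(list(numbers))
    return solved, "\n".join(trace)

# Example instance: reach 23 from {2, 3, 7} (e.g. 3 * 7 = 21, 21 + 2 = 23).
solved, stream = countdown_sos([2, 3, 7], 23)
```

The flattened `stream` string, with its explicit wrong turns and backtracking, stands in for the kind of search trajectory the paper pretrains on, in contrast to training only on the optimal solution path.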
Related papers
- Planning In Natural Language Improves LLM Search For Code Generation [5.370466208990696]
We propose PlanSearch, a novel search algorithm that plans in natural language to solve code-generation problems.
PlanSearch shows strong results across HumanEval+, MBPP+, and LiveCodeBench.
We show that, across all models, search algorithms, and benchmarks analyzed, we can accurately predict performance gains due to search.
arXiv Detail & Related papers (2024-09-05T17:44:49Z) - A Training Data Recipe to Accelerate A* Search with Language Models [3.037409201025504]
Combining Large Language Models (LLMs) with search algorithms like A* holds the promise of enhanced reasoning and scalable inference.
We empirically disentangle the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task.
Our technique reduces the number of iterations required to find the solutions by up to 15x, with a wall-clock speed-up of search up to 5x.
arXiv Detail & Related papers (2024-07-13T19:21:44Z) - In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Frontier Language Models are not Robust to Adversarial Arithmetic, or
"What do I need to say so you agree 2+2=5?" [88.59136033348378]
We study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment.
This problem consists of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete.
We show that models can be partially hardened against these attacks via reinforcement learning and via agentic constitutional loops.
arXiv Detail & Related papers (2023-11-08T19:07:10Z) - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Zero-Shot Learners for Natural Language Understanding via a Unified
Multiple Choice Perspective [26.41585967095811]
Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without any additional training.
Our approach converts zero-shot learning into multiple-choice tasks, avoiding problems in commonly used large-scale generative models such as FLAN.
Our approach shows state-of-the-art performance on several benchmarks and produces satisfactory results on tasks such as natural language inference and text classification.
arXiv Detail & Related papers (2022-10-16T17:24:06Z) - Regularized Contrastive Learning of Semantic Search [0.0]
Transformer-based models are widely used as retrieval models due to their excellent ability to learn semantic representations.
We propose a new regularization method: Regularized Contrastive Learning.
It augments several different semantic representations for every sentence, then takes them into the contrastive objective as regularizers.
arXiv Detail & Related papers (2022-09-27T08:25:19Z) - Probing Structured Pruning on Multilingual Pre-trained Models: Settings,
Algorithms, and Efficiency [62.0887259003594]
This work investigates three aspects of structured pruning on multilingual pre-trained language models: settings, algorithms, and efficiency.
Experiments on nine downstream tasks show several counter-intuitive phenomena.
We present Dynamic Sparsification, a simple approach that allows training the model once and adapting to different model sizes at inference.
arXiv Detail & Related papers (2022-04-06T06:29:52Z) - Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods
in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly.
arXiv Detail & Related papers (2021-07-28T18:09:46Z) - Efficient Active Search for Combinatorial Optimization Problems [1.6543719822033436]
We show that (efficient) active search enables learned models to effectively solve instances that are much larger than those seen during training.
The proposed methods offer a simple way to significantly improve the search performance of a given model and outperform state-of-the-art machine learning based methods on routing problems.
arXiv Detail & Related papers (2021-06-09T15:08:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.