String Seed of Thought: Prompting LLMs for Distribution-Faithful and Diverse Generation
- URL: http://arxiv.org/abs/2510.21150v2
- Date: Fri, 07 Nov 2025 06:59:25 GMT
- Title: String Seed of Thought: Prompting LLMs for Distribution-Faithful and Diverse Generation
- Authors: Kou Misaki, Takuya Akiba
- Abstract summary: We introduce String Seed of Thought (SSoT), a novel prompting method for LLMs that improves Probabilistic Instruction Following (PIF). We demonstrate that SSoT significantly improves the PIF performance of LLMs, approaching the ideal performance of a pseudo-random number generator.
- Score: 7.499410407885288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce String Seed of Thought (SSoT), a novel prompting method for LLMs that improves Probabilistic Instruction Following (PIF). We define PIF as a task requiring an LLM to select its answer from a predefined set of options, each associated with a specific probability, such that the empirical distribution of the generated answers aligns with the target distribution when prompted multiple times. While LLMs excel at tasks with single, deterministic answers, they often fail at PIF, exhibiting biases problematic for applications requiring non-deterministic behaviors, such as human-behavior simulation, content diversification, and multiplayer games. It also harms the diversity of generated responses, a crucial factor in test-time scaling, by causing the outputs to collapse into a limited set of answers. To address this, we propose SSoT, a simple prompting method that instructs an LLM to first output a random string to generate sufficient entropy. SSoT also instructs the LLM to extract randomness by manipulating this string to derive a final answer, thereby preserving diversity while adhering to specific constraints. We demonstrate that SSoT significantly improves the PIF performance of LLMs, approaching the ideal performance of a pseudo-random number generator. Furthermore, our experiments on NoveltyBench show SSoT's benefits extend beyond closed-set tasks to open-ended tasks by enhancing response diversity.
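The abstract describes SSoT only at a high level, so the following Python sketch is an illustrative reconstruction, not the authors' exact protocol: the prompt wording, the last-two-digits extraction rule, and the `ssot_prompt`/`derive_answer` helpers are all assumptions consistent with the stated idea of drawing entropy from a model-emitted random string.

```python
def ssot_prompt(question: str, options: dict[str, float]) -> str:
    """Build an SSoT-style prompt: the model first emits a random string,
    then derives its answer from that string, so that repeated calls
    approximate the target distribution. Wording is illustrative."""
    # Express the target distribution as integer ranges over 0-99.
    lines, cum = [], 0
    for opt, p in options.items():
        lo = cum
        cum += round(p * 100)
        lines.append(f"  {lo}-{cum - 1} -> {opt}")
    table = "\n".join(lines)
    return (
        f"{question}\n"
        "Step 1: Write a random string of 20 digits.\n"
        "Step 2: Read the last two digits as a number from 0 to 99.\n"
        "Step 3: Answer with the option whose range contains that number:\n"
        f"{table}\n"
        "Output the chosen option alone on the final line."
    )

def derive_answer(random_string: str, options: dict[str, float]) -> str:
    """Reference implementation of steps 2-3, handy for auditing whether
    the model's emitted string actually supports its final answer."""
    digits = [c for c in random_string if c.isdigit()]
    value = int("".join(digits[-2:]))  # uniform on 0-99 if digits are random
    cum = 0
    for opt, p in options.items():
        cum += round(p * 100)
        if value < cum:
            return opt
    return opt  # last option absorbs any rounding slack

print(ssot_prompt("Pick a color.", {"red": 0.7, "blue": 0.3}))
```

Deriving the answer mechanically from the emitted string is what separates this from simply asking the model to "be random": the string supplies the entropy, and the deterministic extraction step maps it onto the target distribution.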
Related papers
- Task-Awareness Improves LLM Generations and Uncertainty [48.857040212979484]
Bayes-optimal responses consistently outperform standard decoding methods like beam search. Our decision-theoretic framework is applicable to any problem that admits a latent response structure.
arXiv Detail & Related papers (2026-01-29T10:16:23Z)
- Addressing LLM Diversity by Infusing Random Concepts [0.3951835393164164]
Large language models (LLMs) are known to produce outputs with limited diversity. In this work, we study whether infusing random concepts in the prompts can improve the diversity of the generated outputs. (A minimal prompt-infusion sketch appears after this list.)
arXiv Detail & Related papers (2026-01-26T00:53:28Z)
- Diffusion LLMs are Natural Adversaries for any LLM [50.88535293540971]
We introduce a novel framework that transforms the resource-intensive (adversarial) prompt optimization problem into an efficient, amortized inference task. Our core insight is that pretrained, non-autoregressive generative LLMs can serve as powerful surrogates for prompt search. We find that the generated prompts are low-perplexity, diverse jailbreaks that exhibit strong transferability to a wide range of black-box target models.
arXiv Detail & Related papers (2025-10-31T19:04:09Z)
- Evaluating the Quality of Randomness and Entropy in Tasks Supported by Large Language Models [8.339789704552706]
Large language model (LLM) technology has led to diverse applications, many of which inherently require randomness. This paper investigates the capacity of LLMs for handling tasks that involve randomness through a series of experiments. Experiments cover a range of tasks, including generating random numbers, generating random strings such as passwords, shuffling items, and evaluating the quality of randomness. (A chi-square evaluation sketch appears at the end of this list.)
arXiv Detail & Related papers (2025-10-14T02:43:08Z)
- Learning to Reason Across Parallel Samples for LLM Reasoning [48.41933431325965]
Scaling test-time compute brings substantial performance gains for large language models (LLMs). In this paper, we propose a new way to leverage such sets of multiple samples. Experiments on five reasoning datasets demonstrate both the efficacy and efficiency of SSA.
arXiv Detail & Related papers (2025-06-10T17:42:35Z)
- Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques. We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds. Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z)
- Set-LLM: A Permutation-Invariant LLM [2.9665130256021]
This paper is motivated by a specific vulnerability: the order sensitivity of large language models (LLMs). We introduce Set-LLM, a novel architectural adaptation for pretrained LLMs that enables the processing of mixed set-text inputs with permutation invariance guarantees.
arXiv Detail & Related papers (2025-05-21T12:14:26Z)
- FSM: A Finite State Machine Based Zero-Shot Prompting Paradigm for Multi-Hop Question Answering [26.398873686905063]
Large Language Models (LLMs) with chain-of-thought (COT) prompting have demonstrated impressive abilities on simple natural language inference tasks.
We propose a prompting method, Finite State Machine (FSM), to enhance the reasoning capabilities of LLMs for complex tasks.
arXiv Detail & Related papers (2024-07-03T10:01:01Z)
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks.
LLMs are prone to producing errors, hallucinations, and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding LLMs' decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
- Order-Independence Without Fine Tuning [18.020492646988746]
We present Set-Based Prompting, a technique that guarantees the output of an LLM will not have order dependence on a specified set of sub-sequences. Despite our inputs being out of distribution, the impact on expected accuracy is small, where the expectation is over uniformly chosen orderings of the candidate responses.
arXiv Detail & Related papers (2024-06-04T16:09:13Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying [71.86163159193327]
Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text.
This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to pattern completion.
We introduce LaGR, which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent.
arXiv Detail & Related papers (2023-08-21T02:07:35Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
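For the "Addressing LLM Diversity by Infusing Random Concepts" entry above, a minimal sketch of the general idea, assuming a hand-picked concept pool and prompt wording of our own (the paper's actual concept source and template are not given in the summary):

```python
import random

# Illustrative concept pool; the paper's actual concept source is not
# specified in the summary above.
CONCEPTS = ["lighthouse", "origami", "monsoon", "chess", "amber", "violin"]

def infuse_random_concepts(prompt: str, k: int = 2,
                           rng: random.Random | None = None) -> str:
    """Prepend k randomly drawn concepts so that repeated calls with the
    same underlying request receive different seeds of inspiration."""
    rng = rng or random.Random()
    picked = rng.sample(CONCEPTS, k)
    return (f"Loosely inspired by the concepts {', '.join(picked)}, "
            f"respond to the following.\n{prompt}")

print(infuse_random_concepts("Write a one-line slogan for a coffee shop."))
```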
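And for the "Evaluating the Quality of Randomness and Entropy" entry, one plausible measurement, assuming answers are collected by repeating an identical prompt (the counts below are made up): a chi-square goodness-of-fit statistic between the empirical answer distribution and the requested one.

```python
from collections import Counter

def chi_square_fit(samples: list[str], target: dict[str, float]) -> float:
    """Chi-square statistic between an LLM's empirical answer counts and
    the target distribution; larger values mean a worse fit."""
    counts = Counter(samples)
    n = len(samples)
    return sum(
        (counts.get(opt, 0) - p * n) ** 2 / (p * n)
        for opt, p in target.items()
    )

# e.g. 100 answers collected from one repeated prompt (made-up counts):
samples = ["red"] * 81 + ["blue"] * 19
print(chi_square_fit(samples, {"red": 0.7, "blue": 0.3}))  # ~5.76
```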