Related papers: BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

URL: http://arxiv.org/abs/2508.21184v2
Date: Sat, 18 Oct 2025 23:14:21 GMT
Title: BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design
Authors: Deepro Choudhury, Sinead Williamson, Adam Goliński, Ning Miao, Freddie Bickford Smith, Michael Kirchhof, Yizhe Zhang, Tom Rainforth
Abstract summary: We propose a general-purpose approach for improving the ability of Large Language Models (LLMs) to intelligently and adaptively gather information from a user or other external source. Our approach, which we call BED-LLM, is based on iteratively choosing questions that maximize the expected information gain. We find that BED-LLM achieves substantial gains in performance across a range of tests based on the 20 questions game.
Score: 20.03498575187842
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a general-purpose approach for improving the ability of Large Language Models (LLMs) to intelligently and adaptively gather information from a user or other external source using the framework of sequential Bayesian experimental design (BED). This enables LLMs to act as effective multi-turn conversational agents and interactively interface with external environments. Our approach, which we call BED-LLM (Bayesian Experimental Design with Large Language Models), is based on iteratively choosing questions or queries that maximize the expected information gain (EIG) about the task of interest given the responses gathered previously. We show how this EIG can be formulated (and then estimated) in a principled way using a probabilistic model derived from the LLM's predictive distributions and provide detailed insights into key decisions in its construction and updating procedure. We find that BED-LLM achieves substantial gains in performance across a wide range of tests based on the 20 questions game and using the LLM to actively infer user preferences, compared to direct prompting of the LLM and other adaptive design strategies.
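The selection rule described in the abstract (pick the question with the highest expected information gain, then update beliefs with the answer) can be illustrated with a toy 20-questions example. This is only a minimal sketch under strong assumptions, not the paper's estimator: the hypothesis set, the prior, and the hand-written answer likelihoods in `questions` are hypothetical stand-ins for the quantities that BED-LLM derives from the LLM's predictive distributions.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, p_yes):
    """EIG of a yes/no question: H[p(y)] - E_theta[H[p(y | theta)]].

    prior: dict mapping hypothesis -> prior probability.
    p_yes: dict mapping hypothesis -> P(answer is "yes" | hypothesis).
    """
    marginal_yes = sum(prior[h] * p_yes[h] for h in prior)
    marginal_entropy = entropy([marginal_yes, 1 - marginal_yes])
    conditional_entropy = sum(
        prior[h] * entropy([p_yes[h], 1 - p_yes[h]]) for h in prior
    )
    return marginal_entropy - conditional_entropy


def update(prior, p_yes, answer_is_yes):
    """Bayes update of the belief over hypotheses after observing an answer."""
    unnorm = {
        h: prior[h] * (p_yes[h] if answer_is_yes else 1 - p_yes[h])
        for h in prior
    }
    total = sum(unnorm.values())
    return {h: w / total for h, w in unnorm.items()}


# Hypothetical toy setup: four candidate answers with a uniform prior.
prior = {"cat": 0.25, "dog": 0.25, "eagle": 0.25, "shark": 0.25}

# Hypothetical candidate questions with hand-specified answer likelihoods.
questions = {
    "Is it a mammal?": {"cat": 1.0, "dog": 1.0, "eagle": 0.0, "shark": 0.0},
    "Is it a shark?": {"cat": 0.0, "dog": 0.0, "eagle": 0.0, "shark": 1.0},
}

# Choose the question that maximizes EIG, then condition on a "yes" answer.
best = max(questions, key=lambda q: expected_information_gain(prior, questions[q]))
posterior = update(prior, questions[best], answer_is_yes=True)
```

In this toy case the question that splits the hypotheses evenly is preferred over the one that singles out a single candidate, which is the behaviour an EIG-maximizing questioner is expected to show.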

Related papers

Training-Free Active Learning Framework in Materials Science with Large Language Models [1.7173772511677432]
Large language models (LLMs) offer a new paradigm by leveraging their pretrained knowledge and universal token-based representations. Here, we introduce an LLM-based active learning framework (LLM-AL) that operates in an iterative few-shot setting.
arXiv Detail & Related papers (2025-11-24T21:46:29Z)
LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems [5.310303349822993]
We investigate how Large Language Models (LLMs) can act as world models of user preferences through pairwise reasoning over slates. Our results reveal relationships between task performance and properties of the preference function captured by LLMs.
arXiv Detail & Related papers (2025-11-06T16:54:54Z)
NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints [100.02131897927484]
This paper focuses on the native training of Multimodal Large Language Models (MLLMs) in an end-to-end manner. We propose a native MLLM called NaViL, combined with a simple and cost-effective recipe. Experimental results on 14 multimodal benchmarks confirm the competitive performance of NaViL against existing MLLMs.
arXiv Detail & Related papers (2025-10-09T17:59:37Z)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis [60.32962597618861]
IDA-Bench is a novel benchmark evaluating large language models in multi-round interactive scenarios. Agent performance is judged by comparing its final numerical output to the human-derived baseline. Even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on 50% of the tasks, highlighting limitations not evident in single-turn tests.
arXiv Detail & Related papers (2025-05-23T09:37:52Z)
INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling [44.309917620936474]
InferenceDynamics is a flexible and scalable multi-dimensional routing framework that models the capability and knowledge of models. We operate it on our comprehensive dataset RouteMix, and demonstrate its effectiveness and generalizability in group-level routing.
arXiv Detail & Related papers (2025-05-22T06:56:51Z)
A Survey on the Optimization of Large Language Model-based Agents [16.733092886211097]
Large Language Models (LLMs) have been widely adopted in various fields, becoming essential for autonomous decision-making and interactive tasks. However, current work typically relies on prompt design or fine-tuning strategies applied to vanilla LLMs. We provide a comprehensive review of LLM-based agent optimization approaches, categorizing them into parameter-driven and parameter-free methods.
arXiv Detail & Related papers (2025-03-16T10:09:10Z)
Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection [72.92366526004464]
Retrieval-Augmented Generation (RAG) has proven effective in enabling Large Language Models (LLMs) to produce more accurate and reliable responses. We propose a novel Self-Selection RAG framework, where the LLM is made to select from pairwise responses generated solely with internal parametric knowledge.
arXiv Detail & Related papers (2025-02-10T04:29:36Z)
Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment [69.11529841118671]
We propose a new Deliberative Recommendation task, which incorporates explicit reasoning about user preferences as an additional alignment goal. We then introduce the Reasoning-powered Recommender framework for deliberative user preference alignment.
arXiv Detail & Related papers (2025-02-04T07:17:54Z)
LLM-based Bi-level Multi-interest Learning Framework for Sequential Recommendation [54.396000434574454]
We propose a novel multi-interest SR framework combining implicit behavioral and explicit semantic perspectives. It includes two modules: the Implicit Behavioral Interest Module and the Explicit Semantic Interest Module. Experiments on four real-world datasets validate the framework's effectiveness and practicality.
arXiv Detail & Related papers (2024-11-14T13:00:23Z)
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models [8.558834738072363]
Large language models (LLMs) have been widely adopted due to their remarkable performance across various applications. These individual LLMs show limitations in generalization and performance on complex tasks due to inherent training biases, model size constraints, and the quality or diversity of pre-training datasets. We introduce SelectLLM, which efficiently directs input queries to the most suitable subset of LLMs from a large pool.
arXiv Detail & Related papers (2024-08-16T06:11:21Z)
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning [79.38140606606126]
We propose an algorithmic framework that fine-tunes vision-language models (VLMs) with reinforcement learning (RL). Our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning. We demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks.
arXiv Detail & Related papers (2024-05-16T17:50:19Z)
Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO) to combine an LLM with a smaller task-oriented dialogue (TOD) model. This approach uses the LLM as an annotation-free user simulator to assess dialogue responses, combining them with smaller fine-tuned end-to-end TOD models. Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.