BASES: Large-scale Web Search User Simulation with Large Language Model
based Agents
- URL: http://arxiv.org/abs/2402.17505v1
- Date: Tue, 27 Feb 2024 13:44:09 GMT
- Title: BASES: Large-scale Web Search User Simulation with Large Language Model
based Agents
- Authors: Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu,
Ji-Rong Wen, Haifeng Wang
- Abstract summary: BASES is a novel user simulation framework with large language models (LLMs)
Our simulation framework can generate unique user profiles at scale, which subsequently leads to diverse search behaviors.
WARRIORS is a new large-scale dataset encompassing web search user behaviors, including both Chinese and English versions.
- Score: 108.97507653131917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the excellent capacities of large language models (LLMs), it becomes
feasible to develop LLM-based agents for reliable user simulation. Considering
the scarcity and limit (e.g., privacy issues) of real user data, in this paper,
we conduct large-scale user simulation for web search, to improve the analysis
and modeling of user search behavior. Specially, we propose BASES, a novel user
simulation framework with LLM-based agents, designed to facilitate
comprehensive simulations of web search user behaviors. Our simulation
framework can generate unique user profiles at scale, which subsequently leads
to diverse search behaviors. To demonstrate the effectiveness of BASES, we
conduct evaluation experiments based on two human benchmarks in both Chinese
and English, demonstrating that BASES can effectively simulate large-scale
human-like search behaviors. To further accommodate the research on web search,
we develop WARRIORS, a new large-scale dataset encompassing web search user
behaviors, including both Chinese and English versions, which can greatly
bolster research in the field of information retrieval. Our code and data will
be publicly released soon.
Related papers
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.
As a testbed, we use country-level results from two global cultural surveys.
We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions.
arXiv Detail & Related papers (2025-02-10T21:59:27Z) - Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation [51.20656279478878]
MATRIX is a multi-agent simulator that automatically generates diverse text-based scenarios.
We introduce MATRIX-Gen for controllable and highly realistic data synthesis.
On AlpacaEval 2 and Arena-Hard benchmarks, Llama-3-8B-Base, post-trained on datasets synthesized by MATRIX-Gen with just 20K instruction-response pairs, outperforms Meta's Llama-3-8B-Instruct model.
arXiv Detail & Related papers (2024-10-18T08:01:39Z) - GenSim: A General Social Simulation Platform with Large Language Model based Agents [111.00666003559324]
We propose a novel large language model (LLMs)-based simulation platform called textitGenSim.
Our platform supports one hundred thousand agents to better simulate large-scale populations in real-world contexts.
To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform.
arXiv Detail & Related papers (2024-10-06T05:02:23Z) - Agentic Society: Merging skeleton from real world and texture from Large Language Model [4.740886789811429]
This paper explores a novel framework that leverages census data and large language models to generate virtual populations.
We show that our method produces personas with variability essential for simulating diverse human behaviors in social science experiments.
But the evaluation result shows that only weak sign of statistical truthfulness can be produced due to limited capability of current LLMs.
arXiv Detail & Related papers (2024-09-02T08:28:19Z) - How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation [14.646529557978512]
We analyze the limitations of using Large Language Models in constructing user simulators for Conversational Recommender System.
Data leakage, which occurs in conversational history and the user simulator's replies, results in inflated evaluation results.
We propose SimpleUserSim, employing a straightforward strategy to guide the topic toward the target items.
arXiv Detail & Related papers (2024-03-25T04:21:06Z) - USimAgent: Large Language Models for Simulating Search Users [33.17004578463697]
We introduce a Large Language Models-based user search behavior simulator, USimAgent.
The simulator can simulate users' querying, clicking, and stopping behaviors during search.
Empirical investigation on a real user behavior dataset shows that the simulator outperforms existing methods in query generation.
arXiv Detail & Related papers (2024-03-14T07:40:54Z) - Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval [56.65147231836708]
We develop SWIM-IR, a synthetic retrieval training dataset containing 33 languages for fine-tuning multilingual dense retrievers.
SAP assists the large language model (LLM) in generating informative queries in the target language.
Our models, called SWIM-X, are competitive with human-supervised dense retrieval models.
arXiv Detail & Related papers (2023-11-10T00:17:10Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the simulated behaviors of our method are very close to the ones of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - Imitate TheWorld: A Search Engine Simulation Platform [13.011052642314421]
We build a simulated search engine AESim that can properly give feedback by a well-trained discriminator for generated pages.
Different from previous simulation platforms which lose connection with the real world, ours depends on the real data in Search.
Our experiments also show AESim can better reflect the online performance of ranking models than classic ranking metrics.
arXiv Detail & Related papers (2021-07-16T03:55:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.