What Persona Are We Missing? Identifying Unknown Relevant Personas for Faithful User Simulation
- URL: http://arxiv.org/abs/2602.15832v1
- Date: Sat, 03 Jan 2026 16:22:00 GMT
- Title: What Persona Are We Missing? Identifying Unknown Relevant Personas for Faithful User Simulation
- Authors: Weiwen Su, Yuhan Zhou, Zihan Wang, Naoki Yoshinaga, Masashi Toyoda
- Abstract summary: Existing user simulations, where models generate user-like responses in dialogue, often lack verification that sufficient user personas are provided. This work explores the task of identifying relevant but unknown personas of the simulation target for a given simulation context. We introduce PICQ, a novel dataset of context-aware choice questions, annotated with unknown personas.
- Score: 16.797868883640255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing user simulations, where models generate user-like responses in dialogue, often lack verification that sufficient user personas are provided, which calls into question the validity of the simulations. To address this core concern, this work explores the task of identifying relevant but unknown personas of the simulation target for a given simulation context. We introduce PICQ, a novel dataset of context-aware choice questions annotated with unknown personas (e.g., "Is the user price-sensitive?") that may influence user choices, and propose a multi-faceted evaluation scheme assessing fidelity, influence, and inaccessibility. Our benchmark of leading LLMs reveals a complex "Fidelity vs. Insight" dilemma governed by model scale: while influence generally scales with model size, fidelity to human patterns follows an inverted U-shaped curve. We trace this phenomenon to cognitive differences, particularly the human tendency toward "cognitive economy." Our work provides the first comprehensive benchmark for this crucial task, offering a new lens for understanding the divergent cognitive models of humans and advanced LLMs.
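To make the abstract's evaluation scheme concrete, here is a minimal illustrative sketch of a fidelity-style score: agreement between the unknown personas a model flags as relevant and those human annotators flagged. The metric definition (Jaccard overlap) and all persona names are assumptions for illustration only, not PICQ's actual metrics or data.

```python
# Hypothetical fidelity-style score for unknown-persona identification.
# Jaccard overlap between model-flagged and human-flagged personas;
# this definition is an illustrative assumption, not the paper's metric.

def fidelity(model_personas, human_personas):
    """Jaccard overlap between two sets of persona labels, in [0, 1]."""
    m, h = set(model_personas), set(human_personas)
    if not m and not h:
        return 1.0  # vacuous agreement when both sides flag nothing
    return len(m & h) / len(m | h)

# Toy example: one shared persona out of three distinct labels overall.
model = ["price-sensitive", "brand-loyal"]
human = ["price-sensitive", "time-constrained"]
print(round(fidelity(model, human), 3))  # → 0.333
```

A real fidelity measure would also need to handle paraphrased persona questions (e.g., via semantic matching rather than exact string equality), which this sketch deliberately omits.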
Related papers
- HumanLLM: Towards Personalized Understanding and Simulation of Human Nature [72.55730315685837]
HumanLLM is a foundation model designed for personalized understanding and simulation of individuals. We first construct the Cognitive Genome, a large-scale corpus curated from real-world user data on platforms like Reddit, Twitter, Blogger, and Amazon. We then formulate diverse learning tasks and perform supervised fine-tuning to empower the model to predict a wide range of individualized human behaviors, thoughts, and experiences.
arXiv Detail & Related papers (2026-01-22T09:27:27Z) - See, Think, Act: Online Shopper Behavior Simulation with VLM Agents [58.92444959954643]
This paper investigates the integration of visual information, specifically webpage screenshots, into behavior simulation via VLMs. We employ SFT for joint action prediction and rationale generation, conditioning on the full interaction context. To further enhance reasoning capabilities, we integrate RL with a hierarchical reward structure, scaled by a difficulty-aware factor.
arXiv Detail & Related papers (2025-10-22T05:07:14Z) - SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors [58.87134689752605]
We introduce SimBench, the first large-scale, standardized benchmark for a robust, reproducible science of LLM simulation. We show that even the best LLMs today have limited simulation ability (score: 40.80/100) and that performance scales log-linearly with model size. We demonstrate that simulation ability correlates most strongly with deep, knowledge-intensive reasoning.
arXiv Detail & Related papers (2025-10-20T13:14:38Z) - DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios [57.327907850766785]
The characterization of deception across realistic real-world scenarios remains underexplored. We establish DeceptionBench, the first benchmark that systematically evaluates how deceptive tendencies manifest across different domains. On the intrinsic dimension, we explore whether models exhibit self-interested egoistic tendencies or sycophantic behaviors that prioritize user appeasement. We incorporate sustained multi-turn interaction loops to construct a more realistic simulation of real-world feedback dynamics.
arXiv Detail & Related papers (2025-10-17T10:14:26Z) - Human vs. Agent in Task-Oriented Conversations [22.743152820695588]
This work presents the first systematic comparison between large language model (LLM)-simulated users and human users in personalized task-oriented conversations. Our analysis reveals significant behavioral differences between the two user types in problem-solving approaches.
arXiv Detail & Related papers (2025-09-22T11:30:39Z) - Preference Learning for AI Alignment: a Causal Perspective [55.2480439325792]
We frame this problem in a causal paradigm, providing the rich toolbox of causality to identify persistent challenges. Inheriting from the literature of causal inference, we identify key assumptions necessary for reliable generalisation. We illustrate failure modes of naive reward models and demonstrate how causally-inspired approaches can improve model robustness.
arXiv Detail & Related papers (2025-06-06T10:45:42Z) - Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models [20.077019480409657]
The tendency of users to anthropomorphise large language models (LLMs) is of growing interest to AI developers, researchers, and policy-makers. Here, we present a novel method for empirically evaluating anthropomorphic LLM behaviours in realistic and varied settings. First, we develop a multi-turn evaluation of 14 anthropomorphic behaviours. Second, we present a scalable, automated approach by employing simulations of user interactions. Third, we conduct an interactive, large-scale human subject study (N=1101) to validate that the model behaviours we measure predict real users' anthropomorphic perceptions.
arXiv Detail & Related papers (2025-02-10T22:09:57Z) - User Behavior Simulation with Large Language Model based Agents [116.74368915420065]
We propose an LLM-based agent framework and design a sandbox environment to simulate real user behaviors.
Based on extensive experiments, we find that the behaviors simulated by our method closely match those of real humans.
arXiv Detail & Related papers (2023-06-05T02:58:35Z) - User Ex Machina: Simulation as a Design Probe in Human-in-the-Loop Text Analytics [29.552736183006672]
We conduct a simulation-based analysis of human-centered interactions with topic models.
We find that user interactions have impacts that vary in magnitude but often degrade the quality of the resulting models.
arXiv Detail & Related papers (2021-01-06T19:44:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.