Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions
- URL: http://arxiv.org/abs/2504.11673v1
- Date: Wed, 16 Apr 2025 00:10:34 GMT
- Title: Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions
- Authors: Minwoo Kang, Suhong Moon, Seung Hyeong Lee, Ayush Raj, Joseph Suh, David M. Chan,
- Abstract summary: Large language models (LLMs) are increasingly capable of simulating human behavior.<n>We propose a novel methodology for constructing virtual personas with synthetic user backstories" generated as extended, multi-turn interview transcripts.<n>Our generated backstories are longer, rich in detail, and consistent in authentically describing a singular individual, compared to previous methods.
- Score: 4.234771450043289
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) are increasingly capable of simulating human behavior, offering cost-effective ways to estimate user responses during the early phases of survey design. While previous studies have examined whether models can reflect individual opinions or attitudes, we argue that a \emph{higher-order} binding of virtual personas requires successfully approximating not only the opinions of a user as an identified member of a group, but also the nuanced ways in which that user perceives and evaluates those outside the group. In particular, faithfully simulating how humans perceive different social groups is critical for applying LLMs to various political science studies, including timely topics on polarization dynamics, inter-group conflict, and democratic backsliding. To this end, we propose a novel methodology for constructing virtual personas with synthetic user ``backstories" generated as extended, multi-turn interview transcripts. Our generated backstories are longer, rich in detail, and consistent in authentically describing a singular individual, compared to previous methods. We show that virtual personas conditioned on our backstories closely replicate human response distributions (up to an 87\% improvement as measured by Wasserstein Distance) and produce effect sizes that closely match those observed in the original studies. Altogether, our work extends the applicability of LLMs beyond estimating individual self-opinions, enabling their use in a broader range of human studies.
Related papers
- Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models [26.333097337393685]
The tendency of users to anthropomorphise large language models (LLMs) is of growing interest to AI developers, researchers, and policy-makers.
Here, we present a novel method for empirically evaluating anthropomorphic LLM behaviours in realistic and varied settings.
First, we develop a multi-turn evaluation of 14 anthropomorphic behaviours.
Second, we present a scalable, automated approach by employing simulations of user interactions.
Third, we conduct an interactive, large-scale human subject study (N=1101) to validate that the model behaviours we measure predict real users' anthropomorphic perceptions.
arXiv Detail & Related papers (2025-02-10T22:09:57Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Virtual Personas for Language Models via an Anthology of Backstories [5.2112564466740245]
"Anthology" is a method for conditioning large language models to particular virtual personas by harnessing open-ended life narratives.
We show that our methodology enhances the consistency and reliability of experimental outcomes while ensuring better representation of diverse sub-populations.
arXiv Detail & Related papers (2024-07-09T06:11:18Z) - A Survey on Human Preference Learning for Large Language Models [81.41868485811625]
The recent surge of versatile large language models (LLMs) largely depends on aligning increasingly capable foundation models with human intentions by preference learning.
This survey covers the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
arXiv Detail & Related papers (2024-06-17T03:52:51Z) - Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z) - On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z) - Do LLMs exhibit human-like response biases? A case study in survey
design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z) - Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural
Language Generation [68.9440575276396]
This survey aims to provide an overview of the recent research that has leveraged human feedback to improve natural language generation.
First, we introduce an encompassing formalization of feedback, and identify and organize existing research into a taxonomy following this formalization.
Second, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using the feedback or training feedback models.
Third, we provide an overview of the nascent field of AI feedback, which exploits large language models to make judgments based on a set of principles and minimize the need for
arXiv Detail & Related papers (2023-05-01T17:36:06Z) - Out of One, Many: Using Language Models to Simulate Human Samples [3.278541277919869]
We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is both fine-grained and demographically correlated.
We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants.
arXiv Detail & Related papers (2022-09-14T19:53:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.