Out of One, Many: Using Language Models to Simulate Human Samples
- URL: http://arxiv.org/abs/2209.06899v1
- Date: Wed, 14 Sep 2022 19:53:32 GMT
- Title: Out of One, Many: Using Language Models to Simulate Human Samples
- Authors: Lisa P. Argyle, Ethan C. Busby, Nancy Fulda, Joshua Gubler,
Christopher Rytting, David Wingate
- Abstract summary: We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is both fine-grained and demographically correlated.
We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants.
- Score: 3.278541277919869
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose and explore the possibility that language models can be studied as
effective proxies for specific human sub-populations in social science
research. Practical and research applications of artificial intelligence tools
have sometimes been limited by problematic biases (such as racism or sexism),
which are often treated as uniform properties of the models. We show that the
"algorithmic bias" within one such tool -- the GPT-3 language model -- is
instead both fine-grained and demographically correlated, meaning that proper
conditioning will cause it to accurately emulate response distributions from a
wide variety of human subgroups. We term this property "algorithmic fidelity"
and explore its extent in GPT-3. We create "silicon samples" by conditioning
the model on thousands of socio-demographic backstories from real human
participants in multiple large surveys conducted in the United States. We then
compare the silicon and human samples to demonstrate that the information
contained in GPT-3 goes far beyond surface similarity. It is nuanced,
multifaceted, and reflects the complex interplay between ideas, attitudes, and
socio-cultural context that characterize human attitudes. We suggest that
language models with sufficient algorithmic fidelity thus constitute a novel
and powerful tool to advance understanding of humans and society across a
variety of disciplines.
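Below is a minimal Python sketch of the conditioning approach the abstract describes: a first-person socio-demographic backstory is assembled into a prompt, a completion model answers a forced-choice survey item, and the answers are tallied into a "silicon sample" distribution. The backstory fields, prompt wording, and the `complete` helper are illustrative assumptions, not the paper's exact materials.

```python
# Illustrative sketch of "silicon sampling": condition a language model on a
# first-person demographic backstory, then elicit a survey response.
# Field names, prompt template, and the complete() callable are assumptions.
from collections import Counter
from typing import Callable


def backstory_prompt(person: dict, question: str, options: list[str]) -> str:
    """Build a first-person backstory followed by a forced-choice survey item."""
    backstory = (
        f"Racially, I am {person['race']}. I am {person['gender']}. "
        f"Ideologically, I am {person['ideology']}. I am {person['age']} years old. "
        f"In the last election, I voted for {person['vote_choice']}."
    )
    choices = " or ".join(options)
    return f"{backstory}\n\nQuestion: {question} ({choices})\nAnswer:"


def silicon_sample(
    people: list[dict],
    question: str,
    options: list[str],
    complete: Callable[[str], str],  # wrapper around a text-completion API
) -> Counter:
    """Query the model once per backstory and tally the first-token answers."""
    tally: Counter = Counter()
    for person in people:
        text = complete(backstory_prompt(person, question, options)).strip()
        answer = text.split()[0].rstrip(".,") if text else ""
        tally[answer] += 1
    return tally
```

Comparing such a tally against the response distribution of the matched human respondents is the basic test of algorithmic fidelity the abstract describes.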
Related papers
- Towards "Differential AI Psychology" and in-context Value-driven Statement Alignment with Moral Foundations Theory [0.0]
This work investigates the alignment between personalized language models and survey participants on a Moral Foundation questionnaire.
We adapt text-to-text models to different political personas and repeatedly administer the questionnaire to generate a synthetic population of persona and model combinations.
Our findings indicate that the adapted models struggle to reproduce the survey-based assessment of political ideologies.
arXiv Detail & Related papers (2024-08-21T08:20:41Z)
- Do GPT Language Models Suffer From Split Personality Disorder? The Advent Of Substrate-Free Psychometrics [1.1172147007388977]
We administer the same personality questionnaire, in nine languages, to a state-of-the-art language model.
Our results suggest both interlingual and intralingual instabilities, which indicate that current language models do not develop a consistent core personality.
This can lead to unsafe behaviour of artificial intelligence systems that are based on these foundation models.
arXiv Detail & Related papers (2024-08-14T08:53:00Z)
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
- Virtual Personas for Language Models via an Anthology of Backstories [5.2112564466740245]
"Anthology" is a method for conditioning large language models to particular virtual personas by harnessing open-ended life narratives.
We show that our methodology enhances the consistency and reliability of experimental outcomes while ensuring better representation of diverse sub-populations.
arXiv Detail & Related papers (2024-07-09T06:11:18Z)
- Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human-authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z)
- Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information [15.435605802794408]
Large language models exhibit societal biases associated with demographic information.
We propose "random silicon sampling," a method to emulate the opinions of the human population sub-group.
We find that language models can generate response distributions remarkably similar to the actual U.S. public opinion polls.
arXiv Detail & Related papers (2024-02-28T08:09:14Z)
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Estimating the Personality of White-Box Language Models [0.589889361990138]
Large-scale language models, which are trained on large corpora of text, are used in a wide range of applications.
Existing research shows that these models can and do capture human biases.
Many of these biases, especially those that could potentially cause harm, are being well-investigated.
However, studies that infer and change human personality traits inherited by these models have been scarce or non-existent.
arXiv Detail & Related papers (2022-04-25T23:53:53Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)