Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
- URL: http://arxiv.org/abs/2208.10264v5
- Date: Sun, 9 Jul 2023 18:27:27 GMT
- Title: Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
- Authors: Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Abstract summary: We introduce a new type of test, called a Turing Experiment (TE).
A TE can reveal consistent distortions in a language model's simulation of a specific human behavior.
We compare how well different language models are able to reproduce classic economic, psycholinguistic, and social psychology experiments.
- Score: 7.696359453385686
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a new type of test, called a Turing Experiment (TE), for
evaluating to what extent a given language model, such as GPT models, can
simulate different aspects of human behavior. A TE can also reveal consistent
distortions in a language model's simulation of a specific human behavior.
Unlike the Turing Test, which involves simulating a single arbitrary
individual, a TE requires simulating a representative sample of participants in
human subject research. We carry out TEs that attempt to replicate
well-established findings from prior studies. We design a methodology for
simulating TEs and illustrate its use to compare how well different language
models are able to reproduce classic economic, psycholinguistic, and social
psychology experiments: Ultimatum Game, Garden Path Sentences, Milgram Shock
Experiment, and Wisdom of Crowds. In the first three TEs, the existing findings
were replicated using recent models, while the last TE reveals a
"hyper-accuracy distortion" present in some language models (including ChatGPT
and GPT-4), which could affect downstream applications in education and the
arts.
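The TE methodology described above can be illustrated with a minimal sketch (not the authors' code): simulate many distinct participants by varying a name in the prompt, query a language model for each, and aggregate the responses into a distribution. The surname list, prompt wording, and `fake_model` stand-in below are illustrative assumptions; a real TE would replace `fake_model` with an actual language-model call.

```python
# Sketch of a Turing Experiment (TE): simulate a representative sample of
# participants in the Ultimatum Game by varying the participant's surname,
# then aggregate accept/reject responses into a distribution.

import random

SURNAMES = ["Smith", "Johnson", "Williams", "Brown", "Jones"]


def build_prompt(surname: str, offer: int) -> str:
    """Ultimatum Game responder prompt for one simulated participant."""
    return (
        f"{surname} is offered {offer} out of $10 by another player. "
        f"If {surname} rejects, both players get nothing. "
        f"Does {surname} accept or reject? Answer:"
    )


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call; accepts offers of $3 or more."""
    offer = int(prompt.split("offered ")[1].split(" out")[0])
    return "accept" if offer >= 3 else "reject"


def run_te(offer: int, n: int = 100, model=fake_model) -> float:
    """Fraction of n simulated participants who accept a given offer."""
    rng = random.Random(0)  # fixed seed so the sample is reproducible
    responses = [model(build_prompt(rng.choice(SURNAMES), offer)) for _ in range(n)]
    return responses.count("accept") / n
```

Comparing the resulting acceptance curve against the human acceptance curve from the economics literature is what lets a TE detect distortions in the model's simulation of that behavior.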
Related papers
- Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models [55.898594710420326]
We propose a novel spontaneous speech synthesis system based on language models.
Fine-grained prosody modeling is introduced to enhance the model's ability to capture subtle prosody variations in spontaneous speech.
arXiv Detail & Related papers (2024-07-18T13:42:38Z)
- Toward In-Context Teaching: Adapting Examples to Students' Misconceptions [54.82965010592045]
We introduce a suite of models and evaluation methods we call AdapT.
AToM is a new probabilistic model for adaptive teaching that jointly infers students' past beliefs and optimizes for the correctness of future beliefs.
Our results highlight both the difficulty of the adaptive teaching task and the potential of learned adaptive models for solving it.
arXiv Detail & Related papers (2024-05-07T17:05:27Z)
- Simulating Family Conversations using LLMs: Demonstration of Parenting Styles [0.0]
This study presents a framework for conducting psychological and linguistic research through simulated conversations using large language models (LLMs).
The proposed methodology offers significant advantages, particularly for simulating human interactions involving potentially unethical language or behaviors.
arXiv Detail & Related papers (2024-03-10T09:18:43Z)
- Using Artificial Populations to Study Psychological Phenomena in Neural Models [0.0]
Investigation of cognitive behavior in language models must be conducted in an appropriate population for the results to be meaningful.
We leverage work in uncertainty estimation in a novel approach to efficiently construct experimental populations.
We provide theoretical grounding in the uncertainty estimation literature and motivation from current cognitive work regarding language models.
arXiv Detail & Related papers (2023-08-15T20:47:51Z)
- Simulating H.P. Lovecraft horror literature with the ChatGPT large language model [0.0]
We present a novel approach to simulating H.P. Lovecraft's horror literature using the ChatGPT large language model, specifically the GPT-4 architecture.
Our study aims to generate text that emulates Lovecraft's unique writing style and themes, while also examining the effectiveness of prompt engineering techniques in guiding the model's output.
arXiv Detail & Related papers (2023-05-05T11:03:03Z)
- Out of One, Many: Using Language Models to Simulate Human Samples [3.278541277919869]
We show that the "algorithmic bias" within one such tool -- the GPT-3 language model -- is both fine-grained and demographically correlated.
We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants.
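The "silicon sample" idea above can be sketched in a few lines (not the paper's code): condition the model on a first-person socio-demographic backstory before asking the survey question. The profile fields and prompt wording below are illustrative assumptions, not the paper's actual templates.

```python
# Sketch of "silicon sampling": turn each participant's demographic
# profile into a first-person backstory prompt that conditions the model
# before the survey question is posed.

def backstory_prompt(profile: dict, question: str) -> str:
    """Build a first-person conditioning prompt from one participant profile."""
    return (
        f"I am {profile['age']} years old. I am {profile['gender']}. "
        f"Politically, I identify as {profile['ideology']}. "
        f"{question} I would answer:"
    )


# Two illustrative profiles; a real study would draw thousands from survey data.
profiles = [
    {"age": 34, "gender": "a woman", "ideology": "a moderate"},
    {"age": 58, "gender": "a man", "ideology": "a conservative"},
]
prompts = [backstory_prompt(p, "Do you support raising the minimum wage?") for p in profiles]
```

Each prompt would then be completed by the language model, and the completions compared against the real participants' recorded answers to measure how fine-grained the model's demographic conditioning is.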
arXiv Detail & Related papers (2022-09-14T19:53:32Z)
- Naturalistic Causal Probing for Morpho-Syntax [76.83735391276547]
We suggest a naturalistic strategy for input-level intervention on real world data in Spanish.
Using our approach, we isolate morpho-syntactic features from confounders in sentences.
We apply this methodology to analyze causal effects of gender and number on contextualized representations extracted from pre-trained models.
arXiv Detail & Related papers (2022-05-14T11:47:58Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Empowering Language Understanding with Counterfactual Reasoning [141.48592718583245]
We propose a Counterfactual Reasoning Model, which mimics the counterfactual thinking by learning from few counterfactual samples.
In particular, we devise a generation module to generate representative counterfactual samples for each factual sample, and a retrospective module to retrospect the model prediction by comparing the counterfactual and factual samples.
arXiv Detail & Related papers (2021-06-06T06:36:52Z)
- Exploring emotional prototypes in a high dimensional TTS latent space [3.4404376509754506]
We search the prosodic latent space in a trained GST Tacotron model to explore prototypes of emotional prosody.
We demonstrate that particular regions of the model's latent space are reliably associated with particular emotions.
arXiv Detail & Related papers (2021-05-05T06:49:21Z)
- Reverse Engineering Configurations of Neural Text Generation Models [86.9479386959155]
The study of artifacts that emerge in machine generated text as a result of modeling choices is a nascent research area.
We conduct an extensive suite of diagnostic tests to observe whether modeling choices leave detectable artifacts in the text they generate.
Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by observing the generated text alone.
arXiv Detail & Related papers (2020-04-13T21:02:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including this list) and is not responsible for any consequences of its use.