Can Large Language Models Capture Public Opinion about Global Warming?
An Empirical Assessment of Algorithmic Fidelity and Bias
- URL: http://arxiv.org/abs/2311.00217v2
- Date: Thu, 8 Feb 2024 03:49:46 GMT
- Title: Can Large Language Models Capture Public Opinion about Global Warming?
An Empirical Assessment of Algorithmic Fidelity and Bias
- Authors: S. Lee, T. Q. Peng, M. H. Goldberg, S. A. Rosenthal, J. E. Kotcher, E. W. Maibach and A. Leiserowitz
- Abstract summary: Large language models (LLMs) have demonstrated their potential in social science research by emulating human perceptions and behaviors.
This study assesses the algorithmic fidelity and bias of LLMs by utilizing two nationally representative climate change surveys.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) have demonstrated their potential in social
science research by emulating human perceptions and behaviors, a concept
referred to as algorithmic fidelity. This study assesses the algorithmic
fidelity and bias of LLMs by utilizing two nationally representative climate
change surveys. The LLMs were conditioned on demographics and/or psychological
covariates to simulate survey responses. The findings indicate that LLMs can
effectively capture presidential voting behaviors but encounter challenges in
accurately representing global warming perspectives when relevant covariates
are not included. GPT-4 exhibits improved performance when conditioned on both
demographics and covariates. However, disparities emerge in LLM estimations of
the views of certain groups, with LLMs tending to underestimate worry about
global warming among Black Americans. While highlighting the potential of LLMs
to aid social science research, these results underscore the importance of
meticulous conditioning, model selection, survey question format, and bias
assessment when employing LLMs for survey simulation. Further investigation
into prompt engineering and algorithm auditing is essential to harness the
power of LLMs while addressing their inherent limitations.
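To make the conditioning setup concrete, the sketch below shows one way a demographically conditioned survey simulation could be prompted. The persona template, covariate fields (including the worry score), answer options, and the use of the OpenAI chat API are illustrative assumptions, not the authors' protocol or prompts.

```python
# Minimal sketch of demographic/covariate conditioning for survey simulation.
# All field names, prompt wording, and answer options below are assumptions
# made for illustration; they do not reproduce the paper's prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def simulate_response(profile: dict, question: str, options: list[str]) -> str:
    """Ask the model to answer one survey item in the voice of one respondent."""
    persona = (
        f"You are a {profile['age']}-year-old {profile['race']} {profile['gender']} "
        f"from {profile['state']} who identifies as a {profile['party']} and voted "
        f"for {profile['vote_2020']} in 2020. On a 0-100 scale, your worry about "
        f"global warming is {profile['worry']}."
    )
    user_prompt = (
        f"{question}\n"
        f"Choose exactly one option: {', '.join(options)}.\n"
        "Answer with the option text only."
    )
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": user_prompt},
        ],
        temperature=1.0,  # sample so repeated calls approximate a response distribution
    )
    return reply.choices[0].message.content.strip()


# A single hypothetical respondent profile; in practice each survey respondent
# would supply one profile, and the simulated answers would be aggregated.
respondent = {
    "age": 46, "race": "Black", "gender": "woman", "state": "Georgia",
    "party": "Democrat", "vote_2020": "Joe Biden", "worry": 72,
}
print(simulate_response(
    respondent,
    "How worried are you about global warming?",
    ["Very worried", "Somewhat worried", "Not very worried", "Not at all worried"],
))
```

Dropping the worry field from the persona would correspond to the demographics-only condition contrasted in the abstract with conditioning on both demographics and psychological covariates.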
Related papers
- Hate Personified: Investigating the role of LLMs in content moderation [64.26243779985393]
For subjective tasks such as hate detection, where people perceive hate differently, the ability of large language models (LLMs) to represent diverse groups is unclear.
By including additional context in prompts, we analyze LLM sensitivity to geographical priming, persona attributes, and numerical information to assess how well the needs of various groups are reflected.
arXiv Detail & Related papers (2024-10-03T16:43:17Z)
- United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections [45.84205238554709]
Large language models (LLMs) are perceived by some as having the potential to revolutionize social science research.
In this study, we examine to what extent LLM-based predictions of public opinion exhibit context-dependent biases.
We predict voting behavior in the 2024 European Parliament elections using a state-of-the-art LLM.
arXiv Detail & Related papers (2024-08-29T16:01:06Z)
- Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion [45.84205238554709]
We generate a synthetic sample of personas matching the individual characteristics of the 2017 German Longitudinal Election Study respondents.
We ask the LLM GPT-3.5 to predict each respondent's vote choice and compare these predictions to the survey-based estimates.
We find that GPT-3.5 does not predict citizens' vote choice accurately, exhibiting a bias towards the Green and Left parties.
arXiv Detail & Related papers (2024-07-11T14:52:18Z)
- Unlearning Climate Misinformation in Large Language Models [17.95497650321137]
Misinformation regarding climate change is a key roadblock in addressing one of the most serious threats to humanity.
This paper investigates factual accuracy in large language models (LLMs) regarding climate information.
arXiv Detail & Related papers (2024-05-29T23:11:53Z)
- Explaining Large Language Models Decisions Using Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes.
However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain.
This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output; a minimal illustrative sketch of the idea appears after this list.
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
- Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large language models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
- Do LLMs exhibit human-like response biases? A case study in survey design [66.1850490474361]
We investigate the extent to which large language models (LLMs) reflect human response biases, if at all.
We design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires.
Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior.
arXiv Detail & Related papers (2023-11-07T15:40:43Z)
- CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations [61.9212914612875]
We present a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic.
We use this framework to measure open-ended LLM simulations' susceptibility to caricature, defined via two criteria: individuation and exaggeration.
We find that for GPT-4, simulations of certain demographics (political and marginalized groups) and topics (general, uncontroversial) are highly susceptible to caricature.
arXiv Detail & Related papers (2023-10-17T18:00:25Z)
- Assessing Large Language Models on Climate Information [5.034118180129635]
We present a comprehensive evaluation framework, grounded in science communication research, to assess large language models (LLMs).
Our framework emphasizes both presentational and epistemological adequacy, offering a fine-grained analysis of LLM generations spanning 8 dimensions and 30 issues.
We introduce a novel protocol for scalable oversight that relies on AI Assistance and raters with relevant education.
arXiv Detail & Related papers (2023-10-04T16:09:48Z)
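Returning to the Shapley-values entry above: as a companion illustration, the sketch below computes exact Shapley attributions over three hypothetical prompt components. The component names and the mock value function (standing in for a model-based score such as the probability of a target answer) are assumptions for illustration and do not reproduce that paper's setup.

```python
# Minimal sketch of exact Shapley attribution over prompt components.
# The components and the mock scores are illustrative stand-ins; in practice
# value() would query a model with only the listed components in the prompt.
from itertools import combinations
from math import factorial

components = ["persona", "question", "answer_options"]


def value(subset: frozenset[str]) -> float:
    """Mock score of the model output when only `subset` is in the prompt."""
    mock_scores = {
        frozenset(): 0.25,
        frozenset({"persona"}): 0.40,
        frozenset({"question"}): 0.30,
        frozenset({"answer_options"}): 0.28,
        frozenset({"persona", "question"}): 0.60,
        frozenset({"persona", "answer_options"}): 0.45,
        frozenset({"question", "answer_options"}): 0.35,
        frozenset({"persona", "question", "answer_options"}): 0.70,
    }
    return mock_scores[subset]


def shapley(player: str) -> float:
    """Exact Shapley value: weighted average marginal contribution of `player`."""
    others = [c for c in components if c != player]
    n = len(components)
    total = 0.0
    for k in range(len(others) + 1):
        for coalition in combinations(others, k):
            s = frozenset(coalition)
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (value(s | {player}) - value(s))
    return total


for c in components:
    print(f"{c}: {shapley(c):.3f}")
```

The attributions sum to the difference between the full-prompt score and the empty-prompt score, which is the efficiency property that makes Shapley values a natural way to apportion a model's output across prompt components.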