Do LLMs Track Public Opinion? A Multi-Model Study of Favorability Predictions in the 2024 U.S. Presidential Election
- URL: http://arxiv.org/abs/2602.06302v1
- Date: Fri, 06 Feb 2026 01:52:13 GMT
- Title: Do LLMs Track Public Opinion? A Multi-Model Study of Favorability Predictions in the 2024 U.S. Presidential Election
- Authors: Riya Parikh, Sarah H. Cen, Chara Podimata
- Abstract summary: We investigate whether Large Language Models (LLMs) can track public opinion as measured by exit polls during the 2024 U.S. presidential election cycle. We evaluate predictions from nine LLM configurations against a curated set of five high-quality polls from major organizations including Reuters, CNN, Gallup, Quinnipiac, and ABC. We conclude that off-the-shelf LLMs do not reliably track polls when queried in a straightforward manner and discuss implications for election forecasting.
- Score: 5.071155887901222
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate whether Large Language Models (LLMs) can track public opinion as measured by exit polls during the 2024 U.S. presidential election cycle. Our analysis focuses on headline favorability (e.g., "Favorable" vs. "Unfavorable") of presidential candidates across multiple LLMs queried daily throughout the election season. Using the publicly available llm-election-data-2024 dataset, we evaluate predictions from nine LLM configurations against a curated set of five high-quality polls from major organizations including Reuters, CNN, Gallup, Quinnipiac, and ABC. We find systematic directional miscalibration. For Kamala Harris, all models overpredict favorability by 10-40% relative to polls. For Donald Trump, biases are smaller (5-10%) and poll-dependent, with substantially lower cross-model variation. These deviations persist under temporal smoothing and are not corrected by internet-augmented retrieval. We conclude that off-the-shelf LLMs do not reliably track polls when queried in a straightforward manner and discuss implications for election forecasting.
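The headline comparison the abstract describes, signed deviation of daily LLM favorability predictions from a poll benchmark, before and after temporal smoothing, can be sketched minimally. The variable names and numbers below are illustrative placeholders, not the actual llm-election-data-2024 schema or its values:

```python
def rolling_mean(xs, window):
    """Trailing moving average used as a simple temporal smoother."""
    out = []
    for i in range(len(xs)):
        lo = max(0, i - window + 1)
        out.append(sum(xs[lo:i + 1]) / (i + 1 - lo))
    return out

def mean_bias(preds, poll):
    """Mean signed deviation (prediction - poll), in percentage points."""
    return sum(p - poll for p in preds) / len(preds)

# Toy daily favorability predictions (%) from one model vs. a 44% poll value.
daily_preds = [58, 61, 57, 60, 59, 62, 58]
poll_value = 44.0

raw_bias = mean_bias(daily_preds, poll_value)
smoothed_bias = mean_bias(rolling_mean(daily_preds, 3), poll_value)
```

On this toy series the bias barely moves under smoothing, mirroring the paper's observation that the deviations persist under temporal smoothing.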
Related papers
- From Keywords to Clusters: AI-Driven Analysis of YouTube Comments to Reveal Election Issue Salience in 2024 [1.521610318673192]
Immigration and democracy were the most frequently and consistently invoked issues in user comments on the analyzed YouTube videos. These results corroborate certain findings of post-election surveys but also refute the supposed importance of inflation as an election issue.
arXiv Detail & Related papers (2025-10-09T06:02:10Z)
- Hearing the Order: Investigating Selection Bias in Large Audio-Language Models [51.69003519291754]
Large audio-language models (LALMs) are often used in tasks that involve reasoning over ordered options. In this paper, we identify and analyze this problem in LALMs.
arXiv Detail & Related papers (2025-10-01T08:00:58Z)
- Large-Scale, Longitudinal Study of Large Language Models During the 2024 US Election Season [43.092041950140164]
The 2024 US presidential election is the first major contest to occur in the US since the popularization of large language models (LLMs). This moment raises urgent questions about how LLMs may shape the information ecosystem and influence political discourse. We conduct a large-scale, longitudinal study of 12 models, queried using a structured survey with over 12,000 questions on a near-daily cadence from July through November 2024.
arXiv Detail & Related papers (2025-09-22T22:04:19Z)
- A Large-Scale Simulation on Large Language Models for Decision-Making in Political Science [18.521101885334673]
We develop a theory-driven, multi-step reasoning framework to simulate voter decision-making at scale. We conduct large-scale simulations of recent U.S. presidential elections using synthetic personas calibrated to real-world voter data.
arXiv Detail & Related papers (2024-12-19T07:10:51Z)
- Towards More Accurate US Presidential Election via Multi-step Reasoning with Large Language Models [12.582222782098587]
Election prediction poses unique challenges, such as limited voter-level data, rapidly changing political landscapes, and the need to model complex human behavior. We introduce a multi-step reasoning framework designed for political analysis. Our approach is validated on real-world data from the American National Election Studies (ANES) 2016 and 2020.
arXiv Detail & Related papers (2024-10-21T06:18:53Z)
- Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion [45.84205238554709]
We generate a synthetic sample of personas matching the individual characteristics of the 2017 German Longitudinal Election Study respondents.
We ask the LLM GPT-3.5 to predict each respondent's vote choice and compare these predictions to the survey-based estimates.
We find that GPT-3.5 does not predict citizens' vote choice accurately, exhibiting a bias towards the Green and Left parties.
arXiv Detail & Related papers (2024-07-11T14:52:18Z)
- Election Polls on Social Media: Prevalence, Biases, and Voter Fraud Beliefs [5.772751069162341]
This study focuses on the 2020 presidential elections in the U.S.
We find that Twitter polls are disproportionately authored by older males and exhibit a large bias towards candidate Donald Trump.
We also find that Twitter accounts participating in election polls are more likely to be bots, and that election poll outcomes tend to be more biased before election day than after.
arXiv Detail & Related papers (2024-05-18T02:29:35Z)
- Whose Side Are You On? Investigating the Political Stance of Large Language Models [56.883423489203786]
We investigate the political orientation of Large Language Models (LLMs) across a spectrum of eight polarizing topics, spanning from abortion to LGBTQ issues.
The findings suggest that users should be mindful when crafting queries, and exercise caution in selecting neutral prompt language.
arXiv Detail & Related papers (2024-03-15T04:02:24Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs). This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
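The core idea behind this kind of prior separation, isolating a model's positional preference over option IDs from its preference over option contents, can be illustrated with a toy sketch. The `toy_model`, `CONTENT_SCORE`, and `POSITION_BIAS` below are hypothetical stand-ins, not the paper's PriDe implementation:

```python
# Conceptual sketch: estimate the model's prior over option positions by
# averaging its answer distribution across cyclic permutations of the
# option contents, then divide that prior out of a single observation.

CONTENT_SCORE = {"good": 3.0, "meh": 2.0, "bad": 1.0}  # hypothetical quality
POSITION_BIAS = [0.6, 0.3, 0.1]                        # toy model favors slot A

def toy_model(contents):
    """Stand-in for an LLM: P(option i) ~ content quality x position bias."""
    scores = [CONTENT_SCORE[c] * b for c, b in zip(contents, POSITION_BIAS)]
    z = sum(scores)
    return [s / z for s in scores]

def estimate_prior(prob_fn, contents):
    """Average option-ID distributions over cyclic permutations of the
    contents; content effects wash out, leaving the positional prior."""
    n = len(contents)
    prior = [0.0] * n
    for k in range(n):
        probs = prob_fn(contents[k:] + contents[:k])
        for i, p in enumerate(probs):
            prior[i] += p / n
    return prior

def debias(probs, prior):
    """Divide out the positional prior and renormalize."""
    adjusted = [p / q for p, q in zip(probs, prior)]
    z = sum(adjusted)
    return [a / z for a in adjusted]

order = ["bad", "meh", "good"]           # best answer placed in the last slot
raw = toy_model(order)                   # position bias buries "good"
prior = estimate_prior(toy_model, order)
clean = debias(raw, prior)               # "good" now gets the top probability
```

In this sketch the raw distribution favors the first slot purely because of `POSITION_BIAS`, while the debiased distribution recovers the best-content option regardless of where it was placed.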
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
- Electoral Forecasting Using a Novel Temporal Attenuation Model: Predicting the US Presidential Elections [91.3755431537592]
We develop a novel macro-scale temporal attenuation (TA) model, which uses pre-election poll data to improve forecasting accuracy.
Our hypothesis is that the timing of publicizing opinion polls plays a significant role in how opinion oscillates, especially right before elections.
We present two different implementations of the TA model, which accumulate an average forecasting error of 2.8-3.28 points over the 48-year period.
arXiv Detail & Related papers (2020-04-30T09:21:52Z)
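The attenuation idea in the entry above, down-weighting polls by how far before the election they were published, can be sketched with a simple recency-weighted average. The half-life parameterization is an assumption for illustration, not the authors' TA model:

```python
import math

def attenuated_forecast(polls, election_day, half_life_days=14.0):
    """Aggregate (day, value) polls with exponentially decaying weights,
    so a poll published half_life_days earlier counts half as much."""
    num = den = 0.0
    for day, value in polls:
        weight = math.exp(-math.log(2) * (election_day - day) / half_life_days)
        num += weight * value
        den += weight
    return num / den

# A stale 40% poll vs. a fresh 50% poll: the forecast leans toward the fresh one.
forecast = attenuated_forecast([(0, 40.0), (30, 50.0)],
                               election_day=30, half_life_days=15.0)
```

With a 15-day half-life, the 30-day-old poll carries a weight of 0.25 versus 1.0 for the fresh poll, pulling the aggregate toward the recent reading.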
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.