Related papers: From Keywords to Clusters: AI-Driven Analysis of YouTube Comments to Reveal Election Issue Salience in 2024

URL: http://arxiv.org/abs/2510.07821v1
Date: Thu, 09 Oct 2025 06:02:10 GMT
Title: From Keywords to Clusters: AI-Driven Analysis of YouTube Comments to Reveal Election Issue Salience in 2024
Authors: Raisa M. Simoes, Timoteo Kelly, Eduardo J. Simoes, Praveen Rao
Abstract summary: Immigration and democracy were the most frequently and consistently invoked issues in user comments on the analyzed YouTube videos. These results corroborate certain findings of post-election surveys but also refute the supposed importance of inflation as an election issue.
Score: 1.521610318673192
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper aims to explore two competing data science methodologies to attempt answering the question, "Which issues contributed most to voters' choice in the 2024 presidential election?" The methodologies involve novel empirical evidence driven by artificial intelligence (AI) techniques. By using two distinct methods based on natural language processing and clustering analysis to mine over eight thousand user comments on election-related YouTube videos from one right leaning journal, Wall Street Journal, and one left leaning journal, New York Times, during pre-election week, we quantify the frequency of selected issue areas among user comments to infer which issues were most salient to potential voters in the seven days preceding the November 5th election. Empirically, we primarily demonstrate that immigration and democracy were the most frequently and consistently invoked issues in user comments on the analyzed YouTube videos, followed by the issue of identity politics, while inflation was significantly less frequently referenced. These results corroborate certain findings of post-election surveys but also refute the supposed importance of inflation as an election issue. This indicates that variations on opinion mining, with their analysis of raw user data online, can be more revealing than polling and surveys for analyzing election outcomes.
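The keyword-based side of the approach described in the abstract (quantifying how often selected issue areas appear in user comments) can be illustrated with a minimal sketch. This is not the authors' implementation: the issue lexicon `ISSUE_KEYWORDS`, the function `issue_frequencies`, and the sample comments are all hypothetical, chosen only to show the counting idea.

```python
import re
from collections import Counter

# Hypothetical issue lexicon; the paper's actual keyword lists are not given here.
ISSUE_KEYWORDS = {
    "immigration": {"immigration", "border", "migrants"},
    "democracy": {"democracy", "fascism", "constitution"},
    "inflation": {"inflation", "prices", "groceries"},
}


def issue_frequencies(comments):
    """Count how many comments mention at least one keyword for each issue."""
    counts = Counter({issue: 0 for issue in ISSUE_KEYWORDS})
    for comment in comments:
        # Lowercase and tokenize so matching is case-insensitive and word-based.
        tokens = set(re.findall(r"[a-z']+", comment.lower()))
        for issue, keywords in ISSUE_KEYWORDS.items():
            if tokens & keywords:
                counts[issue] += 1
    return counts


# Made-up comments for illustration only.
sample = [
    "The border is the only issue I care about.",
    "Democracy is on the ballot this year.",
    "Immigration and democracy both matter to me.",
    "Nice video.",
]

freqs = issue_frequencies(sample)
for issue, n in freqs.most_common():
    print(f"{issue}: {n}/{len(sample)} comments")
```

Counting each comment at most once per issue keeps a single long comment from dominating the totals; a clustering-based variant would group comments by similarity first instead of relying on a fixed lexicon.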

Related papers

Large-Scale, Longitudinal Study of Large Language Models During the 2024 US Election Season [43.092041950140164]
The 2024 US presidential election is the first major contest to occur in the US since the popularization of large language models (LLMs). This moment raises urgent questions about how LLMs may shape the information ecosystem and influence political discourse. We conduct a large-scale, longitudinal study of 12 models, queried using a structured survey with over 12,000 questions on a near-daily cadence from July through November 2024.
arXiv Detail & Related papers (2025-09-22T22:04:19Z)
How candidates evoke identity and issues on TikTok [2.664168105033125]
We examine the final six months before the 2024 US Presidential Election to understand how major campaigns used TikTok. We frame our analysis around two political science theories. The first is the expressive (identity) model, where voters are motivated by their group memberships. We also examine how often candidates attacked opponents, reflecting literature showing attacks are common in politics.
arXiv Detail & Related papers (2025-08-26T13:27:42Z)
UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections [4.790922259120059]
We introduce the first taxonomy of common misleading narratives that circulated during recent elections in Europe. Based on this taxonomy, we construct and analyse UKElectionNarratives: the first dataset of human-annotated misleading narratives.
arXiv Detail & Related papers (2025-05-08T17:51:20Z)
Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models. Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
Opinion Mining from YouTube Captions Using ChatGPT: A Case Study of Street Interviews Polling the 2023 Turkish Elections [0.0]
We propose a novel approach for opinion mining, utilizing YouTube's auto-generated captions from public interviews as a data source. We introduce an opinion mining framework using ChatGPT to mass-annotate voting intentions and motivations. We report that ChatGPT can predict the preferred candidate with 97% accuracy and identify the correct voting motivation out of 13 possible choices with 71% accuracy based on the data collected from 325 interviews.
arXiv Detail & Related papers (2023-04-07T01:25:22Z)
Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day. We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
Novelty in news search: a longitudinal study of the 2020 US elections [62.997667081978825]
We analyze novelty, a measurement of new items that emerge in the top news search results. We find more new items emerging for election related queries compared to topical or stable queries. We argue that such imbalances affect the visibility of political candidates in news searches during electoral periods.
arXiv Detail & Related papers (2022-11-09T08:42:37Z)
Forecasting election results by studying brand importance in online news [0.0]
This study uses the semantic brand score, a novel measure of brand importance in big textual data, to forecast elections based on online news. Forecasts made for four voting events in Italy provided consistent results across different voting systems.
arXiv Detail & Related papers (2021-05-12T16:30:33Z)
The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines [68.8204255655161]
We look at the text search results for "us elections", "donald trump", "joe biden" and "bernie sanders" queries on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents.
arXiv Detail & Related papers (2021-05-03T11:18:19Z)
Mundus vult decipi, ergo decipiatur: Visual Communication of Uncertainty in Election Polls [56.8172499765118]
We discuss potential sources of bias in nowcasting and forecasting. Concepts are presented to attenuate the issue of falsely perceived accuracy. One key idea is the use of Probabilities of Events instead of party shares.
arXiv Detail & Related papers (2021-04-28T07:02:24Z)
Electoral Forecasting Using a Novel Temporal Attenuation Model: Predicting the US Presidential Elections [91.3755431537592]
We develop a novel macro-scale temporal attenuation (TA) model, which uses pre-election poll data to improve forecasting accuracy. Our hypothesis is that the timing of publicizing opinion polls plays a significant role in how opinion oscillates, especially right before elections. We present two different implementations of the TA model, which accumulate an average forecasting error of 2.8-3.28 points over the 48-year period.
arXiv Detail & Related papers (2020-04-30T09:21:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.