How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations
- URL: http://arxiv.org/abs/2006.01447v1
- Date: Tue, 2 Jun 2020 08:33:30 GMT
- Title: How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations
- Authors: Kai-Cheng Yang, Pik-Mai Hui, Filippo Menczer
- Abstract summary: Recent studies reveal the existence of inauthentic actors such as malicious social bots and trolls.
In this paper, we aim to close this gap using Twitter data from the 2018 U.S. midterm elections.
We show that hyperactive accounts are more likely to exhibit various suspicious behaviors and share low-credibility information.
- Score: 6.364128212193265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online social media are key platforms for the public to discuss political
issues. As a result, researchers have used data from these platforms to analyze
public opinions and forecast election results. Recent studies reveal the
existence of inauthentic actors such as malicious social bots and trolls,
suggesting that not every message is a genuine expression from a legitimate
user. However, the prevalence of inauthentic activities in social data streams
is still unclear, making it difficult to gauge biases of analyses based on such
data. In this paper, we aim to close this gap using Twitter data from the 2018
U.S. midterm elections. Hyperactive accounts are over-represented in volume
samples. We compare their characteristics with those of randomly sampled
accounts and self-identified voters using a fast and low-cost heuristic. We
show that hyperactive accounts are more likely to exhibit various suspicious
behaviors and share low-credibility information compared to likely voters.
Random accounts are more similar to likely voters, although they have slightly
higher chances to display suspicious behaviors. Our work provides insights into
biased voter characterizations when using online observations, underlining the
importance of accounting for inauthentic actors in studies of political issues
based on social media data.
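The over-representation of hyperactive accounts in volume samples can be illustrated with a small simulation. The sketch below is not the authors' method or data: the population sizes, tweet counts, and helper names (`build_tweet_stream`, `hyperactive_share`) are hypothetical, chosen only to show why sampling tweets and sampling accounts give different pictures of who is talking.

```python
import random

def build_tweet_stream(tweets_per_account):
    """Expand {account_id: tweet_count} into a flat list with one entry per tweet."""
    return [acct for acct, n in tweets_per_account.items() for _ in range(n)]

def hyperactive_share(sampled_accounts, hyperactive):
    """Fraction of sampled items that belong to hyperactive accounts."""
    return sum(a in hyperactive for a in sampled_accounts) / len(sampled_accounts)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic population (assumed numbers): 950 ordinary accounts with 2 tweets
# each and 50 hyperactive accounts with 200 tweets each.
tweets_per_account = {f"user{i}": 2 for i in range(950)}
tweets_per_account.update({f"hyper{i}": 200 for i in range(50)})
hyperactive = {a for a in tweets_per_account if a.startswith("hyper")}

# Volume sample: draw tweets uniformly, so each account is picked in
# proportion to how much it posts.
stream = build_tweet_stream(tweets_per_account)
volume_sample = rng.sample(stream, 1000)

# Account sample: draw accounts uniformly, regardless of activity.
account_sample = rng.sample(sorted(tweets_per_account), 200)

volume_share = hyperactive_share(volume_sample, hyperactive)
account_share = hyperactive_share(account_sample, hyperactive)

print(f"hyperactive share in volume sample:  {volume_share:.2f}")
print(f"hyperactive share in account sample: {account_share:.2f}")
```

In this toy population hyperactive accounts are 5% of accounts but produce most of the tweets, so a tweet-level sample is dominated by them while an account-level sample is not.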
Related papers
- On the Use of Proxies in Political Ad Targeting [49.61009579554272]
We show that major political advertisers circumvented mitigations by targeting proxy attributes.
Our findings have crucial implications for the ongoing discussion on the regulation of political advertising.
arXiv Detail & Related papers (2024-10-18T17:15:13Z)
- Analyzing and Estimating Support for U.S. Presidential Candidates in Twitter Polls [1.71952017922628]
We examine nearly two thousand Twitter polls gauging support for U.S. presidential candidates during the 2016 and 2020 election campaigns.
Our findings reveal that Twitter polls are biased in various ways, starting from the position of the presidential candidates.
The 2016 and 2020 polls were predominantly crafted by older males and manifested a pronounced bias favoring candidate Donald Trump.
arXiv Detail & Related papers (2024-06-05T14:57:29Z)
- Election Polls on Social Media: Prevalence, Biases, and Voter Fraud Beliefs [5.772751069162341]
This study focuses on the 2020 presidential elections in the U.S.
We find that Twitter polls are disproportionately authored by older males and exhibit a large bias towards candidate Donald Trump.
We also find that Twitter accounts participating in election polls are more likely to be bots, and election poll outcomes tend to be more biased, before the election day than after.
arXiv Detail & Related papers (2024-05-18T02:29:35Z)
- Unveiling the Hidden Agenda: Biases in News Reporting and Consumption [59.55900146668931]
We build a six-year dataset on the Italian vaccine debate and adopt a Bayesian latent space model to identify narrative and selection biases.
We found a nonlinear relationship between biases and engagement, with higher engagement for extreme positions.
Analysis of news consumption on Twitter reveals common audiences among news outlets with similar ideological positions.
arXiv Detail & Related papers (2023-01-14T18:58:42Z)
- Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- Fast Few shot Self-attentive Semi-supervised Political Inclination Prediction [12.472629584751509]
It is increasingly common now for policymakers/journalists to create online polls on social media to understand the political leanings of people in specific locations.
We introduce a self-attentive semi-supervised framework for political inclination detection to further that objective.
We found that the model is highly efficient even in resource-constrained settings.
arXiv Detail & Related papers (2022-09-21T12:07:16Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- News consumption and social media regulations policy [70.31753171707005]
We analyze two social media that enforced opposite moderation methods, Twitter and Gab, to assess the interplay between news consumption and content regulation.
Our results show that the presence of moderation pursued by Twitter produces a significant reduction of questionable content.
The lack of clear regulation on Gab results in a tendency for users to engage with both types of content, showing a slight preference for the questionable ones, which may account for a dissing/endorsement behavior.
arXiv Detail & Related papers (2021-06-07T19:26:32Z)
- Causal Understanding of Fake News Dissemination on Social Media [50.4854427067898]
We argue that it is critical to understand what user attributes potentially cause users to share fake news.
In fake news dissemination, confounders can be characterized by fake news sharing behavior that inherently relates to user attributes and online activities.
We propose a principled approach to alleviating selection bias in fake news dissemination.
arXiv Detail & Related papers (2020-10-20T19:37:04Z)
- Inferring Political Preferences from Twitter [0.0]
Political sentiment analysis of social media helps political strategists scrutinize the performance of a party or candidate.
During elections, social networks are flooded with blogs, chats, debates and discussions about the prospects of political parties and politicians.
In this work, we chose to identify the inclination of political opinions present in Tweets by modelling it as a text classification problem using classical machine learning.
arXiv Detail & Related papers (2020-07-21T05:20:43Z)
- Neutral bots probe political bias on social media [7.41821251168122]
We deploy neutral social bots who start following different news sources on Twitter to probe distinct biases emerging from platform mechanisms versus user interactions.
We find no strong or consistent evidence of political bias in the news feed.
The interactions of conservative accounts are skewed toward the right, whereas liberal accounts are exposed to moderate content shifting their experience toward the political center.
arXiv Detail & Related papers (2020-05-17T01:20:24Z)