Two-Stage Classifier for Campaign Negativity Detection using Axis
Embeddings: A Case Study on Tweets of Political Users during 2021
Presidential Election in Iran
- URL: http://arxiv.org/abs/2311.00143v1
- Date: Tue, 31 Oct 2023 20:31:41 GMT
- Title: Two-Stage Classifier for Campaign Negativity Detection using Axis
Embeddings: A Case Study on Tweets of Political Users during 2021
Presidential Election in Iran
- Authors: Fatemeh Rajabi and Ali Mohades
- Abstract summary: In elections around the world, the candidates may turn their campaigns toward negativity due to the prospect of failure and time pressure.
We propose a hybrid model for detecting campaign negativity consisting of a two-stage classifier that combines the strengths of two machine learning models.
Our best model (RF-RF) was able to achieve 79% for the macro F1 score and 82% for the weighted F1 score.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In elections around the world, the candidates may turn their campaigns toward
negativity due to the prospect of failure and time pressure. In the digital
age, social media platforms such as Twitter are rich sources of political
discourse. Given the large volume of data published on
Twitter, an automatic system for campaign negativity detection can play an
essential role in understanding the strategies of candidates and parties in their
campaigns. In this paper, we propose a hybrid model for detecting campaign
negativity consisting of a two-stage classifier that combines the strengths of
two machine learning models. We collected Persian tweets from 50
political users, including candidates and government officials, and
annotated 5,100 of those published during the year before the 2021
presidential election in Iran. In the proposed model, the datasets required by
the two classifiers are first built from the training set (85%), based on the
cosine similarity of tweet embeddings with axis embeddings (the averages of the
embeddings in the positive and negative classes of tweets); these datasets then
serve as the training sets of the two classifiers in the hybrid model. Our best
model (RF-RF) achieved 79% for the macro F1 score and 82% for the weighted F1
score. By running the best model on
the rest of the tweets of the 50 political users published in the year
before the election, and with the help of statistical models, we find that
whether a tweet was published by a candidate is unrelated to the negativity of
that tweet, whereas the presence of the names of political persons and political
organizations in the tweet is directly related to its negativity.
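The feature-construction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0/1 label encoding, and the toy 3-dimensional vectors are all assumptions, and the real tweet embeddings and the exact way the two classifiers consume these features are not specified in the abstract.

```python
import math


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def axis_embeddings(embeddings, labels):
    """Average the training embeddings of each class.

    Assumed encoding (hypothetical): label 0 = positive class, 1 = negative class.
    """
    pos = [e for e, y in zip(embeddings, labels) if y == 0]
    neg = [e for e, y in zip(embeddings, labels) if y == 1]
    return mean_vector(pos), mean_vector(neg)


def similarity_features(embedding, pos_axis, neg_axis):
    """Cosine similarity of one tweet embedding with each axis embedding."""
    return [cosine(embedding, pos_axis), cosine(embedding, neg_axis)]


# Toy stand-ins for tweet embeddings from the training split.
train_x = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.0, 1.0, 0.0], [0.0, 0.8, 0.2]]
train_y = [0, 0, 1, 1]

pos_axis, neg_axis = axis_embeddings(train_x, train_y)
features = similarity_features([0.1, 0.9, 0.0], pos_axis, neg_axis)
print(features)
```

In this sketch each tweet is reduced to two similarity values, which could then serve as input rows for a downstream classifier such as a random forest; how the paper splits the work between its two stages is not shown here.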
Related papers
- On the Use of Proxies in Political Ad Targeting [49.61009579554272]
We show that major political advertisers circumvented mitigations by targeting proxy attributes.
Our findings have crucial implications for the ongoing discussion on the regulation of political advertising.
arXiv Detail & Related papers (2024-10-18T17:15:13Z)
- Representation Bias in Political Sample Simulations with Large Language Models [54.48283690603358]
This study seeks to identify and quantify biases in simulating political samples with Large Language Models.
Using the GPT-3.5-Turbo model, we leverage data from the American National Election Studies, German Longitudinal Election Study, Zuobiao dataset, and China Family Panel Studies.
arXiv Detail & Related papers (2024-07-16T05:52:26Z)
- Context-Based Tweet Engagement Prediction [0.0]
This thesis investigates how well context alone may be used to predict tweet engagement likelihood.
We employed the Spark engine on TU Wien's Little Big Data Cluster to create scalable data preprocessing, feature engineering, feature selection, and machine learning pipelines.
We also found that factors such as the prediction algorithm, training dataset size, training dataset sampling method, and feature selection significantly affect the results.
arXiv Detail & Related papers (2023-09-28T08:36:57Z)
- Electoral Agitation Data Set: The Use Case of the Polish Election [3.671887117122512]
We present the first publicly open data set for detecting electoral agitation in the Polish language.
It contains 6,112 human-annotated tweets tagged with four legally conditioned categories.
The newly created data set was used to fine-tune a Polish Language Model called HerBERT.
arXiv Detail & Related papers (2023-07-13T18:14:43Z)
- Computational Assessment of Hyperpartisanship in News Titles [55.92100606666497]
We first adopt a human-guided machine learning framework to develop a new dataset for hyperpartisan news title detection.
Overall the Right media tends to use proportionally more hyperpartisan titles.
We identify three major topics including foreign issues, political systems, and societal issues that are suggestive of hyperpartisanship in news titles.
arXiv Detail & Related papers (2023-01-16T05:56:58Z)
- Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z)
- Shifting Polarization and Twitter News Influencers between two U.S. Presidential Elections [92.33485580547801]
We analyze the change of polarization between the 2016 and 2020 U.S. presidential elections.
Most of the top influencers were affiliated with media organizations during both elections.
75% of the top influencers in 2020 were not present in 2016, demonstrating that such status is difficult to retain.
arXiv Detail & Related papers (2021-11-03T20:08:54Z)
- Prediction of Political Leanings of Chinese Speaking Twitter Users [0.0]
It first collects data by scraping tweets of famous political figures and their related users.
It then divides the political spectrum into two groups: those who approve of the Chinese Communist Party and those who do not.
It produces a classification model with high accuracy for understanding users' political stances from their tweets on Twitter.
arXiv Detail & Related papers (2021-10-12T03:18:10Z)
- Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections [4.560033258611709]
We present the first publicly open dataset for detecting specific text chunks and categories of political advertising in the Polish language.
It contains 1,705 human-annotated tweets tagged with nine categories, which constitute campaigning under Polish electoral law.
arXiv Detail & Related papers (2020-06-17T23:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.