Electoral Agitation Data Set: The Use Case of the Polish Election
- URL: http://arxiv.org/abs/2307.07007v1
- Date: Thu, 13 Jul 2023 18:14:43 GMT
- Title: Electoral Agitation Data Set: The Use Case of the Polish Election
- Authors: Mateusz Baran, Mateusz Wójcik, Piotr Kolebski, Michał Bernaczyk,
Krzysztof Rajda, Łukasz Augustyniak, Tomasz Kajdanowicz
- Abstract summary: We present the first publicly open data set for detecting electoral agitation in the Polish language.
It contains 6,112 human-annotated tweets tagged with four legally conditioned categories.
The newly created data set was used to fine-tune a Polish Language Model called HerBERT.
- Score: 3.671887117122512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The popularity of social media leads politicians to use it for political
advertisement. As a result, social media is full of electoral agitation
(electioneering), especially during election campaigns. The election
administration cannot track the spread and quantity of messages that count as
agitation under the election code. This work addresses that crucial problem, while also
uncovering a niche that has not been effectively targeted so far. Hence, we
present the first publicly open data set for detecting electoral agitation in
the Polish language. It contains 6,112 human-annotated tweets tagged with four
legally conditioned categories. We achieved a 0.66 inter-annotator agreement
(Cohen's kappa score). An additional annotator resolved the mismatches between
the first two annotators, improving the consistency and complexity of the annotation
process. The newly created data set was used to fine-tune a Polish Language
Model called HerBERT (achieving a 68% F1 score). We also present a number of
potential use cases for such data sets and models, enriching the paper with an
analysis of the Polish 2020 Presidential Election on Twitter.
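The inter-annotator agreement quoted in the abstract is a Cohen's kappa score, which corrects raw agreement between two annotators for the agreement expected by chance. The following is a minimal sketch of that statistic, not the authors' code; the function name `cohens_kappa`, the label names, and the toy annotations are assumptions made purely for illustration and are not taken from the data set.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy annotations (assumed labels, not from the paper).
annotator_1 = ["agitation", "neutral", "agitation", "neutral", "agitation", "neutral"]
annotator_2 = ["agitation", "neutral", "neutral", "neutral", "agitation", "agitation"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.333
```

In this toy case the annotators agree on 4 of 6 items (observed agreement 0.667) while chance agreement is 0.5, so kappa is about 0.333. The same function works unchanged for a four-category scheme such as the one described in the abstract.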
Related papers
- On the Use of Proxies in Political Ad Targeting [49.61009579554272]
We show that major political advertisers circumvented mitigations by targeting proxy attributes.
Our findings have crucial implications for the ongoing discussion on the regulation of political advertising.
arXiv Detail & Related papers (2024-10-18T17:15:13Z)
- Two-Stage Classifier for Campaign Negativity Detection using Axis Embeddings: A Case Study on Tweets of Political Users during 2021 Presidential Election in Iran [0.0]
In elections around the world, the candidates may turn their campaigns toward negativity due to the prospect of failure and time pressure.
We propose a hybrid model for detecting campaign negativity consisting of a two-stage classifier that combines the strengths of two machine learning models.
Our best model (RF-RF) was able to achieve 79% for the macro F1 score and 82% for the weighted F1 score.
arXiv Detail & Related papers (2023-10-31T20:31:41Z)
- Prediction of the 2023 Turkish Presidential Election Results Using Social Media Data [0.5156484100374059]
We aim to predict the vote shares of parties participating in the 2023 elections in Turkey by combining social media data with traditional polling data.
Our approach is a volume-based approach that considers the number of social media interactions rather than content.
arXiv Detail & Related papers (2023-05-28T13:17:51Z)
- Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- Novelty in news search: a longitudinal study of the 2020 US elections [62.997667081978825]
We analyze novelty, a measurement of new items that emerge in the top news search results.
We find more new items emerging for election related queries compared to topical or stable queries.
We argue that such imbalances affect the visibility of political candidates in news searches during electoral periods.
arXiv Detail & Related papers (2022-11-09T08:42:37Z)
- Political Communities on Twitter: Case Study of the 2022 French Presidential Election [14.783829037950984]
We aim to identify political communities formed on Twitter during the 2022 French presidential election.
We create a large-scale Twitter dataset containing 1.2 million users and 62.6 million tweets that mention keywords relevant to the election.
We perform community detection on a retweet graph of users and propose an in-depth analysis of the stance of each community.
arXiv Detail & Related papers (2022-04-15T12:18:16Z)
- Shifting Polarization and Twitter News Influencers between two U.S. Presidential Elections [92.33485580547801]
We analyze the change of polarization between the 2016 and 2020 U.S. presidential elections.
Most of the top influencers were affiliated with media organizations during both elections.
75% of the top influencers in 2020 were not present in 2016, demonstrating that such status is difficult to retain.
arXiv Detail & Related papers (2021-11-03T20:08:54Z)
- Reaching the bubble may not be enough: news media role in online political polarization [58.720142291102135]
A way of reducing polarization would be by distributing cross-partisan news among individuals with distinct political orientations.
This study investigates whether this holds in the context of nationwide elections in Brazil and Canada.
arXiv Detail & Related papers (2021-09-18T11:34:04Z)
- Mundus vult decipi, ergo decipiatur: Visual Communication of Uncertainty in Election Polls [56.8172499765118]
We discuss potential sources of bias in nowcasting and forecasting.
Concepts are presented to attenuate the issue of falsely perceived accuracy.
One key idea is the use of Probabilities of Events instead of party shares.
arXiv Detail & Related papers (2021-04-28T07:02:24Z)
- Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections [4.560033258611709]
We present the first publicly open dataset for detecting specific text chunks and categories of political advertising in the Polish language.
It contains 1,705 human-annotated tweets tagged with nine categories, which constitute campaigning under Polish electoral law.
arXiv Detail & Related papers (2020-06-17T23:58:01Z)
- Embeddings-Based Clustering for Target Specific Stances: The Case of a Polarized Turkey [6.130136112098865]
We present an unsupervised method for target-specific stance detection in a polarized setting, specifically Turkish politics.
We show the effectiveness of our method in properly clustering users of divergent groups across multiple targets.
We perform our analysis on a large dataset of 108M Turkish election-related tweets along with the timeline tweets of 168k Turkish users.
arXiv Detail & Related papers (2020-05-19T13:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.