A Spanish dataset for Targeted Sentiment Analysis of political headlines
- URL: http://arxiv.org/abs/2208.13947v1
- Date: Tue, 30 Aug 2022 01:30:30 GMT
- Title: A Spanish dataset for Targeted Sentiment Analysis of political headlines
- Authors: Tomás Alves Salgueiro, Emilio Recart Zapata, Damián Furman, Juan Manuel Pérez, Pablo Nicolás Fernández Larrosa
- Abstract summary: This work addresses the task of Targeted Sentiment Analysis for the domain of news headlines, published by the main outlets during the 2019 Argentinean Presidential Elections.
We present a polarity dataset of 1,976 headlines mentioning candidates in the 2019 elections at the target level.
Preliminary experiments with state-of-the-art classification algorithms based on pre-trained linguistic models suggest that target information is helpful for this task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Subjective texts have been studied by several works as they can induce
certain behaviours in their users. Most work focuses on user-generated texts in
social networks, but some other texts also comprise opinions on certain topics
and could influence judgement criteria during political decisions. In this
work, we address the task of Targeted Sentiment Analysis for the domain of news
headlines, published by the main outlets during the 2019 Argentinean
Presidential Elections. For this purpose, we present a polarity dataset of
1,976 headlines mentioning candidates in the 2019 elections at the target
level. Preliminary experiments with state-of-the-art classification algorithms
based on pre-trained linguistic models suggest that target information is
helpful for this task. We make our data and pre-trained models publicly
available.
Related papers
- AgoraSpeech: A multi-annotated comprehensive dataset of political discourse through the lens of humans and AI [1.3060410279656598]
AgoraSpeech is a meticulously curated, high-quality dataset of 171 political speeches from six parties during the Greek national elections in 2023.
The dataset includes annotations (per paragraph) for six natural language processing (NLP) tasks: text classification, topic identification, sentiment analysis, named entity recognition, polarization and populism detection.
arXiv Detail & Related papers (2025-01-09T18:17:59Z)
- Political-LLM: Large Language Models in Political Science [159.95299889946637]
Large language models (LLMs) have been widely adopted in political science tasks.
Political-LLM aims to advance the comprehensive understanding of integrating LLMs into computational political science.
arXiv Detail & Related papers (2024-12-09T08:47:50Z)
- On the Use of Proxies in Political Ad Targeting [49.61009579554272]
We show that major political advertisers circumvented mitigations by targeting proxy attributes.
Our findings have crucial implications for the ongoing discussion on the regulation of political advertising.
arXiv Detail & Related papers (2024-10-18T17:15:13Z)
- Classifying Human-Generated and AI-Generated Election Claims in Social Media [8.990994727335064]
Malicious actors may use social media to disseminate misinformation to undermine trust in the electoral process.
The emergence of Large Language Models (LLMs) exacerbates this issue by enabling malicious actors to generate misinformation at an unprecedented scale.
We present a novel taxonomy for characterizing election-related claims.
arXiv Detail & Related papers (2024-04-24T18:13:29Z)
- Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines [3.3228144010758593]
The IEHate dataset contains 11,457 manually annotated Hindi tweets related to the Indian Assembly Election Campaign from November 1, 2021, to March 9, 2022.
We benchmark the dataset using a range of machine learning, deep learning, and transformer-based algorithms.
In particular, the relatively higher score of human evaluation over algorithms emphasizes the importance of utilizing both human and automated approaches for effective hate speech moderation.
arXiv Detail & Related papers (2023-06-26T15:17:54Z)
- How to Solve Few-Shot Abusive Content Detection Using the Data We Actually Have [58.23138483086277]
In this work we leverage datasets we already have, covering a wide range of tasks related to abusive language detection.
Our goal is to build models cheaply for a new target label set and/or language, using only a few training examples of the target domain.
Our experiments show that using existing datasets together with only a few shots of the target task improves model performance both monolingually and across languages.
arXiv Detail & Related papers (2023-05-23T14:04:12Z)
- Design and analysis of tweet-based election models for the 2021 Mexican legislative election [55.41644538483948]
We use a dataset of 15 million election-related tweets in the six months preceding election day.
We find that models using data with geographical attributes determine the results of the election with better precision and accuracy than conventional polling methods.
arXiv Detail & Related papers (2023-01-02T12:40:05Z)
- PolicyQA: A Reading Comprehension Dataset for Privacy Policies [77.79102359580702]
We present PolicyQA, a dataset that contains 25,017 reading comprehension style examples curated from an existing corpus of 115 website privacy policies.
We evaluate two existing neural QA models and perform rigorous analysis to reveal the advantages and challenges offered by PolicyQA.
arXiv Detail & Related papers (2020-10-06T09:04:58Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
- Inferring Political Preferences from Twitter [0.0]
Political Sentiment Analysis of social media helps political strategists scrutinize the performance of a party or candidate.
During elections, social networks are flooded with blogs, chats, debates and discussions about the prospects of political parties and politicians.
In this work, we chose to identify the inclination of political opinions present in Tweets by modelling it as a text classification problem using classical machine learning.
arXiv Detail & Related papers (2020-07-21T05:20:43Z)
- Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections [4.560033258611709]
We present the first publicly open dataset for detecting specific text chunks and categories of political advertising in the Polish language.
It contains 1,705 human-annotated tweets tagged with nine categories, which constitute campaigning under Polish electoral law.
arXiv Detail & Related papers (2020-06-17T23:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.