UPB at SemEval-2020 Task 11: Propaganda Detection with Domain-Specific
Trained BERT
- URL: http://arxiv.org/abs/2009.05289v1
- Date: Fri, 11 Sep 2020 08:44:14 GMT
- Authors: Andrei Paraschiv, Dumitru-Clementin Cercel, Mihai Dascalu
- Abstract summary: This paper describes our participation in the SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles competition.
Our approach specializes a pre-trained BERT model on propagandistic and hyperpartisan news articles.
Our proposed system achieved an F1-score of 46.060% in subtask SI, ranking 5th out of 36 teams on the leaderboard, and a micro-averaged F1-score of 54.302% in subtask TC, ranking 19th out of 32 teams.
- Score: 0.3437656066916039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Manipulative and misleading news has become a commodity for some online news
outlets, and such news has gained a significant impact on the global mindset
of people. Propaganda is a frequently employed manipulation method whose
goal is to influence readers by spreading ideas meant to distort or manipulate
their opinions. This paper describes our participation in the SemEval-2020
Task 11: Detection of Propaganda Techniques in News Articles competition. Our
approach specializes a pre-trained BERT model on propagandistic and
hyperpartisan news articles, enabling it to create more adequate
representations for the two subtasks, namely propaganda Span Identification
(SI) and propaganda Technique Classification (TC). Our proposed system achieved
an F1-score of 46.060% in subtask SI, ranking 5th out of 36 teams on the
leaderboard, and a micro-averaged F1-score of 54.302% in subtask TC, ranking
19th out of 32 teams.
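For readers unfamiliar with the metric reported for subtask TC, the following is a minimal sketch of micro-averaged F1, assuming each span carries exactly one gold label and one predicted label. The function name `micro_f1` and the toy labels are illustrative assumptions; this is not the paper's code or the official task scorer.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            tp += 1          # correct label counts as a true positive for class g
        else:
            fp += 1          # predicted class p receives a false positive
            fn += 1          # gold class g receives a false negative
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with hypothetical labels (illustration only).
gold = ["Loaded_Language", "Repetition", "Doubt", "Loaded_Language"]
pred = ["Loaded_Language", "Doubt", "Doubt", "Repetition"]
score = micro_f1(gold, pred)
```

Under this single-label assumption every error adds one false positive and one false negative, so micro-averaged precision, recall, and F1 all coincide with plain accuracy.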
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- Overview of the WANLP 2022 Shared Task on Propaganda Detection in Arabic [32.27059493109764]
We ran a task on detecting propaganda techniques in Arabic tweets as part of the WANLP 2022 workshop.
Subtask1 asks to identify the set of propaganda techniques used in a tweet, which is a multilabel classification problem.
Subtask2 asks to detect the propaganda techniques used in a tweet together with the exact span(s) of text in which each propaganda technique appears.
arXiv Detail & Related papers (2022-11-18T07:04:31Z)
- Large Language Models for Multi-label Propaganda Detection [0.0]
We describe our approach for the WANLP 2022 shared task, which addresses propaganda detection in a multi-label setting.
The task requires the model to label the given text with one or more types of propaganda techniques.
We show that an ensemble of five models performs best on the task, achieving a micro-F1 score of 59.73%.
arXiv Detail & Related papers (2022-10-15T06:47:31Z)
- Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020 [62.6928395368204]
The task was posed as a binary classification problem, in which the goal is to differentiate between real and fake news.
We provided a dataset divided into 900 annotated news articles for training and 400 news articles for testing.
42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task.
arXiv Detail & Related papers (2022-07-25T03:41:32Z)
- Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021 [55.41644538483948]
The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem.
The training set contains 1300 annotated news articles (750 real, 550 fake), while the testing set contains 300 news articles (200 real, 100 fake).
The best performing system obtained an F1-macro score of 0.679, which is lower than the past year's best result of 0.907 F1-macro.
arXiv Detail & Related papers (2022-07-11T18:58:36Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62-7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles [0.6999740786886536]
We present the results of SemEval-2020 Task 11 on Detection of Propaganda Techniques in News Articles.
The task featured two subtasks: Span Identification and Technique Classification.
For both subtasks, the best systems used pre-trained Transformers and ensembles.
arXiv Detail & Related papers (2020-09-06T10:05:43Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- newsSweeper at SemEval-2020 Task 11: Context-Aware Rich Feature Representations For Propaganda Classification [2.0491741153610334]
This paper describes our submissions to SemEval 2020 Task 11: Detection of Propaganda Techniques in News Articles.
We make use of a pre-trained BERT language model enhanced with tagging techniques developed for the task of Named Entity Recognition.
For the second subtask, we incorporate contextual features in a pre-trained RoBERTa model for the classification of propaganda techniques.
arXiv Detail & Related papers (2020-07-21T14:06:59Z)
- BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning [2.8913142991383114]
SemEval 2020 Task-11 aims to design automated systems for news propaganda detection.
Task-11 consists of two sub-tasks, namely, Span Identification and Technique Classification.
arXiv Detail & Related papers (2020-05-31T19:35:53Z)