Large Language Models for Multi-label Propaganda Detection
- URL: http://arxiv.org/abs/2210.08209v1
- Date: Sat, 15 Oct 2022 06:47:31 GMT
- Title: Large Language Models for Multi-label Propaganda Detection
- Authors: Tanmay Chavan and Aditya Kane
- Abstract summary: We describe our approach to the WANLP 2022 shared task on propaganda detection in a multi-label setting.
The task requires the model to label the given text as using one or more types of propaganda techniques.
We show that an ensemble of five models performs best on the task, achieving a micro-F1 score of 59.73%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The spread of propaganda through the internet has increased drastically
in recent years. Propaganda detection has lately gained importance because of the
negative impact propaganda has on society. In this work, we describe our approach
to the WANLP 2022 shared task on propaganda detection in a multi-label setting.
The task requires the model to label the given text as using one or more of a
total of 22 propaganda techniques. We show that an ensemble of five models
performs best on the task, achieving a micro-F1 score of 59.73%. We also conduct
comprehensive ablations and propose several future directions for this work.
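The multi-label setup and the micro-F1 metric reported above can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the label names and predictions are invented, and micro-averaging is computed by hand over per-example label sets.

```python
# Minimal sketch of micro-F1 scoring for multi-label propaganda detection.
# Labels and predictions below are illustrative, not from the paper.

def micro_f1(gold, pred):
    """Micro-averaged F1 over multi-label predictions.

    gold, pred: lists of label sets, one set per example.
    Counts true positives, false positives, and false negatives
    globally across all examples before computing precision/recall.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Each text may carry several of the 22 techniques at once.
gold = [{"Loaded Language", "Exaggeration/Minimisation"}, {"Doubt"}]
pred = [{"Loaded Language"}, {"Doubt", "Slogans"}]
print(round(micro_f1(gold, pred), 4))  # → 0.6667
```

Because the counts are pooled globally rather than averaged per label, micro-F1 weights frequent techniques more heavily, which suits the skewed label distributions typical of propaganda corpora.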
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles [11.64165958410489]
We develop the largest propaganda dataset to date, comprising 8K paragraphs from newspaper articles, labeled at the text-span level following a taxonomy of 23 propaganda techniques.
Our work offers the first attempt to understand the performance of large language models (LLMs), using GPT-4, for fine-grained propaganda detection from text.
Results showed that GPT-4's performance degrades as the task moves from simply classifying a paragraph as propagandistic or not, to the fine-grained task of detecting propaganda techniques and their manifestation in text.
arXiv Detail & Related papers (2024-02-27T13:02:19Z)
- Multimodal Propaganda Processing [34.295018092278255]
We introduce the task of multimodal propaganda processing, where the goal is to automatically analyze propaganda content.
We believe that this task presents a long-term challenge to AI researchers and that successful processing of propaganda could bring machine understanding one important step closer to human understanding.
arXiv Detail & Related papers (2023-02-17T05:49:55Z)
- Overview of the WANLP 2022 Shared Task on Propaganda Detection in Arabic [32.27059493109764]
We ran a task on detecting propaganda techniques in Arabic tweets as part of the WANLP 2022 workshop.
Subtask 1 asks systems to identify the set of propaganda techniques used in a tweet, a multi-label classification problem.
Subtask 2 asks systems to detect the propaganda techniques used in a tweet together with the exact span(s) of text in which each technique appears.
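The two annotation granularities described above can be illustrated with a hypothetical example. The tweet text, character offsets, and label choices here are invented for illustration; the technique names follow the kind of taxonomy the shared task uses.

```python
# Hypothetical example contrasting the two subtask output formats.
tweet = "They ALWAYS lie to us. Wake up, people!"

# Subtask 1: multi-label classification -- just the set of techniques.
subtask1_labels = {"Exaggeration/Minimisation", "Loaded Language"}

# Subtask 2: span detection -- each technique with its character span(s).
subtask2_labels = [
    {"technique": "Exaggeration/Minimisation", "start": 5, "end": 11},   # "ALWAYS"
    {"technique": "Loaded Language", "start": 23, "end": 39},            # "Wake up, people!"
]

# The subtask 1 label set is recoverable from the span annotations,
# so subtask 2 is the strictly harder, finer-grained task.
assert {s["technique"] for s in subtask2_labels} == subtask1_labels
```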
arXiv Detail & Related papers (2022-11-18T07:04:31Z)
- Persuasion Strategies in Advertisements [68.70313043201882]
We introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies.
We then formulate the task of persuasion strategy prediction with multi-modal learning.
We conduct a real-world case study on 1600 advertising campaigns of 30 Fortune-500 companies.
arXiv Detail & Related papers (2022-08-20T07:33:13Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation, improving F1 score by 3.62-7.69% on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Detecting Propaganda Techniques in Memes [32.209606526323945]
We propose a new multi-label multimodal task: detecting the type of propaganda techniques used in memes.
We create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both.
Our analysis of the corpus shows that understanding both modalities together is essential for detecting these techniques.
arXiv Detail & Related papers (2021-08-07T11:56:52Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- UPB at SemEval-2020 Task 11: Propaganda Detection with Domain-Specific Trained BERT [0.3437656066916039]
This paper describes our participation in the SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles competition.
Our approach considers specializing a pre-trained BERT model on propagandistic and hyperpartisan news articles.
Our proposed system achieved an F1 score of 46.060% in subtask SI, ranking 5th of 36 teams on the leaderboard, and a micro-averaged F1 score of 54.302% in subtask TC, ranking 19th of 32 teams.
arXiv Detail & Related papers (2020-09-11T08:44:14Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection [139.3415751957195]
We study the detection of propagandistic text fragments in news articles.
We introduce an approach to inject declarative knowledge of fine-grained propaganda techniques.
arXiv Detail & Related papers (2020-04-29T13:46:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.