Cross-Domain Learning for Classifying Propaganda in Online Contents
- URL: http://arxiv.org/abs/2011.06844v2
- Date: Sun, 22 Nov 2020 17:39:46 GMT
- Title: Cross-Domain Learning for Classifying Propaganda in Online Contents
- Authors: Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum
- Abstract summary: We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
- Score: 67.10699378370752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As news and social media exhibit an increasing amount of manipulative
polarized content, detecting such propaganda has received attention as a new
task for content analysis. Prior work has focused on supervised learning with
training data from the same domain. However, as propaganda can be subtle and
keeps evolving, manual identification and proper labeling are very demanding.
As a consequence, training data is a major bottleneck. In this paper, we tackle
this bottleneck and present an approach to leverage cross-domain learning,
based on labeled documents and sentences from news and tweets, as well as
political speeches with a clear difference in their degrees of being
propagandistic. We devise informative features and build various classifiers
for propaganda labeling, using cross-domain learning. Our experiments
demonstrate the usefulness of this approach, and identify difficulties and
limitations in various configurations of sources and targets for the transfer
step. We further analyze the influence of various features, and characterize
salient indicators of propaganda.
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- CDFSL-V: Cross-Domain Few-Shot Learning for Videos [58.37446811360741]
Few-shot video action recognition is an effective approach to recognizing new categories with only a few labeled examples.
Existing methods in video action recognition rely on large labeled datasets from the same domain.
We propose a novel cross-domain few-shot video action recognition method that leverages self-supervised learning and curriculum learning.
arXiv Detail & Related papers (2023-09-07T19:44:27Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Dataset of Propaganda Techniques of the State-Sponsored Information Operation of the People's Republic of China [0.0]
This research aims to bridge the information gap by providing a multi-labeled propaganda techniques dataset in Mandarin based on a state-backed information operation dataset provided by Twitter.
In addition to presenting the dataset, we apply multi-label text classification using a fine-tuned BERT model.
arXiv Detail & Related papers (2021-06-14T16:11:13Z)
- Embracing Domain Differences in Fake News: Cross-domain Fake News Detection using Multi-modal Data [18.66426327152407]
We propose a novel framework that jointly preserves domain-specific and cross-domain knowledge in news records to detect fake news from different domains.
Our experiments show that the integration of the proposed fake news model and the selective annotation approach achieves state-of-the-art performance for cross-domain news datasets.
arXiv Detail & Related papers (2021-02-11T23:31:14Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning [2.8913142991383114]
SemEval 2020 Task-11 aims to design automated systems for news propaganda detection.
Task-11 consists of two sub-tasks, namely, Span Identification and Technique Classification.
arXiv Detail & Related papers (2020-05-31T19:35:53Z)
- Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection [139.3415751957195]
We study the detection of propagandistic text fragments in news articles.
We introduce an approach to inject declarative knowledge of fine-grained propaganda techniques.
arXiv Detail & Related papers (2020-04-29T13:46:15Z)