Dataset of Propaganda Techniques of the State-Sponsored Information
Operation of the People's Republic of China
- URL: http://arxiv.org/abs/2106.07544v1
- Date: Mon, 14 Jun 2021 16:11:13 GMT
- Title: Dataset of Propaganda Techniques of the State-Sponsored Information
Operation of the People's Republic of China
- Authors: Rong-Ching Chang, Chun-Ming Lai, Kai-Lai Chang, Chu-Hsing Lin
- Abstract summary: This research aims to bridge the information gap by providing a multi-labeled propaganda techniques dataset in Mandarin based on a state-backed information operation dataset provided by Twitter.
In addition to presenting the dataset, we apply multi-label text classification using fine-tuned BERT.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Digital media, identified as a channel for computational propaganda, provide a pathway
for propaganda to expand its reach without limit. State-backed propaganda aims
to shape the audiences' cognition toward entities in favor of a certain
political party or authority. Furthermore, it has become part of modern
information warfare, used to gain an advantage over opponents. Most
current studies focus on using machine learning, quantitative, and
qualitative methods to determine whether a given piece of information on social
media is propaganda. These studies are mainly conducted on English content, and very little
research addresses Mandarin Chinese content. Beyond propaganda detection, we want
to go one step further and provide more fine-grained information on the propaganda
techniques that are applied. In this research, we aim to bridge the information
gap by providing a multi-labeled propaganda techniques dataset in Mandarin
based on a state-backed information operation dataset provided by Twitter. In
addition to presenting the dataset, we apply multi-label text classification
using fine-tuned BERT. Potentially, this could help future research in detecting
state-backed propaganda online, especially in a cross-lingual context and for
cross-platform identity consolidation.
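As a rough illustration of what "multi-label" means in the abstract above: each propaganda technique is scored independently, so one tweet can carry several labels or none. The sketch below is not the authors' implementation. The label names, the logits, and the 0.5 threshold are hypothetical, and the logits stand in for the output of a fine-tuned encoder such as BERT.

```python
import math

# Hypothetical label names, for illustration only; the abstract does not
# list the dataset's actual technique inventory.
LABELS = ["loaded_language", "name_calling", "appeal_to_fear"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold=0.5):
    """Keep every label whose independent sigmoid probability meets the threshold.

    Unlike softmax/argmax decoding, this can return zero, one, or many labels.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over labels, a common multi-label training loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
    return total / len(logits)


if __name__ == "__main__":
    example_logits = [2.0, -1.0, 0.3]  # made-up scores for one tweet
    print(decode_multilabel(example_logits))
    print(bce_loss(example_logits, [1, 0, 1]))
```

The threshold is a tunable assumption; in practice it is often chosen per label on a validation set.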
Related papers
- PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent [71.20471076045916]
Propaganda plays a critical role in shaping public opinion and fueling disinformation.
PropaInsight systematically dissects propaganda into techniques, arousal appeals, and underlying intent.
PropaGaze combines human-annotated data with high-quality synthetic data.
arXiv Detail & Related papers (2024-09-19T06:28:18Z)
- Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles [11.64165958410489]
We develop the largest propaganda dataset to date, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques.
Our work offers the first attempt to understand the performance of large language models (LLMs), using GPT-4, for fine-grained propaganda detection from text.
Results showed that GPT-4's performance degrades as the task moves from simply classifying a paragraph as propagandistic or not, to the fine-grained task of detecting propaganda techniques and their manifestation in text.
arXiv Detail & Related papers (2024-02-27T13:02:19Z)
- Multimodal Propaganda Processing [34.295018092278255]
We introduce the task of multimodal propaganda processing, where the goal is to automatically analyze propaganda content.
We believe that this task presents a long-term challenge to AI researchers and that successful processing of propaganda could bring machine understanding one important step closer to human understanding.
arXiv Detail & Related papers (2023-02-17T05:49:55Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Cross-Domain Learning for Classifying Propaganda in Online Contents [67.10699378370752]
We present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic.
Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step.
arXiv Detail & Related papers (2020-11-13T10:19:13Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- Political audience diversity and news reliability in algorithmic ranking [54.23273310155137]
We propose using the political diversity of a website's audience as a quality signal.
Using news source reliability ratings from domain experts and web browsing data from a diverse sample of 6,890 U.S. citizens, we first show that websites with more extreme and less politically diverse audiences have lower journalistic standards.
arXiv Detail & Related papers (2020-07-16T02:13:55Z)
- Prta: A System to Support the Analysis of Propaganda Techniques in the News [34.61449860876045]
Prta allows users to explore the articles crawled on a regular basis by highlighting the spans in which propaganda techniques occur.
The system further reports statistics about the use of such techniques, overall and over time, or according to filtering criteria specified by the user.
It allows users to analyze any text or URL through a dedicated interface or via an API.
arXiv Detail & Related papers (2020-05-12T15:20:55Z)
- Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection [139.3415751957195]
We study the detection of propagandistic text fragments in news articles.
We introduce an approach to inject declarative knowledge of fine-grained propaganda techniques.
arXiv Detail & Related papers (2020-04-29T13:46:15Z)