How Vulnerable Are Automatic Fake News Detection Methods to Adversarial Attacks?
- URL: http://arxiv.org/abs/2107.07970v1
- Date: Fri, 16 Jul 2021 15:36:03 GMT
- Title: How Vulnerable Are Automatic Fake News Detection Methods to Adversarial Attacks?
- Authors: Camille Koenders, Johannes Filla, Nicolai Schneider, Vinicius Woloszyn
- Abstract summary: This paper shows that it is possible to automatically attack state-of-the-art models that have been trained to detect Fake News.
The results show that it is possible to automatically bypass Fake News detection mechanisms, leading to implications concerning existing policy initiatives.
- Score: 0.6882042556551611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the spread of false information on the internet has increased dramatically
in recent years, more and more attention is being paid to automated fake news
detection. Some fake news detection methods are already quite successful.
Nevertheless, there are still many vulnerabilities in the detection algorithms.
The reason for this is that fake news publishers can structure and formulate
their texts in such a way that a detection algorithm does not expose this text
as fake news. This paper shows that it is possible to automatically attack
state-of-the-art models that have been trained to detect Fake News, rendering
them vulnerable. For this purpose, corresponding models were first trained on
a fake news dataset. Then, using TextAttack, an attempt was made to manipulate
the trained models in such a way that previously correctly identified fake news
was classified as true news. The results show that it is possible to
automatically bypass Fake News detection mechanisms, leading to implications
concerning existing policy initiatives.
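For readers who want to reproduce the general setup, a minimal sketch using the TextAttack framework is shown below. The checkpoint, attack recipe, and dataset names are placeholder assumptions; the abstract does not specify which ones the authors used.
```python
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

# Hypothetical fine-tuned detector checkpoint; substitute your own.
name = "your-org/bert-fake-news-detector"
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
wrapper = HuggingFaceModelWrapper(model, tokenizer)

# Word-substitution attack recipe: perturbs inputs so that articles the
# detector labeled "fake" get reclassified as "real".
attack = TextFoolerJin2019.build(wrapper)
dataset = HuggingFaceDataset("liar", split="test")  # stand-in dataset
attacker = Attacker(attack, dataset, AttackArgs(num_examples=100))
attacker.attack_dataset()
```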
Related papers
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-paraphrased real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
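As a concrete (hypothetical) illustration of such meaning-preserving changes, the sketch below swaps a single word for a synonym and checks whether a detector's label flips; the model checkpoint name is a placeholder, not one evaluated in the paper.
```python
from transformers import pipeline

# Hypothetical fine-tuned detector; substitute any sequence classifier.
detector = pipeline("text-classification", model="your-org/fake-news-roberta")

original  = "The senator was arrested on Tuesday for wire fraud."
perturbed = "The senator was detained on Tuesday for wire fraud."  # synonym swap

# A robust detector should assign the same label to both variants.
for text in (original, perturbed):
    print(detector(text))
```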
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Exploring Semantic Perturbations on Grover [6.515466136870902]
The rise of neural fake news (AI-generated fake news) has prompted the development of models to detect it.
One such model is the Grover model, which can both detect neural fake news to prevent it and generate it to demonstrate how such a model could be misused.
In this work we explore the Grover model's fake news detection capabilities by performing targeted attacks through perturbations on input news articles.
arXiv Detail & Related papers (2023-02-01T15:28:55Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis that cross-lingual evidence can serve as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62-7.69% F1 score on two public datasets.
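The NLI guidance can be pictured as scoring each generated sentence against its article context with an entailment model and using that score as the training reward. A minimal sketch of such a validity score, assuming the off-the-shelf roberta-large-mnli checkpoint (a stand-in, not necessarily the paper's verifier):
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Off-the-shelf NLI checkpoint (an assumption for illustration).
tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

context = "The city council approved the new budget on Monday."
generated = "The council passed the budget at the start of the week."

inputs = tok(context, generated, return_tensors="pt")
with torch.no_grad():
    probs = nli(**inputs).logits.softmax(dim=-1).squeeze()

# roberta-large-mnli label order: contradiction, neutral, entailment.
# The entailment probability could serve as the validity reward in
# self-critical sequence training.
print(f"validity reward: {probs[2].item():.3f}")
```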
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- Automated Fake News Detection using cross-checking with reliable sources [0.0]
We use natural human behavior to cross-check new information with reliable sources.
We implement this for Twitter and build a model that flags fake tweets.
Our implementation of this approach gives a 70% accuracy, which outperforms other generic fake-news classification models.
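A toy sketch of the cross-checking idea, using TF-IDF cosine similarity as a stand-in for the paper's actual matching method and invented example headlines:
```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented stand-ins for headlines scraped from reliable outlets.
reliable_headlines = [
    "Central bank raises interest rates by a quarter point",
    "New vaccine shows 90 percent efficacy in phase 3 trial",
]
tweet = "BREAKING: vaccine proven 90% effective in final trials"

vec = TfidfVectorizer().fit(reliable_headlines + [tweet])
sims = cosine_similarity(vec.transform([tweet]),
                         vec.transform(reliable_headlines))

# Flag the tweet if no reliable source is similar enough to support it.
THRESHOLD = 0.3  # arbitrary illustrative value
print("supported" if sims.max() >= THRESHOLD else "flag as potential fake")
```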
arXiv Detail & Related papers (2022-01-01T00:59:58Z)
- User Preference-aware Fake News Detection [61.86175081368782]
Existing fake news detection algorithms focus on mining news content for deceptive signals.
We propose a new framework, UPFD, which simultaneously captures various signals from user preferences by joint content and graph modeling.
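A much-simplified sketch of graph-based detection on the UPFD data, which is packaged with PyTorch Geometric; the full UPFD framework combines further user-preference signals beyond this single GCN layer:
```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import UPFD
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv, global_max_pool

# News propagation graphs; each graph is one news piece plus its spreaders.
train_set = UPFD(root="data", name="politifact", feature="bert", split="train")
loader = DataLoader(train_set, batch_size=32, shuffle=True)

class Net(torch.nn.Module):
    def __init__(self, in_dim, hidden=64, classes=2):
        super().__init__()
        self.conv = GCNConv(in_dim, hidden)
        self.lin = torch.nn.Linear(hidden, classes)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv(x, edge_index))
        return self.lin(global_max_pool(h, batch))  # one label per graph

model = Net(train_set.num_features)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for data in loader:  # one training epoch
    opt.zero_grad()
    loss = F.cross_entropy(model(data.x, data.edge_index, data.batch), data.y)
    loss.backward()
    opt.step()
```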
arXiv Detail & Related papers (2021-04-25T21:19:24Z)
- How does Truth Evolve into Fake News? An Empirical Study of Fake News Evolution [55.27685924751459]
We present the Fake News Evolution dataset: a new dataset tracking the fake news evolution process.
Our dataset is composed of 950 paired instances, each of which consists of articles representing the truth, the fake news, and the evolved fake news.
We observe features during the evolution process, including disinformation techniques, text similarity, top-10 keywords, classification accuracy, parts of speech, and sentiment properties.
arXiv Detail & Related papers (2021-03-10T09:01:34Z)
- Connecting the Dots Between Fact Verification and Fake News Detection [21.564628184287173]
We propose a simple yet effective approach to connect the dots between fact verification and fake news detection.
Our approach makes use of the recent success of fact verification models and enables zero-shot fake news detection.
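One way to picture this zero-shot setup: verify each claim in an article against trusted evidence with a fact verification model and aggregate the verdicts. A rough sketch, approximating the verifier with an off-the-shelf NLI checkpoint (an assumption, not the paper's exact model):
```python
from transformers import pipeline

# Approximating a dedicated fact verification model with an NLI checkpoint.
verify = pipeline("text-classification", model="roberta-large-mnli")

evidence = "The first crewed moon landing took place in July 1969."
claims = [
    "Astronauts first landed on the moon in 1969.",
    "The first moon landing happened in 1975.",
]

# Zero-shot rule of thumb: flag the article if any claim is contradicted.
verdicts = [verify({"text": evidence, "text_pair": c})[0]["label"]
            for c in claims]
print("fake" if "CONTRADICTION" in verdicts else "credible", verdicts)
```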
arXiv Detail & Related papers (2020-10-11T09:28:52Z)
- MALCOM: Generating Malicious Comments to Attack Neural Fake News Detection Models [40.51057705796747]
MALCOM is an end-to-end adversarial comment generation framework to achieve such an attack.
We demonstrate that, about 94% and 93.5% of the time on average, MALCOM can successfully mislead five of the latest neural detection models.
We also compare our attack model with four baselines across two real-world datasets.
arXiv Detail & Related papers (2020-09-01T01:26:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.