Machine Generation and Detection of Arabic Manipulated and Fake News
- URL: http://arxiv.org/abs/2011.03092v1
- Date: Thu, 5 Nov 2020 20:50:22 GMT
- Title: Machine Generation and Detection of Arabic Manipulated and Fake News
- Authors: El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed,
Tariq Alhindi, Hasan Cavusoglu
- Abstract summary: We present a novel method for automatically generating Arabic manipulated (and potentially fake) news stories.
Our method is simple and only depends on availability of true stories, which are abundant online, and a part-of-speech (POS) tagger.
We carry out a human annotation study that casts light on the effects of machine manipulation on text veracity.
We develop the first models for detecting manipulated Arabic news and achieve state-of-the-art results.
- Score: 8.014703200985084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fake news and deceptive machine-generated text are serious problems
threatening modern societies, including in the Arab world. This motivates work
on detecting false and manipulated stories online. However, a bottleneck for
this research is lack of sufficient data to train detection models. We present
a novel method for automatically generating Arabic manipulated (and potentially
fake) news stories. Our method is simple and only depends on availability of
true stories, which are abundant online, and a part-of-speech (POS) tagger. To
facilitate future work, we dispense with both of these requirements altogether
by providing AraNews, a novel and large POS-tagged news dataset that can be
used off-the-shelf. Using stories generated based on AraNews, we carry out a
human annotation study that casts light on the effects of machine manipulation
on text veracity. The study also measures human ability to detect Arabic
machine manipulated text generated by our method. Finally, we develop the first
models for detecting manipulated Arabic news and achieve state-of-the-art
results on Arabic fake news detection (macro F1=70.06). Our models and data are
publicly available.
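For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how macro F1 is computed: the per-class F1 scores are averaged with equal weight, so a rare class counts as much as a frequent one. The function name `macro_f1`, the label strings, and the toy predictions are hypothetical and are not taken from the paper. The reported value of 70.06 is presumably on a 0-100 scale, i.e. about 0.70 on the 0-1 scale used here.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data only: four stories with made-up gold labels and predictions.
gold = ["true", "true", "true", "fake"]
pred = ["true", "true", "fake", "fake"]
score = macro_f1(gold, pred)
print(f"macro F1 = {score:.4f}")
```

In this toy case the "true" class has F1 = 0.8 and the "fake" class has F1 = 2/3, so the macro average is their plain mean.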
Related papers
- Detection of Human and Machine-Authored Fake News in Urdu [2.013675429941823]
Social media has amplified the spread of fake news.
Traditional fake news detection methods relying on linguistic cues become less effective.
We propose a hierarchical detection strategy to improve the accuracy and robustness.
arXiv Detail & Related papers (2024-10-25T12:42:07Z)
- LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection [87.43727192273772]
It is often hard to tell whether a piece of text was human-written or machine-generated.
We present LLM-DetectAIve, designed for fine-grained detection.
It supports four categories: (i) human-written, (ii) machine-generated, (iii) machine-written, then machine-humanized, and (iv) human-written, then machine-polished.
arXiv Detail & Related papers (2024-08-08T07:43:17Z)
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z)
- Smaller Language Models are Better Black-box Machine-Generated Text Detectors [56.36291277897995]
Small and partially-trained models are better universal text detectors.
We find that whether the detector and generator were trained on the same data is not critically important to the detection success.
For instance, the OPT-125M model has an AUC of 0.81 in detecting ChatGPT generations, whereas a larger model from the GPT family, GPTJ-6B, has AUC of 0.45.
arXiv Detail & Related papers (2023-05-17T00:09:08Z)
- Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis of the usage of cross-lingual evidence as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- How Vulnerable Are Automatic Fake News Detection Methods to Adversarial Attacks? [0.6882042556551611]
This paper shows that it is possible to automatically attack state-of-the-art models that have been trained to detect Fake News.
The results show that it is possible to automatically bypass Fake News detection mechanisms, leading to implications concerning existing policy initiatives.
arXiv Detail & Related papers (2021-07-16T15:36:03Z)
- BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets [6.18447297698017]
We propose a transfer learning based model that will be able to detect if an Arabic sentence is written by humans or automatically generated by bots.
Our new transfer-learning model has obtained an accuracy up to 98%.
To the best of our knowledge, this work is the first study where ARABERT and GPT2 were combined to detect and classify the Arabic auto-generated texts.
arXiv Detail & Related papers (2021-01-22T21:50:38Z)
- Machine Learning Explanations to Prevent Overtrust in Fake News Detection [64.46876057393703]
This research investigates the effects of an Explainable AI assistant embedded in news review platforms for combating the propagation of fake news.
We design a news reviewing and sharing interface, create a dataset of news stories, and train four interpretable fake news detection algorithms.
For a deeper understanding of Explainable AI systems, we discuss interactions between user engagement, mental model, trust, and performance measures in the process of explaining.
arXiv Detail & Related papers (2020-07-24T05:42:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.