MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
- URL: http://arxiv.org/abs/2408.11871v2
- Date: Wed, 25 Sep 2024 06:21:26 GMT
- Title: MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
- Authors: Lionel Z. Wang, Yiming Ma, Renfei Gao, Beichen Guo, Han Zhu, Wenqi Fan, Zexin Lu, Ka Chung Ng,
- Abstract summary: We analyze the creation of fake news from a social psychology perspective.
We develop a comprehensive LLM-based theoretical framework, LLM-Fake Theory.
We conduct comprehensive analyses to evaluate our MegaFake dataset.
- Score: 18.708519905776562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of large language models (LLMs) has revolutionized online content creation, making it much easier to generate high-quality fake news. This misuse threatens the integrity of our digital environment and ethical standards. Therefore, understanding the motivations and mechanisms behind LLM-generated fake news is crucial. In this study, we analyze the creation of fake news from a social psychology perspective and develop a comprehensive LLM-based theoretical framework, LLM-Fake Theory. We introduce a novel pipeline that automates the generation of fake news using LLMs, thereby eliminating the need for manual annotation. Utilizing this pipeline, we create a theoretically informed Machine-generated Fake news dataset, MegaFake, derived from the GossipCop dataset. We conduct comprehensive analyses to evaluate our MegaFake dataset. We believe that our dataset and insights will provide valuable contributions to future research focused on the detection and governance of fake news in the era of LLMs.
Related papers
- Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs).
We find that fine-tuning existing text embedding models on LLM-generated texts yields excellent classification accuracy.
We leverage LLM as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
(2025-02-17T18:59:02Z)
- NewsEdits 2.0: Learning the Intentions Behind Updating News [74.84017890548259]
As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts.
In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can predict which facts in a news article will update using solely the text of a news article.
(2024-11-27T23:35:23Z)
- From Deception to Detection: The Dual Roles of Large Language Models in Fake News [0.20482269513546458]
Fake news poses a significant threat to the integrity of information ecosystems and public trust.
The advent of Large Language Models (LLMs) holds considerable promise for transforming the battle against fake news.
This paper explores the capability of various LLMs in effectively combating fake news.
(2024-09-25T22:57:29Z)
- LLM-GAN: Construct Generative Adversarial Network Through Large Language Models For Explainable Fake News Detection [34.984605500444324]
Large Language Models (LLMs) are known for their powerful natural language understanding and explanation generation abilities.
We propose LLM-GAN, a novel framework that utilizes prompting mechanisms to enable an LLM to become Generator and Detector.
Our results demonstrate LLM-GAN's effectiveness in both prediction performance and explanation quality.
(2024-09-03T11:06:45Z)
- Detect, Investigate, Judge and Determine: A Knowledge-guided Framework for Few-shot Fake News Detection [50.079690200471454]
Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real news in extremely low-resource scenarios.
This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media.
We propose a Dual-perspective Knowledge-guided Fake News Detection (DKFND) model, designed to enhance LLMs from both inside and outside perspectives.
(2024-07-12T03:15:01Z)
- Seeing Through AI's Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News [0.38233569758620056]
This paper aims to elucidate simple markers that help individuals distinguish between articles penned by humans and those created by LLMs.
We then devise a metric named Entropy-Shift Authorship Signature (ESAS), based on information theory and entropy principles.
The proposed ESAS ranks terms or entities, like POS tagging, within news articles based on their relevance in discerning article authorship.
(2024-06-20T06:02:04Z)
- FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs [48.32113486904612]
We propose FKA-Owl, a framework that leverages forgery-specific knowledge to augment Large Vision-Language Models (LVLMs).
Experiments on the public benchmark demonstrate that FKA-Owl achieves superior cross-domain performance compared to previous methods.
(2024-03-04T12:35:09Z)
- Disinformation Capabilities of Large Language Models [0.564232659769944]
This paper presents a study of the disinformation capabilities of the current generation of large language models (LLMs).
We evaluated the capabilities of 10 LLMs using 20 disinformation narratives.
We conclude that LLMs are able to generate convincing news articles that agree with dangerous disinformation narratives.
(2023-11-15T10:25:30Z)
- Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern that detectors trained exclusively on human-written articles can indeed perform well at detecting machine-generated fake news, but not vice versa.
(2023-11-02T08:39:45Z)
- Fake News Detectors are Biased against Texts Generated by Large Language Models [39.36284616311687]
The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society.
We present a novel paradigm to evaluate fake news detectors in scenarios involving both human-written and LLM-generated misinformation.
(2023-09-15T18:04:40Z)