Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection
- URL: http://arxiv.org/abs/2406.11260v2
- Date: Mon, 22 Jul 2024 11:56:44 GMT
- Title: Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection
- Authors: Sungwon Park, Sungwon Han, Meeyoung Cha
- Abstract summary: This study proposes adversarial style augmentation, AdStyle, to train a fake news detector.
Our model's key mechanism is the careful use of LLMs to automatically generate a diverse yet coherent range of style-conversion attack prompts.
Experiments show that our augmentation strategy improves robustness and detection performance when tested on fake news benchmark datasets.
- Score: 18.998947450697337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The spread of fake news negatively impacts individuals and is regarded as a significant social challenge that needs to be addressed. A number of algorithmic and insightful features have been identified for detecting fake news. However, with the recent LLMs and their advanced generation capabilities, many of the detectable features (e.g., style-conversion attacks) can be altered, making it more challenging to distinguish from real news. This study proposes adversarial style augmentation, AdStyle, to train a fake news detector that remains robust against various style-conversion attacks. Our model's key mechanism is the careful use of LLMs to automatically generate a diverse yet coherent range of style-conversion attack prompts. This improves the generation of prompts that are particularly difficult for the detector to handle. Experiments show that our augmentation strategy improves robustness and detection performance when tested on fake news benchmark datasets.
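The abstract describes the core loop: use an LLM to generate candidate style-conversion attack prompts, keep the prompts the detector handles worst, and retrain on the resulting rewrites. A minimal sketch of that selection loop follows; `llm_style_rewrite` and `detector_confidence` are hypothetical stubs standing in for a real LLM call and a real trained classifier, and the selection criterion (lowest average detector confidence) is an assumption about the paper's "particularly difficult for the detector" objective, not a confirmed implementation detail.

```python
# Hypothetical stand-in for an LLM call that rewrites `text` in the style
# named by `prompt`. A real system would query an actual LLM here.
def llm_style_rewrite(text: str, prompt: str) -> str:
    return f"[{prompt}] {text}"

# Hypothetical detector confidence: probability assigned to the correct
# label. A real system would run the trained fake news classifier.
def detector_confidence(text: str) -> float:
    return (sum(map(ord, text)) % 100) / 100.0

def adversarial_augment(articles, style_prompts, k=2):
    """Pick the k style-conversion prompts the detector handles worst,
    then return rewritten copies of each article for retraining."""
    scored = []
    for prompt in style_prompts:
        rewrites = [llm_style_rewrite(a, prompt) for a in articles]
        # Lower average confidence means a harder, more useful attack prompt.
        avg_conf = sum(detector_confidence(r) for r in rewrites) / len(rewrites)
        scored.append((avg_conf, prompt))
    hardest = [p for _, p in sorted(scored)[:k]]
    # Augment the training set with rewrites under the hardest prompts.
    return [llm_style_rewrite(a, p) for p in hardest for a in articles]

articles = ["Stocks fell sharply on Monday.", "A new vaccine was approved."]
prompts = ["tabloid style", "formal newswire style", "social-media style"]
# 2 articles x 2 hardest prompts -> 4 augmented training examples.
augmented = adversarial_augment(articles, prompts, k=2)
```

In a full pipeline this loop would alternate with detector training, so the prompt pool keeps adapting to whatever the current detector finds easy.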
Related papers
- Exploring the Deceptive Power of LLM-Generated Fake News: A Study of Real-World Detection Challenges [21.425647152424585]
We propose a strong fake news attack method called conditional Variational-autoencoder-Like Prompt (VLPrompt).
Unlike current methods, VLPrompt eliminates the need for additional data collection while maintaining contextual coherence.
Our experiments, including various detection methods and novel human study metrics, were conducted to assess their performance on our dataset.
arXiv Detail & Related papers (2024-03-27T04:39:18Z) - Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
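The idea of token-level detection via perplexity can be sketched with a toy unigram language model: adversarial prompt suffixes tend to contain tokens that are very unlikely under a language model, so per-token surprisal spikes on them. This is an illustrative assumption-laden stand-in; the paper's actual method uses per-token log-probabilities from a real LLM plus contextual information, and the corpus, threshold, and `flag_adversarial_tokens` helper below are invented for the example.

```python
import math
from collections import Counter

# Toy corpus standing in for a language model's training distribution.
CORPUS = ("the cat sat on the mat the dog sat on the rug "
          "the cat and the dog ran on the mat").split()
COUNTS = Counter(CORPUS)
TOTAL = len(CORPUS)

def token_surprisal(token: str) -> float:
    """Negative log2-probability under a unigram model with add-one smoothing."""
    p = (COUNTS[token] + 1) / (TOTAL + len(COUNTS) + 1)
    return -math.log2(p)

def flag_adversarial_tokens(tokens, threshold=4.5):
    """Flag tokens whose surprisal exceeds the threshold; adversarial
    suffixes are often built from tokens the LM finds very unlikely."""
    return [t for t in tokens if token_surprisal(t) > threshold]

benign = "the cat sat on the mat".split()
attacked = "the cat sat on the mat xq7 zzkw".split()
# Only the out-of-distribution suffix tokens exceed the threshold.
flagged = flag_adversarial_tokens(attacked)
```

Working at the token level, rather than scoring the whole prompt, localizes which part of the input looks adversarial.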
arXiv Detail & Related papers (2023-11-20T03:17:21Z) - Adapting Fake News Detection to the Era of Large Language Models [48.5847914481222]
We study the interplay between machine-(paraphrased) real news, machine-generated fake news, human-written fake news, and human-written real news.
Our experiments reveal an interesting pattern: detectors trained exclusively on human-written articles can perform well at detecting machine-generated fake news, but not vice versa.
arXiv Detail & Related papers (2023-11-02T08:39:45Z) - FakeGPT: Fake News Generation, Explanation and Detection of Large Language Models [18.543917359268345]
ChatGPT has gained significant attention due to its exceptional natural language processing capabilities.
We employ four prompt methods to generate fake news samples and prove the high quality of these samples through both self-assessment and human evaluation.
We examine ChatGPT's capacity to identify fake news and propose a reason-aware prompt method to improve its performance.
arXiv Detail & Related papers (2023-10-08T07:01:07Z) - MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs).
We show that, in general, a larger number of words leads to better performance, and that most detection methods can achieve similar performance with far fewer training samples.
Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z) - Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes in the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Multiverse: Multilingual Evidence for Fake News Detection [71.51905606492376]
Multiverse is a new feature based on multilingual evidence that can be used for fake news detection.
The hypothesis that cross-lingual evidence can serve as a feature for fake news detection is confirmed.
arXiv Detail & Related papers (2022-11-25T18:24:17Z) - A Multi-Policy Framework for Deep Learning-Based Fake News Detection [0.31498833540989407]
This work introduces Multi-Policy Statement Checker (MPSC), a framework that automates fake news detection.
MPSC uses deep learning techniques to analyze a statement itself and its related news articles, predicting whether it is seemingly credible or suspicious.
arXiv Detail & Related papers (2022-06-01T21:25:21Z) - "That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks [0.2999888908665659]
Adversarial attacks are a major challenge faced by current machine learning research.
Our work presents a model-agnostic detector of adversarial text examples.
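A model-agnostic detector of this kind can be illustrated by measuring how much a classifier's output swings under small input perturbations: adversarial texts often hinge on a few fragile tokens, so deleting single words moves the score far more than it does for natural text. This is a simplified sketch of that general idea, not the paper's exact procedure; `victim_score` is a toy stand-in for any black-box classifier, and the word-deletion perturbation and threshold are assumptions made for the example.

```python
# Hypothetical victim classifier: returns a score in [0, 1] for the
# "fake" class. A real detector would wrap any black-box model.
def victim_score(text: str) -> float:
    # Toy rule: the score rises with the count of sensational words.
    sensational = {"shocking", "miracle", "exposed"}
    hits = sum(w in sensational for w in text.lower().split())
    return min(1.0, 0.2 + 0.3 * hits)

def logit_variation(text: str) -> float:
    """Largest score change when any single word is deleted. Adversarial
    inputs that rely on a few fragile tokens swing more under such
    perturbations than natural text does."""
    base = victim_score(text)
    words = text.split()
    deltas = []
    for i in range(len(words)):
        perturbed = " ".join(words[:i] + words[i + 1:])
        deltas.append(abs(victim_score(perturbed) - base))
    return max(deltas)

# A plain sentence barely moves; one hinging on a single trigger word swings.
plain = "the council approved the annual budget on tuesday"
fragile = "shocking report reveals budget details"
```

Because only the model's outputs are queried, the same probe applies unchanged to any classifier, which is what makes the approach model-agnostic.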
arXiv Detail & Related papers (2022-04-10T09:24:41Z) - Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62% to 7.69% in F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.