Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
- URL: http://arxiv.org/abs/2009.07698v5
- Date: Wed, 21 Oct 2020 15:16:20 GMT
- Title: Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
- Authors: Reuben Tan, Bryan A. Plummer, Kate Saenko
- Abstract summary: We introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions.
To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of 4 different types of generated articles.
In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies.
- Score: 57.9843300852526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale dissemination of disinformation online intended to mislead or
deceive the general population is a major societal problem. Rapid progression
in image, video, and natural language generative models has only exacerbated
this situation and intensified our need for an effective defense mechanism.
While approaches to defend against neural fake news have been proposed, they
are generally constrained to the very limited setting where articles contain
only text and metadata such as the title and authors. In this paper, we
introduce the more realistic and challenging task of defending against
machine-generated news that also includes images and captions. To identify the
possible weaknesses that adversaries can exploit, we create a NeuralNews
dataset composed of 4 different types of generated articles as well as conduct
a series of human user study experiments based on this dataset. In addition to
the valuable insights gleaned from our user study experiments, we provide a
relatively effective approach based on detecting visual-semantic
inconsistencies, which will serve as an effective first line of defense and a
useful reference for future work in defending against machine-generated
disinformation.
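
The detection signal the abstract alludes to is easiest to see with a toy example. The sketch below scores whether a caption is grounded in the article body; it is a minimal illustration, not the authors' model, and the crude capitalized-token "entity" heuristic, the example strings, and the interpretation of the score are assumptions made here for exposition (a real detector would use learned image, text, and named-entity representations).

```python
import re
from typing import Set


def naive_entities(text: str) -> Set[str]:
    """Very rough stand-in for named entity recognition: keep capitalized
    tokens that are not sentence-initial. Purely illustrative; a real system
    would use a trained NER tagger."""
    tokens = re.findall(r"\b[A-Z][a-z]+\b", text)
    sentence_initial = {
        m.group(1) for m in re.finditer(r"(?:^|[.!?]\s+)([A-Z][a-z]+)", text)
    }
    return {t for t in tokens if t not in sentence_initial}


def caption_grounding_score(caption: str, article_body: str) -> float:
    """Fraction of the caption's (naive) entities that also appear in the
    article body. Generated captions tend to mention people or places the
    article never discusses, so a low score hints at cross-modal
    inconsistency."""
    caption_ents = naive_entities(caption)
    if not caption_ents:
        return 1.0  # nothing to ground; treat as consistent
    body_ents = naive_entities(article_body)
    return len(caption_ents & body_ents) / len(caption_ents)


if __name__ == "__main__":
    body = (
        "President Macron met union leaders in Paris on Tuesday to discuss "
        "the pension reform. Talks are expected to continue next week."
    )
    real_caption = "Macron speaks with union representatives in Paris."
    fake_caption = "Chancellor Scholz addresses a rally in Berlin."
    print(caption_grounding_score(real_caption, body))  # high -> plausibly consistent
    print(caption_grounding_score(fake_caption, body))  # 0.0  -> flag for review
```

Articles whose captions score low on such a grounding measure would be routed for closer inspection; the NeuralNews dataset supplies the matched real and generated article-caption pairs needed to train and evaluate detectors of this kind.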
Related papers
- Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack [24.954755569786396]
We propose a framework for a broader class of adversarial attacks, designed to perform minor perturbations in machine-generated content to evade detection.
We consider two attack settings, white-box and black-box, and employ adversarial learning in dynamic scenarios to assess whether the robustness of current detection models can be enhanced.
The empirical results reveal that the current detection models can be compromised in as little as 10 seconds, leading to the misclassification of machine-generated text as human-written content.
arXiv Detail & Related papers (2024-04-02T12:49:22Z) - Few-Shot Adversarial Prompt Learning on Vision-Language Models [62.50622628004134]
The vulnerability of deep neural networks to imperceptible adversarial perturbations has attracted widespread attention.
Previous efforts achieved zero-shot adversarial robustness by aligning adversarial visual features with text supervision.
We propose a few-shot adversarial prompt framework where adapting input sequences with limited data makes significant adversarial robustness improvement.
arXiv Detail & Related papers (2024-03-21T18:28:43Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Our defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models (a toy example of such a perturbation is sketched at the end of this page).
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - User-Centered Security in Natural Language Processing [0.7106986689736825]
This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP).
It focuses on two security domains within NLP with great public interest.
arXiv Detail & Related papers (2023-01-10T22:34:19Z) - Adversarial Contrastive Learning for Evidence-aware Fake News Detection with Graph Neural Networks [20.282527436527765]
We propose a unified Graph-based sEmantic structure mining framework with ConTRAstive Learning, GETRAL for short.
We first model claims and evidences as graph-structured data to capture the long-distance semantic dependency.
Then the fine-grained semantic representations are fed into the claim-evidence interaction module for predictions.
arXiv Detail & Related papers (2022-10-11T14:54:37Z) - Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z) - Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers [6.209131728799896]
We evaluate neural and non-neural approaches on their ability to detect computer-generated text.
We find that while statistical features underperform neural features, statistical features provide additional adversarial robustness.
We pioneer the use of $\Delta$MAUVE as a proxy measure for human judgement of adversarial text quality.
arXiv Detail & Related papers (2022-03-02T16:46:39Z) - Mining Fine-grained Semantics via Graph Neural Networks for Evidence-based Fake News Detection [20.282527436527765]
We propose a unified Graph-based sEmantic sTructure mining framework, GET for short.
We model claims and evidences as graph-structured data and capture the long-distance semantic dependency.
After obtaining contextual semantic information, our model reduces information redundancy by performing graph structure learning.
arXiv Detail & Related papers (2022-01-18T11:28:36Z) - Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
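
Several entries above, notably "Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack" and the BODEGA benchmark in "Verifying the Robustness of Automatic Credibility Assessment", study meaning-preserving perturbations that flip a detector's decision. The toy sketch below shows the general idea with a simple homoglyph substitution; it is not the attack pipeline of either paper, and the character map, perturbation rate, and example sentence are illustrative assumptions.

```python
import random

# A few Latin -> visually similar Cyrillic characters (illustrative subset).
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440", "c": "\u0441"}


def perturb(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Swap a small fraction of characters for look-alike homoglyphs.
    A human reader sees (almost) the same sentence, but the character and
    subword-token sequence a detector receives changes, which is the gap
    such meaning-preserving attacks exploit."""
    rng = random.Random(seed)
    chars = []
    for ch in text:
        if ch in HOMOGLYPHS and rng.random() < rate:
            chars.append(HOMOGLYPHS[ch])
        else:
            chars.append(ch)
    return "".join(chars)


if __name__ == "__main__":
    original = "The senator announced a new economic recovery package on Monday."
    print(perturb(original, rate=0.3))
```

Even this naive perturbation alters the input a text classifier receives while leaving the sentence readable to humans, which is why robustness benchmarks such as BODEGA evaluate victim models together with attack methods rather than on clean text alone.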