Adversarial Attacks Against Automated Fact-Checking: A Survey
- URL: http://arxiv.org/abs/2509.08463v1
- Date: Wed, 10 Sep 2025 10:10:10 GMT
- Title: Adversarial Attacks Against Automated Fact-Checking: A Survey
- Authors: Fanzhen Liu, Alsharif Abuadbba, Kristen Moore, Surya Nepal, Cecile Paris, Jia Wu, Jian Yang, Quan Z. Sheng
- Abstract summary: This survey provides the first in-depth review of adversarial attacks targeting fact-checking systems. We examine recent advancements in adversary-aware defenses and highlight open research questions. Our findings underscore the urgent need for resilient FC frameworks capable of withstanding adversarial manipulations.
- Score: 36.08022268176274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In an era where misinformation spreads freely, fact-checking (FC) plays a crucial role in verifying claims and promoting reliable information. While automated fact-checking (AFC) has advanced significantly, existing systems remain vulnerable to adversarial attacks that manipulate or generate claims, evidence, or claim-evidence pairs. These attacks can distort the truth, mislead decision-makers, and ultimately undermine the reliability of FC models. Despite growing research interest in adversarial attacks against AFC systems, a comprehensive overview of the key challenges remains lacking. These challenges include understanding attack strategies, assessing the resilience of current models, and identifying ways to enhance robustness. This survey provides the first in-depth review of adversarial attacks targeting FC, categorizing existing attack methodologies and evaluating their impact on AFC systems. Additionally, we examine recent advancements in adversary-aware defenses and highlight open research questions that require further exploration. Our findings underscore the urgent need for resilient FC frameworks capable of withstanding adversarial manipulations while preserving high verification accuracy.
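To make the attack surface concrete, here is a minimal, self-contained sketch of a claim-perturbation attack against a deliberately naive overlap-based verifier. The verifier, synonym table, and threshold are all invented for illustration and are not from the surveyed systems.

```python
# Toy illustration of a claim-perturbation attack against a deliberately
# simple fact checker. All components are hypothetical stand-ins.

# Hypothetical synonym table an attacker might draw substitutions from.
SYNONYMS = {"invented": "devised", "telephone": "phone", "1876": "the 1870s"}

def verify(claim: str, evidence: str) -> bool:
    """Toy verifier: SUPPORTED if most claim tokens appear in the evidence."""
    claim_tokens = claim.lower().split()
    evidence_tokens = set(evidence.lower().split())
    overlap = sum(t in evidence_tokens for t in claim_tokens)
    return overlap / len(claim_tokens) >= 0.6

def attack(claim: str, evidence: str) -> str:
    """Greedy word substitution: apply meaning-preserving swaps until the
    verdict flips (best effort: returns the last candidate if none flips)."""
    tokens = claim.split()
    for i, tok in enumerate(tokens):
        swap = SYNONYMS.get(tok.lower())
        if swap is None:
            continue
        candidate = tokens[:i] + [swap] + tokens[i + 1:]
        if not verify(" ".join(candidate), evidence):
            return " ".join(candidate)  # verdict flipped
        tokens = candidate
    return " ".join(tokens)

evidence = "Bell invented the telephone in 1876"
claim = "Bell invented the telephone in 1876"
print(verify(claim, evidence))      # True: SUPPORTED
adv = attack(claim, evidence)
print(adv, verify(adv, evidence))   # perturbed claim, False
```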
Related papers
- DECEIVE-AFC: Adversarial Claim Attacks against Search-Enabled LLM-based Fact-Checking Systems [38.6944646666426]
We study adversarial claim attacks against search-enabled fact-checking systems under a realistic input-only threat model. We propose DECEIVE-AFC, an agent-based adversarial attack framework that integrates novel claim-level attack strategies and adversarial claim validity evaluation principles. Our attacks substantially degrade verification performance, reducing accuracy from 78.7% to 53.7%, and significantly outperform existing claim-based attack baselines with strong cross-system transferability.
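As a rough illustration of what an agent-style, input-only attack loop might look like, the following sketch iteratively proposes claim rewrites, discards those that change the claim's truth value, and keeps the rewrite that most lowers the verifier's confidence. Every helper here (rewrite, preserves_validity, verifier_confidence) is a hypothetical stub, not DECEIVE-AFC itself.

```python
# Hypothetical skeleton of an agent-style, input-only claim attack loop;
# strategy names and helper functions are invented placeholders.
import random

STRATEGIES = ["paraphrase", "add_hedging", "reorder_entities"]  # hypothetical

def rewrite(claim, strategy):
    """Stand-in for an LLM rewriting agent; returns a candidate claim."""
    return f"[{strategy}] {claim}"  # placeholder transformation

def preserves_validity(original, candidate):
    """Stand-in for a claim-validity check: the candidate must keep the
    original truth value for the attack to count."""
    return original.split()[-3:] == candidate.split()[-3:]  # toy criterion

def verifier_confidence(claim):
    """Stand-in for the target system's confidence in the true label."""
    return random.random()  # replace with a real model call

def attack(claim, rounds=5):
    best, best_conf = claim, verifier_confidence(claim)
    for _ in range(rounds):
        cand = rewrite(best, random.choice(STRATEGIES))
        if not preserves_validity(claim, cand):
            continue  # discard rewrites that change what the claim asserts
        conf = verifier_confidence(cand)
        if conf < best_conf:  # lower confidence = stronger attack
            best, best_conf = cand, conf
    return best

print(attack("The Eiffel Tower was completed in 1889"))
```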
arXiv Detail & Related papers (2026-01-31T03:49:23Z) - LLM-Based Adversarial Persuasion Attacks on Fact-Checking Systems [9.795192821776462]
We introduce a novel class of persuasive adversarial attacks on automated fact-checking systems. We study the effects of persuasion on both claim verification and evidence retrieval using a decoupled evaluation strategy. Our analysis identifies persuasion techniques as a potent class of adversarial attacks, highlighting the need for more robust AFC systems.
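A decoupled evaluation of this kind can be sketched as follows; the helper functions and data layout are assumptions for illustration, not the paper's code.

```python
# Minimal sketch of a decoupled evaluation: score retrieval and
# verification separately so the effect of a persuasive rewrite on each
# stage can be isolated. All components are hypothetical stubs.

def decoupled_eval(pairs, retrieve, verify, gold_evidence, gold_label):
    retr_hits = verif_hits = 0
    for original, persuasive in pairs:
        docs = retrieve(persuasive)
        # Retrieval stage: did the gold evidence survive the rewrite?
        retr_hits += gold_evidence[original] in docs
        # Verification stage: feed the GOLD evidence, not retrieved docs,
        # so verification errors are not blamed on retrieval.
        verdict = verify(persuasive, gold_evidence[original])
        verif_hits += verdict == gold_label[original]
    n = len(pairs)
    return {"retrieval_recall": retr_hits / n, "verification_acc": verif_hits / n}

# Toy demo with stub components:
pairs = [("claim A", "claim A, as experts overwhelmingly agree")]
gold_evidence = {"claim A": "doc1"}
gold_label = {"claim A": "SUPPORTED"}
retrieve = lambda c: ["doc1", "doc2"]
verify = lambda c, e: "SUPPORTED"
print(decoupled_eval(pairs, retrieve, verify, gold_evidence, gold_label))
```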
arXiv Detail & Related papers (2026-01-23T16:57:16Z) - Benchmarking Misuse Mitigation Against Covert Adversaries [80.74502950627736]
Existing language model safety evaluations focus on overt attacks and low-stakes tasks. We develop Benchmarks for Stateful Defenses (BSD), a data generation pipeline that automates evaluations of covert attacks and corresponding defenses. Our evaluations indicate that decomposition attacks are effective misuse enablers, and highlight stateful defenses as a countermeasure.
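A toy sketch of what "stateful" can mean here, under the assumption that some upstream classifier tags each request with restricted-step labels (the tags and threshold are invented): a single benign-looking request passes, but the accumulated decomposition across turns trips the defense.

```python
# Toy stateful defense: judge accumulated per-user requests, not each
# request in isolation. Topic tags and threshold are placeholders.

RESTRICTED_STEPS = {"acquire_precursor", "synthesis_route", "purification"}

class StatefulGuard:
    def __init__(self, threshold=2):
        self.seen: dict[str, set[str]] = {}  # user -> matched steps so far
        self.threshold = threshold

    def check(self, user: str, tagged_steps: set[str]) -> bool:
        """tagged_steps: restricted-step tags a (hypothetical) classifier
        assigned to this single request. Returns True if allowed."""
        acc = self.seen.setdefault(user, set())
        acc |= tagged_steps & RESTRICTED_STEPS
        return len(acc) < self.threshold

guard = StatefulGuard()
print(guard.check("u1", {"acquire_precursor"}))  # True (allowed in isolation)
print(guard.check("u1", {"synthesis_route"}))    # False (decomposition caught)
```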
arXiv Detail & Related papers (2025-06-06T17:33:33Z) - Slot: Provenance-Driven APT Detection through Graph Reinforcement Learning [24.84110719035862]
Advanced Persistent Threats (APTs) represent sophisticated cyberattacks characterized by their ability to remain undetected for extended periods. We propose Slot, an advanced APT detection approach based on provenance graphs and graph reinforcement learning. We show Slot's outstanding accuracy, efficiency, adaptability, and robustness in APT detection, with most metrics surpassing state-of-the-art methods.
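For intuition, a provenance graph of the kind such detectors consume can be assembled from audit events roughly as below; the event format is invented, and real systems parse OS-level audit logs.

```python
# Minimal sketch: build a provenance graph from toy audit records.
import networkx as nx

events = [  # (subject, action, object, timestamp) -- invented records
    ("powershell.exe", "exec", "dropper.exe", 1),
    ("dropper.exe", "write", "C:/Users/a/payload.dll", 2),
    ("dropper.exe", "connect", "203.0.113.7:443", 3),
]

g = nx.MultiDiGraph()
for subj, action, obj, ts in events:
    g.add_edge(subj, obj, action=action, ts=ts)

# Downstream, a detector (e.g., graph RL as in Slot) would score nodes or
# subgraphs of this provenance graph for APT-like behavior chains.
for u, v, data in g.edges(data=True):
    print(f"{u} --{data['action']}@{data['ts']}--> {v}")
```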
arXiv Detail & Related papers (2024-10-23T14:28:32Z) - A Survey and Evaluation of Adversarial Attacks for Object Detection [11.48212060875543]
Deep learning models are vulnerable to adversarial examples that can deceive them into making confident but incorrect predictions. This vulnerability poses significant risks in high-stakes applications such as autonomous vehicles, security surveillance, and safety-critical inspection systems. This paper presents a novel taxonomic framework for categorizing adversarial attacks specific to object detection architectures.
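As one canonical member of the gradient-based family such a taxonomy covers, here is the classic one-step FGSM perturbation (Goodfellow et al.) in NumPy; the gradient below is a random stand-in for a real model's input gradient.

```python
# Fast Gradient Sign Method: x_adv = clip(x + eps * sign(dLoss/dx)).
import numpy as np

def fgsm(x: np.ndarray, grad_wrt_x: np.ndarray, eps: float) -> np.ndarray:
    """One-step, L-infinity-bounded perturbation of an input image."""
    x_adv = x + eps * np.sign(grad_wrt_x)
    return np.clip(x_adv, 0.0, 1.0)  # keep a valid image range

rng = np.random.default_rng(0)
image = rng.uniform(0, 1, size=(3, 32, 32))  # toy image
gradient = rng.normal(size=image.shape)      # stand-in for a model gradient
adv = fgsm(image, gradient, eps=8 / 255)
print(float(np.abs(adv - image).max()))      # <= eps by construction
```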
arXiv Detail & Related papers (2024-08-04T05:22:08Z) - Rethinking the Vulnerabilities of Face Recognition Systems: From a Practical Perspective [53.24281798458074]
Face Recognition Systems (FRS) have been increasingly integrated into critical applications, including surveillance and user authentication.
Recent studies have revealed vulnerabilities in FRS to adversarial attacks (e.g., adversarial patch attacks) and backdoor attacks (e.g., training data poisoning).
arXiv Detail & Related papers (2024-05-21T13:34:23Z) - Kick Bad Guys Out! Conditionally Activated Anomaly Detection in Federated Learning with Zero-Knowledge Proof Verification [22.078088272837068]
Federated Learning (FL) systems are vulnerable to adversarial attacks, such as model poisoning and backdoor attacks. We propose a novel anomaly detection method designed specifically for practical FL scenarios. Our approach employs a two-stage, conditionally activated detection mechanism.
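A minimal sketch of what "two-stage, conditionally activated" can mean in practice, with invented thresholds and a simple robust-statistics filter standing in for the paper's actual tests: a cheap global trigger decides whether the per-client filtering stage runs at all.

```python
# Toy two-stage FL anomaly detector; all thresholds are placeholders.
import numpy as np

def stage1_triggered(global_acc_history, drop_tol=0.02):
    """Cheap per-round check: activate stage 2 only when global accuracy
    drops suspiciously between rounds."""
    if len(global_acc_history) < 2:
        return False
    return global_acc_history[-2] - global_acc_history[-1] > drop_tol

def stage2_filter(client_updates, z_max=2.5):
    """Flag clients whose update norm is a robust outlier (median/MAD)."""
    norms = np.array([np.linalg.norm(u) for u in client_updates])
    med = np.median(norms)
    mad = np.median(np.abs(norms - med)) + 1e-12
    z = 0.6745 * (norms - med) / mad  # robust z-score
    return [i for i, zi in enumerate(z) if abs(zi) > z_max]

updates = [np.random.default_rng(i).normal(size=100) for i in range(9)]
updates.append(updates[0] * 40)  # one poisoned, scaled-up update
if stage1_triggered([0.81, 0.74]):
    print("suspicious clients:", stage2_filter(updates))
```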
arXiv Detail & Related papers (2023-10-06T07:09:05Z) - Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection [12.244543468021938]
This paper introduces two types of detection tasks for adversarial documents.
A benchmark dataset is established to facilitate the investigation of adversarial ranking defense.
A comprehensive investigation of the performance of several detection baselines is conducted.
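A toy baseline in the spirit of such detection tasks: score a document by how unusual its character bigrams look relative to clean reference text, and flag high scorers. The model, probability floor, and corpus here are illustrative placeholders, not the benchmark's baselines.

```python
# Toy adversarial-document detector based on character-bigram surprisal.
from collections import Counter
import math

def bigram_model(text):
    pairs = Counter(zip(text, text[1:]))
    total = sum(pairs.values())
    return {bg: c / total for bg, c in pairs.items()}

def surprisal(doc, model, floor=1e-6):
    """Average negative log-probability of the doc's character bigrams;
    token-level perturbations tend to introduce rare bigrams."""
    pairs = list(zip(doc, doc[1:]))
    return sum(-math.log(model.get(bg, floor)) for bg in pairs) / len(pairs)

clean_corpus = "the quick brown fox jumps over the lazy dog " * 50
model = bigram_model(clean_corpus)
print(surprisal("the lazy dog jumps over the fox", model))  # low score
print(surprisal("th3 l@zy d0g jumpz ovr teh f0x", model))   # high score
```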
arXiv Detail & Related papers (2023-07-31T16:31:24Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, which undermine model integrity through both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries, remains effective in real-world data scenarios, and incurs an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z) - Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems [80.3811072650087]
We show that it is possible to subtly modify claim-salient snippets in the evidence and generate diverse and claim-aligned evidence.
The attacks are also robust against post-hoc modifications of the claim.
These attacks can have harmful implications for inspectable and human-in-the-loop usage scenarios.
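For intuition, a crude version of evidence-side manipulation might locate the claim-salient sentence and apply a subtle edit; real attacks use fluent, model-guided rewrites rather than the placeholder negation below.

```python
# Toy evidence-side manipulation: find the claim-salient sentence
# (highest token overlap with the claim) and subtly edit it.

def salient_sentence(claim, evidence_sentences):
    ctoks = set(claim.lower().split())
    return max(evidence_sentences,
               key=lambda s: len(ctoks & set(s.lower().split())))

def subtle_edit(sentence):
    """Placeholder edit: negate the first copular verb we recognize."""
    for verb in ("is", "was", "are", "were"):
        if f" {verb} " in f" {sentence} ":
            return sentence.replace(verb, f"{verb} not", 1)
    return sentence

claim = "The bridge was opened in 1932"
evidence = ["The city grew rapidly in the 1920s.",
            "The bridge was opened in 1932 after six years of work."]
print(subtle_edit(salient_sentence(claim, evidence)))
# -> "The bridge was not opened in 1932 after six years of work."
```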
arXiv Detail & Related papers (2022-09-07T13:39:24Z) - Adversarial Attacks against Face Recognition: A Comprehensive Study [3.766020696203255]
Face recognition (FR) systems have demonstrated outstanding verification performance.
Recent studies show that (deep) FR systems exhibit an intriguing vulnerability to imperceptible or perceptible but natural-looking adversarial input images.
arXiv Detail & Related papers (2020-07-22T22:46:00Z)