Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks
- URL: http://arxiv.org/abs/2310.16224v1
- Date: Tue, 24 Oct 2023 22:27:44 GMT
- Title: Poison is Not Traceless: Fully-Agnostic Detection of Poisoning Attacks
- Authors: Xinglong Chang, Katharina Dost, Gillian Dobbie, Jörg Wicker
- Abstract summary: This paper presents a novel fully-agnostic framework, DIVA, that detects attacks solely by analyzing the potentially poisoned data set.
For evaluation purposes, in this paper, we test DIVA on label-flipping attacks.
- Score: 4.064462548421468
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The performance of machine learning models depends on the quality of the underlying data. Malicious actors can attack the model by poisoning the training data. Current detectors are tied to either specific data types, models, or attacks, and therefore have limited applicability in real-world scenarios. This paper presents a novel fully-agnostic framework, DIVA (Detecting InVisible Attacks), that detects attacks solely by analyzing the potentially poisoned data set. DIVA is based on the idea that poisoning attacks can be detected by comparing the classifier's accuracy on poisoned and clean data; since the clean accuracy is unknown in practice, DIVA pre-trains a meta-learner on Complexity Measures to estimate it for a hypothetical clean version of the dataset. The framework applies to generic poisoning attacks. For evaluation purposes, in this paper, we test DIVA on label-flipping attacks.
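To make the detection logic concrete, here is a minimal sketch of the idea, not the authors' implementation: the meta-regressor `meta` is assumed to have been pre-trained offline on (complexity-measure, clean-accuracy) pairs from many clean datasets, and the single Fisher-ratio feature and `threshold` value are illustrative stand-ins (DIVA uses a richer set of Complexity Measures).

```python
# Minimal sketch of DIVA's core idea (illustrative, not the authors' code).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def fisher_ratio(X, y):
    """Fisher's discriminant ratio (F1), a simple data-complexity measure:
    per-feature between-class variance over within-class variance."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    between = sum((X[y == c].mean(axis=0) - mu) ** 2 for c in classes)
    within = sum(X[y == c].var(axis=0) for c in classes) + 1e-12
    return float(np.max(between / within))

def diva_flag(X, y, meta, threshold=0.05):
    """Flag a dataset as poisoned when its measured accuracy falls short of
    the accuracy the meta-learner predicts for a hypothetical clean version."""
    measured_acc = cross_val_score(
        LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    features = np.array([[fisher_ratio(X, y)]])  # real DIVA uses many measures
    estimated_clean_acc = meta.predict(features)[0]
    return (estimated_clean_acc - measured_acc) > threshold
```

The key design point is that everything above runs on the suspect dataset alone, with no reference to a specific model family or attack, which is what makes the framework fully agnostic.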
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks have been shown to be vulnerable to data poisoning attacks.
Detecting poisoned samples within a mixed dataset is both beneficial and challenging.
We propose an Iterative Filtering approach for identifying unlearnable examples (UEs).
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, Memorization Discrepancy, to explore defenses through model-level information.
By implicitly translating changes in the manipulated data into changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
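A rough sketch of the underlying intuition, assuming access to two training-time snapshots of the victim model; the KL-divergence form below is an illustrative stand-in, not the paper's exact definition:

```python
# Illustrative sketch (not the paper's exact measure): poisoned samples tend
# to shift a model's outputs across training-time snapshots more than clean
# samples do, so a large snapshot-to-snapshot discrepancy is suspicious.
import torch
import torch.nn.functional as F

def memorization_discrepancy(model_now, model_past, x):
    """KL divergence between the current model's predictions on x and an
    earlier snapshot's predictions; both models are assumed to emit logits."""
    with torch.no_grad():
        log_p_now = F.log_softmax(model_now(x), dim=-1)
        p_past = F.softmax(model_past(x), dim=-1)
    return F.kl_div(log_p_now, p_past, reduction="batchmean").item()
```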
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks [31.339252233416477]
We introduce the notion of model poisoning reachability as a technical tool to explore the intrinsic limits of data poisoning attacks towards target parameters.
We derive an easily computable threshold to establish and quantify a surprising phase transition phenomenon among popular ML models.
Our work highlights the critical role played by the poisoning ratio and sheds new light on existing empirical results, attacks, and mitigation strategies in data poisoning.
arXiv Detail & Related papers (2023-03-07T01:55:26Z)
- Temporal Robustness against Data Poisoning [69.01705108817785]
Data poisoning considers cases when an adversary manipulates the behavior of machine learning algorithms through malicious training data.
We propose a temporal threat model of data poisoning with two novel metrics, earliness and duration, which measure, respectively, how far in advance an attack started and how long it lasted.
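Read literally, the two metrics reduce to simple timestamp arithmetic. The helpers below are hypothetical, assuming each poisoned point is tagged with the training step at which it entered the stream and `t_deploy` marks model deployment:

```python
# Hypothetical helpers for the two temporal metrics; `poison_steps` is the
# list of training steps at which poisoned points entered the data stream.

def earliness(poison_steps, t_deploy):
    """How far in advance of deployment the attack started."""
    return t_deploy - min(poison_steps)

def duration(poison_steps):
    """How long the attack lasted, from the first to the last poisoned point."""
    return max(poison_steps) - min(poison_steps)
```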
arXiv Detail & Related papers (2023-02-07T18:59:19Z)
- Using Anomaly Detection to Detect Poisoning Attacks in Federated Learning Applications [2.978389704820221]
Adversarial attacks such as poisoning attacks have attracted the attention of many machine learning researchers.
Traditionally, poisoning attacks attempt to inject adversarial training data in order to manipulate the trained model.
In federated learning (FL), data poisoning attacks generalize to model poisoning attacks, which simpler methods cannot detect because the detector has no access to the clients' local training data.
We propose a novel framework for detecting poisoning attacks in FL, which employs a reference model based on a public dataset and an auditor model to detect malicious updates.
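A minimal sketch of the screening step under simplifying assumptions: client updates are flattened parameter-delta vectors, and the cosine-similarity rule with `sim_threshold` is an illustrative choice rather than the paper's exact auditor logic:

```python
# Sketch of reference-model screening in FL (illustrative, not the paper's
# implementation): flag client updates that point away from the update a
# reference model trained on a public dataset would make.
import torch
import torch.nn.functional as F

def flag_suspicious_updates(client_updates, reference_update, sim_threshold=0.0):
    """Return indices of client updates whose cosine similarity to the
    reference update falls below the threshold."""
    ref = reference_update.flatten()
    return [i for i, upd in enumerate(client_updates)
            if F.cosine_similarity(upd.flatten(), ref, dim=0) < sim_threshold]
```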
arXiv Detail & Related papers (2022-07-18T10:10:45Z)
- Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
- De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks [17.646155241759743]
De-Pois is an attack-agnostic defense against poisoning attacks.
We implement four types of poisoning attacks and evaluate De-Pois with five typical defense methods.
arXiv Detail & Related papers (2021-05-08T04:47:37Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
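For flavor, here is a heavily simplified bilevel poisoning step on a differentiable victim (a logistic-regression weight vector `w` with a single unrolled SGD step); the paper's attack additionally targets certified-robustness guarantees, which this sketch does not model:

```python
# Simplified bilevel poisoning step (illustrative): the outer objective raises
# the victim's validation loss; the inner problem is approximated by a single
# unrolled SGD step. `w` is the victim's weight vector with requires_grad=True.
import torch
import torch.nn.functional as F

def poison_step(x_poison, y_poison, X_train, y_train, X_val, y_val,
                w, lr_inner=0.1, lr_poison=0.5):
    x_poison = x_poison.clone().requires_grad_(True)
    # Inner problem: one SGD step of the victim on train + poison data.
    X = torch.cat([X_train, x_poison])
    y = torch.cat([y_train, y_poison])
    inner_loss = F.binary_cross_entropy_with_logits(X @ w, y)
    (g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
    w_new = w - lr_inner * g
    # Outer problem: ascend the validation loss of the updated victim
    # with respect to the poison point's features.
    outer_loss = F.binary_cross_entropy_with_logits(X_val @ w_new, y_val)
    (g_poison,) = torch.autograd.grad(outer_loss, x_poison)
    return (x_poison + lr_poison * g_poison).detach()
```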
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks [74.88735178536159]
Data poisoning is the number one concern among threats ranging from model stealing to adversarial attacks.
We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup.
We apply rigorous tests to determine the extent to which we should fear them.
arXiv Detail & Related papers (2020-06-22T18:34:08Z)