Data Poisoning Attacks on Regression Learning and Corresponding Defenses
- URL: http://arxiv.org/abs/2009.07008v1
- Date: Tue, 15 Sep 2020 12:14:54 GMT
- Title: Data Poisoning Attacks on Regression Learning and Corresponding Defenses
- Authors: Nicolas Michael Müller, Daniel Kowatsch, Konstantin Böttinger
- Abstract summary: Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset.
We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack.
As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent of its original value after inserting only two percent of poison samples.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial data poisoning is an effective attack against machine learning
and threatens model integrity by introducing poisoned data into the training
dataset. So far, it has been studied mostly for classification, even though
regression learning is used in many mission critical systems (such as dosage of
medication, control of cyber-physical systems and managing power supply).
Therefore, in the present research, we aim to evaluate all aspects of data
poisoning attacks on regression learning, exceeding previous work both in terms
of breadth and depth. We present realistic scenarios in which data poisoning
attacks threaten production systems and introduce a novel black-box attack,
which is then applied to a real-world medical use case. As a result, we observe
that the mean squared error (MSE) of the regressor increases to 150 percent of
its original value after inserting only two percent of poison samples. Finally, we present a new
defense strategy against the novel and previous attacks and evaluate it
thoroughly on 26 datasets. Based on the conducted experiments, we conclude that
the proposed defense strategy effectively mitigates the considered attacks.
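To make the reported effect concrete, the following is a minimal, hypothetical sketch rather than the paper's black-box attack: it injects a crude 2-percent label corruption into a synthetic regression training set and compares the clean and poisoned test MSE. The synthetic data, model choice, and corruption heuristic are all illustrative assumptions.

```python
# Illustrative sketch only: a naive label corruption on 2% of the training
# data for a linear regressor, measuring the change in test MSE. This is NOT
# the paper's optimized black-box attack; it only shows how a small poisoned
# fraction can be injected and its effect on MSE quantified.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic regression task (a stand-in for the medical use case in the paper).
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=1000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

def test_mse(X_tr, y_tr):
    model = LinearRegression().fit(X_tr, y_tr)
    return mean_squared_error(y_test, model.predict(X_test))

baseline = test_mse(X_train, y_train)

# Poison 2% of the training points by pushing their targets to an extreme value
# (a crude stand-in for an optimized poisoning strategy).
n_poison = int(0.02 * len(y_train))
idx = rng.choice(len(y_train), size=n_poison, replace=False)
y_poisoned = y_train.copy()
y_poisoned[idx] = y_train.max() * 10

poisoned = test_mse(X_train, y_poisoned)
print(f"clean MSE:    {baseline:.4f}")
print(f"poisoned MSE: {poisoned:.4f} ({poisoned / baseline:.0%} of baseline)")
```

On this toy setup the exact MSE ratio depends on the seed and the corruption magnitude; the sketch only illustrates how such a ratio can be measured, not how the attack itself is constructed.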
Related papers
- On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning [49.17494657762375]
Test-time adaptation (TTA) updates the model weights during the inference stage using testing data to enhance generalization.
Existing studies have shown that when TTA is updated with crafted adversarial test samples, the performance on benign samples can deteriorate.
We propose an effective and realistic attack method that better produces poisoned samples without access to benign samples.
arXiv Detail & Related papers (2024-10-07T01:29:19Z) - Have You Poisoned My Data? Defending Neural Networks against Data Poisoning [0.393259574660092]
We propose a novel approach to detect and filter poisoned datapoints in the transfer learning setting.
We show that effective poisons can be successfully differentiated from clean points in the characteristic vector space.
Our evaluation shows that our proposal outperforms existing approaches in defense rate and final trained model performance.
arXiv Detail & Related papers (2024-03-20T11:50:16Z) - On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness (a generic partition-and-aggregate sketch appears after this list).
arXiv Detail & Related papers (2023-06-28T17:59:35Z) - Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z) - Autoregressive Perturbations for Data Poisoning [54.205200221427994]
Data scraping from social media has led to growing concerns regarding unauthorized use of data.
Data poisoning attacks have been proposed as a bulwark against scraping.
We introduce autoregressive (AR) poisoning, a method that can generate poisoned data without access to the broader dataset.
arXiv Detail & Related papers (2022-06-08T06:24:51Z) - Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning [32.976199681542845]
We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
arXiv Detail & Related papers (2022-05-04T11:00:26Z) - Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z) - Defending against Adversarial Denial-of-Service Attacks [0.0]
Data poisoning is one of the most relevant security threats against machine learning and data-driven technologies.
We propose a new approach of detecting DoS poisoned instances.
We evaluate our defense against two DoS poisoning attacks on seven datasets, and find that it reliably identifies poisoned instances.
arXiv Detail & Related papers (2021-04-14T09:52:36Z) - Property Inference From Poisoning [15.105224455937025]
Property inference attacks consider an adversary who has access to the trained model and tries to extract some global statistics of the training data.
We study poisoning attacks where the goal of the adversary is to increase the information leakage of the model.
Our findings suggest that poisoning attacks can boost the information leakage significantly and should be considered as a stronger threat model in sensitive applications.
arXiv Detail & Related papers (2021-01-26T20:35:28Z) - Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
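For the aggregation defense discussed in the "On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks" entry above, a generic sketch of the partition-and-aggregate idea follows, adapted here to regression with a median instead of a majority vote. The hashing scheme, partition count, and base learner are illustrative assumptions, not the Deep Partition Aggregation implementation evaluated in that paper.

```python
# Generic partition-and-aggregate sketch (in the spirit of Deep Partition
# Aggregation), adapted to regression: each training point lands in exactly
# one partition, one base model is trained per partition, and predictions are
# aggregated with a median. Illustrative assumptions throughout.
import hashlib
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def _partition_index(row, n_partitions):
    # A deterministic hash of the raw feature bytes decides the partition, so
    # an attacker cannot concentrate poison by reordering the dataset.
    digest = hashlib.sha256(np.ascontiguousarray(row).tobytes()).digest()
    return int.from_bytes(digest[:4], "little") % n_partitions


class PartitionAggregateRegressor:
    def __init__(self, n_partitions=16, base_model=DecisionTreeRegressor):
        self.n_partitions = n_partitions
        self.base_model = base_model

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        parts = np.array([_partition_index(row, self.n_partitions) for row in X])
        self.models_ = []
        for p in range(self.n_partitions):
            mask = parts == p
            if mask.any():  # skip partitions that happen to be empty
                self.models_.append(self.base_model().fit(X[mask], y[mask]))
        return self

    def predict(self, X):
        # Median aggregation: a few corrupted base models shift the final
        # prediction far less than they would in a single monolithic model.
        preds = np.stack([m.predict(np.asarray(X)) for m in self.models_])
        return np.median(preds, axis=0)
```

The point of the partitioning is that each training sample influences exactly one base model, so a bounded number of poisoned samples can corrupt only a bounded number of the aggregated predictors; the certification arguments in the literature build on this property, which the sketch itself does not reproduce.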
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the presented information and is not responsible for any consequences arising from its use.