Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
- URL: http://arxiv.org/abs/2205.01992v1
- Date: Wed, 4 May 2022 11:00:26 GMT
- Title: Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
- Authors: Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli
- Abstract summary: We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
- Score: 32.976199681542845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of machine learning is fueled by the increasing availability of
computing power and large training datasets. The training data is used to learn
new models or update existing ones, assuming that it is sufficiently
representative of the data that will be encountered at test time. This
assumption is challenged by the threat of poisoning, an attack that manipulates
the training data to compromise the model's performance at test time. Although
poisoning has been acknowledged as a relevant threat in industry applications,
and a variety of different attacks and defenses have been proposed so far, a
complete systematization and critical review of the field is still missing. In
this survey, we provide a comprehensive systematization of poisoning attacks
and defenses in machine learning, reviewing more than 200 papers published in
the field in the last 15 years. We start by categorizing the current threat
models and attacks, and then organize existing defenses accordingly. While we
focus mostly on computer-vision applications, we argue that our systematization
also encompasses state-of-the-art attacks and defenses for other data
modalities. Finally, we discuss existing resources for research in poisoning,
and shed light on the current limitations and open research questions in this
research field.
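
To make the threat model concrete, the following is a minimal, purely illustrative sketch of the simplest attack in this space, label flipping, written in Python with scikit-learn; the dataset, flip rate, and model are toy assumptions, not taken from the survey.

    # Hypothetical sketch: label-flipping poisoning on a toy classifier.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    def flip_labels(y, rate, rng):
        """Flip a fraction `rate` of binary labels: the crudest poisoning attack."""
        y = y.copy()
        idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
        y[idx] = 1 - y[idx]
        return y

    rng = np.random.default_rng(0)
    clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    poisoned = LogisticRegression(max_iter=1000).fit(X_tr, flip_labels(y_tr, 0.3, rng))
    print("clean test accuracy:   ", clean.score(X_te, y_te))
    print("poisoned test accuracy:", poisoned.score(X_te, y_te))

Running this shows the poisoned model's test accuracy dropping relative to the clean baseline, which is exactly the integrity violation the survey systematizes.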
Related papers
- Data and Model Poisoning Backdoor Attacks on Wireless Federated Learning, and the Defense Mechanisms: A Comprehensive Survey [28.88186038735176]
Federated Learning (FL) has been increasingly considered for applications in wireless communication networks (WCNs).
In general, the non-independent and identically distributed (non-IID) data of WCNs raises concerns about robustness.
This survey provides a comprehensive review of the latest backdoor attacks and defense mechanisms.
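
As a hedged illustration of the backdoor attacks reviewed there, the sketch below shows the basic trigger-injection step on image-shaped data; the patch, poisoning rate, and array shapes are assumptions for illustration only.

    # Hypothetical sketch of backdoor (trigger) data poisoning on images.
    import numpy as np

    def add_trigger(images, labels, target_class, rate, rng):
        """Stamp a small white patch on a fraction of images and relabel them.
        A model trained on this data tends to map any patched input to
        target_class at test time while behaving normally otherwise."""
        images, labels = images.copy(), labels.copy()
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        images[idx, -3:, -3:] = 1.0   # 3x3 trigger in the bottom-right corner
        labels[idx] = target_class
        return images, labels

    rng = np.random.default_rng(0)
    x = rng.random((100, 28, 28))        # stand-in for a grayscale image batch
    y = rng.integers(0, 10, size=100)    # stand-in labels
    x_pois, y_pois = add_trigger(x, y, target_class=7, rate=0.05, rng=rng)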
arXiv Detail & Related papers (2023-12-14T05:52:29Z)
- Designing an attack-defense game: how to increase robustness of financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
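
MESAS's actual algorithm is not spelled out in this summary and is not reproduced here; purely as a loose stand-in for the multi-metric idea, the sketch below drops federated client updates whose summary statistics are robust-z-score outliers (the chosen metrics and threshold are assumptions).

    # Illustrative only: a generic multi-metric outlier filter over client
    # updates; this is NOT the MESAS algorithm.
    import numpy as np

    def filter_updates(updates, z_thresh=2.5):
        """Keep updates whose L2 norm, mean, and variance all lie within
        z_thresh median-absolute-deviation units of the cohort median."""
        m = np.array([[np.linalg.norm(u), u.mean(), u.var()] for u in updates])
        med = np.median(m, axis=0)
        mad = np.median(np.abs(m - med), axis=0) + 1e-12
        keep = (np.abs(m - med) / mad < z_thresh).all(axis=1)
        return [u for u, k in zip(updates, keep) if k]

    rng = np.random.default_rng(0)
    updates = [rng.normal(0, 1, 100) for _ in range(9)] + [rng.normal(5, 1, 100)]
    print(len(filter_updates(updates)), "of", len(updates), "updates kept")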
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
- Temporal Robustness against Data Poisoning [69.01705108817785]
Data poisoning considers cases when an adversary manipulates the behavior of machine learning algorithms through malicious training data.
We propose a temporal threat model of data poisoning with two novel metrics, earliness and duration, which respectively measure how far in advance an attack started and how long it lasted.
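
Under one plausible reading of those two metrics (the paper's formal definitions may differ; the timestamps below are invented), they reduce to simple arithmetic over the poisoned samples' arrival times:

    # Hedged sketch: earliness and duration from poisoned-sample timestamps.
    def earliness(poison_times, deployment_time):
        """How far in advance of model deployment the attack began."""
        return deployment_time - min(poison_times)

    def duration(poison_times):
        """How long the attack lasted, first to last poisoned sample."""
        return max(poison_times) - min(poison_times)

    poison_times = [3, 5, 9]   # e.g., days on which poisoned data arrived
    print(earliness(poison_times, deployment_time=30))   # 27
    print(duration(poison_times))                        # 6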
arXiv Detail & Related papers (2023-02-07T18:59:19Z)
- Poisoning Attacks and Defenses on Artificial Intelligence: A Survey [3.706481388415728]
Data poisoning attacks represent a type of attack that consists of tampering with the data samples fed to the model during the training phase, leading to a degradation in the model's accuracy during the inference phase.
This work compiles the most relevant insights and findings in the latest literature addressing this type of attack.
A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions.
arXiv Detail & Related papers (2022-02-21T14:43:38Z)
- Influence Based Defense Against Data Poisoning Attacks in Online Learning [9.414651358362391]
Data poisoning is an attack in which an adversary manipulates a fraction of the training data to degrade the performance of a machine learning model.
We propose a defense mechanism that minimizes the degradation poisoned training data causes to a learner's model in an online setup.
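
The paper's influence computation is not reproduced here; as a simplified stand-in, the sketch below runs online least-squares SGD and skips points whose loss under the current model is an extreme outlier relative to a recent window (the window size and threshold are assumptions).

    # Hedged sketch: outlier-filtered online SGD, a simplification of
    # influence-based defenses rather than the paper's exact mechanism.
    import numpy as np

    def online_sgd_with_filter(stream, w, lr=0.1, window=50, z_thresh=3.0):
        recent = []
        for x, y in stream:
            loss = (x @ w - y) ** 2
            if len(recent) >= window:
                mu, sd = np.mean(recent), np.std(recent) + 1e-12
                if (loss - mu) / sd > z_thresh:
                    continue               # likely poisoned; skip the update
            recent = (recent + [loss])[-window:]
            w -= lr * 2 * (x @ w - y) * x  # SGD step on the squared error
        return w

    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -2.0])
    stream = [(x, x @ w_true + rng.normal(0, 0.1)) for x in rng.normal(size=(500, 2))]
    print(online_sgd_with_filter(stream, w=np.zeros(2)))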
arXiv Detail & Related papers (2021-04-24T08:39:13Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop a unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Data Poisoning Attacks on Regression Learning and Corresponding Defenses [0.0]
Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset.
We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack.
As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent after inserting only two percent of poisoned samples.
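
The paper's black-box attack itself is not reproduced here; the toy experiment below merely demonstrates the qualitative effect, namely that a two percent poison budget can already inflate a linear regressor's test MSE (the data distribution and poison targets are assumptions).

    # Toy demonstration: 2% poisoned points vs. clean linear regression.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    w = rng.normal(size=5)
    y = X @ w + rng.normal(0, 0.1, size=1000)

    n_poison = int(0.02 * len(X))                       # two percent budget
    X_p = np.vstack([X, rng.normal(size=(n_poison, 5))])
    y_p = np.concatenate([y, np.full(n_poison, 50.0)])  # extreme targets

    X_te = rng.normal(size=(500, 5))
    y_te = X_te @ w
    mse_clean = mean_squared_error(y_te, LinearRegression().fit(X, y).predict(X_te))
    mse_pois = mean_squared_error(y_te, LinearRegression().fit(X_p, y_p).predict(X_te))
    print(f"test MSE clean={mse_clean:.4f}  poisoned={mse_pois:.4f}")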
arXiv Detail & Related papers (2020-09-15T12:14:54Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
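
The gradient-matching idea at the core of this attack can be sketched as an objective in PyTorch: craft poisoned inputs whose training gradient aligns with the gradient that would push a target sample toward the adversarial label. Restarts, augmentation, and ensembling from the paper are omitted, and the function is an illustrative reconstruction, not the authors' code.

    # Hedged sketch of a gradient-matching poisoning objective (PyTorch).
    import torch
    import torch.nn.functional as F

    def gradient_matching_loss(model, x_poison, y_poison, x_target, y_adv):
        """1 - cosine similarity between the poison batch's training gradient
        and the adversarial gradient on the target sample."""
        params = [p for p in model.parameters() if p.requires_grad]
        g_tgt = torch.autograd.grad(
            F.cross_entropy(model(x_target), y_adv), params)
        g_poi = torch.autograd.grad(
            F.cross_entropy(model(x_poison), y_poison), params,
            create_graph=True)  # keep the graph so x_poison can be optimized
        num = sum((gp * gt.detach()).sum() for gp, gt in zip(g_poi, g_tgt))
        den = (torch.sqrt(sum((gp ** 2).sum() for gp in g_poi)) *
               torch.sqrt(sum((gt.detach() ** 2).sum() for gt in g_tgt)))
        return 1 - num / den

    # Usage idea: set x_poison = clean_images + delta with delta.requires_grad_(),
    # then repeatedly backpropagate this loss into delta under a norm bound.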
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
- With Great Dispersion Comes Greater Resilience: Efficient Poisoning Attacks and Defenses for Linear Regression Models [28.680562906669216]
We analyze how attackers may interfere with the results of regression learning by poisoning datasets.
Our attack, termed Nopt, can produce larger errors with the same proportion of poisoned data points.
Our new defense algorithm, termed Proda, demonstrates increased effectiveness in reducing errors.
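
Proda's mechanism is not detailed in this summary; as a generic stand-in for poisoning-robust regression, the sketch below applies the classic trimmed-loss idea of iteratively refitting on the points with the smallest residuals (the kept fraction and iteration count are assumptions).

    # Illustrative trimmed-loss regression defense; NOT the Proda algorithm.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def trimmed_regression(X, y, keep_frac=0.9, n_iters=10):
        """Alternate between fitting and keeping the best-fitting points,
        which tends to exclude high-residual poisoned samples."""
        idx = np.arange(len(X))
        for _ in range(n_iters):
            model = LinearRegression().fit(X[idx], y[idx])
            residuals = (model.predict(X) - y) ** 2
            idx = np.argsort(residuals)[: int(keep_frac * len(X))]
        return model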
arXiv Detail & Related papers (2020-06-21T22:36:42Z)