Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
- URL: http://arxiv.org/abs/2205.01992v1
- Date: Wed, 4 May 2022 11:00:26 GMT
- Title: Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
- Authors: Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli
- Abstract summary: We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
- Score: 32.976199681542845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of machine learning is fueled by the increasing availability of
computing power and large training datasets. The training data is used to learn
new models or update existing ones, assuming that it is sufficiently
representative of the data that will be encountered at test time. This
assumption is challenged by the threat of poisoning, an attack that manipulates
the training data to compromise the model's performance at test time. Although
poisoning has been acknowledged as a relevant threat in industry applications,
and a variety of different attacks and defenses have been proposed so far, a
complete systematization and critical review of the field is still missing. In
this survey, we provide a comprehensive systematization of poisoning attacks
and defenses in machine learning, reviewing more than 200 papers published in
the field in the last 15 years. We start by categorizing the current threat
models and attacks, and then organize existing defenses accordingly. While we
focus mostly on computer-vision applications, we argue that our systematization
also encompasses state-of-the-art attacks and defenses for other data
modalities. Finally, we discuss existing resources for research in poisoning,
and shed light on the current limitations and open research questions in this
research field.
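
To make the threat model concrete, the following is a minimal, purely illustrative sketch of the simplest attack in this space, label flipping, written in Python with scikit-learn; the dataset, flip rate, and model are toy assumptions, not taken from the survey.

    # Hypothetical sketch: label-flipping poisoning on a toy classifier.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    def flip_labels(y, rate, rng):
        """Flip a fraction `rate` of binary labels: the crudest poisoning attack."""
        y = y.copy()
        idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
        y[idx] = 1 - y[idx]
        return y

    rng = np.random.default_rng(0)
    clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    poisoned = LogisticRegression(max_iter=1000).fit(X_tr, flip_labels(y_tr, 0.3, rng))
    print("clean test accuracy:   ", clean.score(X_te, y_te))
    print("poisoned test accuracy:", poisoned.score(X_te, y_te))

Running this shows the poisoned model's test accuracy dropping relative to the clean baseline, which is exactly the integrity violation the survey systematizes.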
Related papers
- Data and Model Poisoning Backdoor Attacks on Wireless Federated Learning, and the Defense Mechanisms: A Comprehensive Survey [28.88186038735176]
Federated Learning (FL) has been increasingly considered for applications in wireless communication networks (WCNs).
In general, the non-independent and identically distributed (non-IID) data of WCNs raises concerns about robustness.
This survey provides a comprehensive review of the latest backdoor attacks and defense mechanisms.
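
As a hedged illustration of the backdoor attacks reviewed there, the sketch below shows the basic trigger-injection step on image-shaped data; the patch, poisoning rate, and array shapes are assumptions for illustration only.

    # Hypothetical sketch of backdoor (trigger) data poisoning on images.
    import numpy as np

    def add_trigger(images, labels, target_class, rate, rng):
        """Stamp a small white patch on a fraction of images and relabel them.
        A model trained on this data tends to map any patched input to
        target_class at test time while behaving normally otherwise."""
        images, labels = images.copy(), labels.copy()
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        images[idx, -3:, -3:] = 1.0   # 3x3 trigger in the bottom-right corner
        labels[idx] = target_class
        return images, labels

    rng = np.random.default_rng(0)
    x = rng.random((100, 28, 28))        # stand-in for a grayscale image batch
    y = rng.integers(0, 10, size=100)    # stand-in labels
    x_pois, y_pois = add_trigger(x, y, target_class=7, rate=0.05, rng=rng)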
arXiv Detail & Related papers (2023-12-14T05:52:29Z)
- Designing an attack-defense game: how to increase robustness of financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
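
MESAS's actual algorithm is not spelled out in this summary and is not reproduced here; purely as a loose stand-in for the multi-metric idea, the sketch below drops federated client updates whose summary statistics are robust-z-score outliers (the chosen metrics and threshold are assumptions).

    # Illustrative only: a generic multi-metric outlier filter over client
    # updates; this is NOT the MESAS algorithm.
    import numpy as np

    def filter_updates(updates, z_thresh=2.5):
        """Keep updates whose L2 norm, mean, and variance all lie within
        z_thresh median-absolute-deviation units of the cohort median."""
        m = np.array([[np.linalg.norm(u), u.mean(), u.var()] for u in updates])
        med = np.median(m, axis=0)
        mad = np.median(np.abs(m - med), axis=0) + 1e-12
        keep = (np.abs(m - med) / mad < z_thresh).all(axis=1)
        return [u for u, k in zip(updates, keep) if k]

    rng = np.random.default_rng(0)
    updates = [rng.normal(0, 1, 100) for _ in range(9)] + [rng.normal(5, 1, 100)]
    print(len(filter_updates(updates)), "of", len(updates), "updates kept")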
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
- Temporal Robustness against Data Poisoning [69.01705108817785]
Data poisoning considers cases when an adversary manipulates the behavior of machine learning algorithms through malicious training data.
We propose a temporal threat model of data poisoning with two novel metrics, earliness and duration, which respectively measure how far in advance an attack started and how long it lasted.
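
Under one plausible reading of those two metrics (the paper's formal definitions may differ; the timestamps below are invented), they reduce to simple arithmetic over the poisoned samples' arrival times:

    # Hedged sketch: earliness and duration from poisoned-sample timestamps.
    def earliness(poison_times, deployment_time):
        """How far in advance of model deployment the attack began."""
        return deployment_time - min(poison_times)

    def duration(poison_times):
        """How long the attack lasted, first to last poisoned sample."""
        return max(poison_times) - min(poison_times)

    poison_times = [3, 5, 9]   # e.g., days on which poisoned data arrived
    print(earliness(poison_times, deployment_time=30))   # 27
    print(duration(poison_times))                        # 6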
arXiv Detail & Related papers (2023-02-07T18:59:19Z)
- Poisoning Attacks and Defenses on Artificial Intelligence: A Survey [3.706481388415728]
Data poisoning attacks represent a type of attack that consists of tampering with the data samples fed to the model during the training phase, leading to a degradation in the model's accuracy during the inference phase.
This work compiles the most relevant insights and findings in the latest literature addressing this type of attack.
A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions.
arXiv Detail & Related papers (2022-02-21T14:43:38Z)
- Influence Based Defense Against Data Poisoning Attacks in Online Learning [9.414651358362391]
Data poisoning is an attack in which an adversary manipulates a fraction of the training data to degrade the performance of a machine learning model.
We propose a defense mechanism that minimizes the degradation poisoned training data causes to a learner's model in an online setup.
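
The paper's influence computation is not reproduced here; as a simplified stand-in, the sketch below runs online least-squares SGD and skips points whose loss under the current model is an extreme outlier relative to a recent window (the window size and threshold are assumptions).

    # Hedged sketch: outlier-filtered online SGD, a simplification of
    # influence-based defenses rather than the paper's exact mechanism.
    import numpy as np

    def online_sgd_with_filter(stream, w, lr=0.1, window=50, z_thresh=3.0):
        recent = []
        for x, y in stream:
            loss = (x @ w - y) ** 2
            if len(recent) >= window:
                mu, sd = np.mean(recent), np.std(recent) + 1e-12
                if (loss - mu) / sd > z_thresh:
                    continue               # likely poisoned; skip the update
            recent = (recent + [loss])[-window:]
            w -= lr * 2 * (x @ w - y) * x  # SGD step on the squared error
        return w

    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -2.0])
    stream = [(x, x @ w_true + rng.normal(0, 0.1)) for x in rng.normal(size=(500, 2))]
    print(online_sgd_with_filter(stream, w=np.zeros(2)))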
arXiv Detail & Related papers (2021-04-24T08:39:13Z)
- Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses [150.64470864162556]
This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop a unified taxonomy.
arXiv Detail & Related papers (2020-12-18T22:38:47Z)
- Data Poisoning Attacks on Regression Learning and Corresponding Defenses [0.0]
Adversarial data poisoning is an effective attack against machine learning and threatens model integrity by introducing poisoned data into the training dataset.
We present realistic scenarios in which data poisoning attacks threaten production systems and introduce a novel black-box attack.
As a result, we observe that the mean squared error (MSE) of the regressor increases to 150 percent after inserting only two percent of poisoned samples.
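
The paper's black-box attack itself is not reproduced here; the toy experiment below merely demonstrates the qualitative effect, namely that a two percent poison budget can already inflate a linear regressor's test MSE (the data distribution and poison targets are assumptions).

    # Toy demonstration: 2% poisoned points vs. clean linear regression.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    w = rng.normal(size=5)
    y = X @ w + rng.normal(0, 0.1, size=1000)

    n_poison = int(0.02 * len(X))                       # two percent budget
    X_p = np.vstack([X, rng.normal(size=(n_poison, 5))])
    y_p = np.concatenate([y, np.full(n_poison, 50.0)])  # extreme targets

    X_te = rng.normal(size=(500, 5))
    y_te = X_te @ w
    mse_clean = mean_squared_error(y_te, LinearRegression().fit(X, y).predict(X_te))
    mse_pois = mean_squared_error(y_te, LinearRegression().fit(X_p, y_p).predict(X_te))
    print(f"test MSE clean={mse_clean:.4f}  poisoned={mse_pois:.4f}")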
arXiv Detail & Related papers (2020-09-15T12:14:54Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
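
The gradient-matching idea at the core of this attack can be sketched as an objective in PyTorch: craft poisoned inputs whose training gradient aligns with the gradient that would push a target sample toward the adversarial label. Restarts, augmentation, and ensembling from the paper are omitted, and the function is an illustrative reconstruction, not the authors' code.

    # Hedged sketch of a gradient-matching poisoning objective (PyTorch).
    import torch
    import torch.nn.functional as F

    def gradient_matching_loss(model, x_poison, y_poison, x_target, y_adv):
        """1 - cosine similarity between the poison batch's training gradient
        and the adversarial gradient on the target sample."""
        params = [p for p in model.parameters() if p.requires_grad]
        g_tgt = torch.autograd.grad(
            F.cross_entropy(model(x_target), y_adv), params)
        g_poi = torch.autograd.grad(
            F.cross_entropy(model(x_poison), y_poison), params,
            create_graph=True)  # keep the graph so x_poison can be optimized
        num = sum((gp * gt.detach()).sum() for gp, gt in zip(g_poi, g_tgt))
        den = (torch.sqrt(sum((gp ** 2).sum() for gp in g_poi)) *
               torch.sqrt(sum((gt.detach() ** 2).sum() for gt in g_tgt)))
        return 1 - num / den

    # Usage idea: set x_poison = clean_images + delta with delta.requires_grad_(),
    # then repeatedly backpropagate this loss into delta under a norm bound.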
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
- With Great Dispersion Comes Greater Resilience: Efficient Poisoning Attacks and Defenses for Linear Regression Models [28.680562906669216]
We analyze how attackers may interfere with the results of regression learning by poisoning datasets.
Our attack, termed Nopt, can produce larger errors with the same proportion of poisoned data points.
Our new defense algorithm, termed Proda, demonstrates increased effectiveness in reducing errors.
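
Proda's mechanism is not detailed in this summary; as a generic stand-in for poisoning-robust regression, the sketch below applies the classic trimmed-loss idea of iteratively refitting on the points with the smallest residuals (the kept fraction and iteration count are assumptions).

    # Illustrative trimmed-loss regression defense; NOT the Proda algorithm.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    def trimmed_regression(X, y, keep_frac=0.9, n_iters=10):
        """Alternate between fitting and keeping the best-fitting points,
        which tends to exclude high-residual poisoned samples."""
        idx = np.arange(len(X))
        for _ in range(n_iters):
            model = LinearRegression().fit(X[idx], y[idx])
            residuals = (model.predict(X) - y) ** 2
            idx = np.argsort(residuals)[: int(keep_frac * len(X))]
        return model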
arXiv Detail & Related papers (2020-06-21T22:36:42Z)