Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks,
and Defenses
- URL: http://arxiv.org/abs/2012.10544v4
- Date: Wed, 31 Mar 2021 22:21:34 GMT
- Title: Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks,
and Defenses
- Authors: Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi
Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
- Abstract summary: This work systematically categorizes and discusses a wide range of dataset vulnerabilities and exploits.
In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
- Score: 150.64470864162556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As machine learning systems grow in scale, so do their training data
requirements, forcing practitioners to automate and outsource the curation of
training data in order to achieve state-of-the-art performance. The absence of
trustworthy human supervision over the data collection process exposes
organizations to security vulnerabilities; training data can be manipulated to
control and degrade the downstream behaviors of learned models. The goal of
this work is to systematically categorize and discuss a wide range of dataset
vulnerabilities and exploits, approaches for defending against these threats,
and an array of open problems in this space. In addition to describing various
poisoning and backdoor threat models and the relationships among them, we
develop their unified taxonomy.
Related papers
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z) - Threats, Attacks, and Defenses in Machine Unlearning: A Survey [15.05662521329346]
Machine Unlearning (MU) has gained considerable attention recently for its potential to achieve Safe AI.
This survey aims to fill the gap between the extensive number of studies on threats, attacks, and defenses in machine unlearning.
arXiv Detail & Related papers (2024-03-20T15:40:18Z) - Data and Model Poisoning Backdoor Attacks on Wireless Federated
Learning, and the Defense Mechanisms: A Comprehensive Survey [28.88186038735176]
Federated Learning (FL) has been increasingly considered for applications to wireless communication networks (WCNs)
In general, non-independent and identically distributed (non-IID) data of WCNs raises concerns about robustness.
This survey provides a comprehensive review of the latest backdoor attacks and defense mechanisms.
arXiv Detail & Related papers (2023-12-14T05:52:29Z) - FedDefender: Client-Side Attack-Tolerant Federated Learning [60.576073964874]
Federated learning enables learning from decentralized data sources without compromising privacy.
It is vulnerable to model poisoning attacks, where malicious clients interfere with the training process.
We propose a new defense mechanism that focuses on the client-side, called FedDefender, to help benign clients train robust local models.
arXiv Detail & Related papers (2023-07-18T08:00:41Z) - Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models [53.416234157608]
We investigate security concerns of the emergent instruction tuning paradigm, that models are trained on crowdsourced datasets with task instructions to achieve superior performance.
Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions and control model behavior through data poisoning.
arXiv Detail & Related papers (2023-05-24T04:27:21Z) - It Is All About Data: A Survey on the Effects of Data on Adversarial
Robustness [4.1310970179750015]
Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to confuse the model into making a mistake.
To address this problem, the area of adversarial robustness investigates mechanisms behind adversarial attacks and defenses against these attacks.
arXiv Detail & Related papers (2023-03-17T04:18:03Z) - Wild Patterns Reloaded: A Survey of Machine Learning Security against
Training Data Poisoning [32.976199681542845]
We provide a comprehensive systematization of poisoning attacks and defenses in machine learning.
We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly.
We argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities.
arXiv Detail & Related papers (2022-05-04T11:00:26Z) - A Tutorial on Adversarial Learning Attacks and Countermeasures [0.0]
A machine learning model is capable of making highly accurate predictions without being explicitly programmed to do so.
adversarial learning attacks pose a serious security threat that greatly undermines further such systems.
This paper provides a detailed tutorial on the principles of adversarial learning, explains the different attack scenarios, and gives an in-depth insight into the state-of-art defense mechanisms against this rising threat.
arXiv Detail & Related papers (2022-02-21T17:14:45Z) - Adversarial defense for automatic speaker verification by cascaded
self-supervised learning models [101.42920161993455]
More and more malicious attackers attempt to launch adversarial attacks at automatic speaker verification (ASV) systems.
We propose a standard and attack-agnostic method based on cascaded self-supervised learning models to purify the adversarial perturbations.
Experimental results demonstrate that the proposed method achieves effective defense performance and can successfully counter adversarial attacks.
arXiv Detail & Related papers (2021-02-14T01:56:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.