Poisoning Attacks and Defenses on Artificial Intelligence: A Survey
- URL: http://arxiv.org/abs/2202.10276v2
- Date: Tue, 22 Feb 2022 07:20:43 GMT
- Title: Poisoning Attacks and Defenses on Artificial Intelligence: A Survey
- Authors: Miguel A. Ramirez, Song-Kyoo Kim, Hussam Al Hamadi, Ernesto Damiani,
Young-Ji Byon, Tae-Yeon Kim, Chung-Suk Cho and Chan Yeob Yeun
- Abstract summary: Data poisoning attacks are a type of attack in which the data samples fed to the model during the training phase are tampered with, degrading the model's accuracy during the inference phase.
This work compiles the most relevant insights and findings from the latest literature addressing this type of attack.
A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions.
- Score: 3.706481388415728
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models have been widely adopted in several fields. However,
recent studies have revealed several vulnerabilities to attacks that can
jeopardize the integrity of the model, presenting a new window of research
opportunity in terms of cyber-security. The main intention of this survey is to
highlight the most relevant information related to security vulnerabilities of
machine learning (ML) classifiers; more specifically, it focuses on training
procedures and their exposure to data poisoning attacks, a type of attack in
which the data samples fed to the model during the training phase are tampered
with, degrading the model's accuracy during the inference phase. This work
compiles the most relevant insights and findings from the latest literature
addressing this type of attack. Moreover, this paper covers several defense
techniques that promise feasible detection and mitigation mechanisms, capable
of conferring a certain level of robustness to a target model against an
attacker. A thorough assessment is performed on the reviewed works, comparing
the effects of data poisoning on a wide range of ML models in real-world
conditions, with both quantitative and qualitative analyses. This paper
analyzes the main characteristics of each approach, including performance
success metrics, required hyperparameters, and deployment complexity. Moreover,
it emphasizes the underlying assumptions and limitations considered by both
attackers and defenders, along with their intrinsic properties such as
availability, reliability, privacy, accountability, and interpretability.
Finally, this paper concludes by referencing some of the main existing research
trends that point towards future research directions in the field of
cyber-security.
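
To make the threat model concrete, the following is a minimal sketch of a label-flipping poisoning attack together with a simple nearest-neighbour sanitization defense. The dataset, classifier, poisoning rate, and kNN-based filter are illustrative assumptions chosen for brevity; they are not the method of any specific work reviewed in the survey.

```python
# Minimal sketch: label-flipping data poisoning plus a simple sanitization
# defense. All choices below (dataset, model, poisoning rate, kNN filter)
# are illustrative assumptions, not taken from any surveyed approach.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Attack: flip the labels of a fraction of the training samples.
poison_rate = 0.2
n_poison = int(poison_rate * len(y_train))
poison_idx = rng.choice(len(y_train), size=n_poison, replace=False)
y_poisoned = y_train.copy()
y_poisoned[poison_idx] = 1 - y_poisoned[poison_idx]  # binary label flip

clean_acc = LogisticRegression(max_iter=1000).fit(X_train, y_train).score(X_test, y_test)
poisoned_acc = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned).score(X_test, y_test)

# Defense: drop training points whose label disagrees with the majority
# vote of their nearest neighbours, then retrain on the sanitized set.
knn = KNeighborsClassifier(n_neighbors=10).fit(X_train, y_poisoned)
neighbour_vote = knn.predict(X_train)
keep = neighbour_vote == y_poisoned
defended_acc = (LogisticRegression(max_iter=1000)
                .fit(X_train[keep], y_poisoned[keep])
                .score(X_test, y_test))

print(f"clean: {clean_acc:.3f}  poisoned: {poisoned_acc:.3f}  defended: {defended_acc:.3f}")
```

The defense here is deliberately simple: a training point whose label disagrees with the majority of its neighbours is treated as suspicious and removed before fitting, which is in the spirit of the data-sanitization defenses discussed in the survey.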
Related papers
- Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models [3.7548609506798494]
This paper investigates the use of a pre-trained language model and siamese network to discern sibling relationships between text-based cybersecurity vulnerability data.
arXiv Detail & Related papers (2024-10-07T13:05:33Z)
- A Survey and Evaluation of Adversarial Attacks for Object Detection [11.48212060875543]
Deep learning models excel in various computer vision tasks but are susceptible to adversarial examples: subtle perturbations in input data that lead to incorrect predictions.
This vulnerability poses significant risks in safety-critical applications such as autonomous vehicles, security surveillance, and aircraft health monitoring.
arXiv Detail & Related papers (2024-08-04T05:22:08Z)
- Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack [24.954755569786396]
We propose a framework for a broader class of adversarial attacks, designed to perform minor perturbations in machine-generated content to evade detection.
We consider two attack settings: white-box and black-box, and employ adversarial learning in dynamic scenarios to assess the potential enhancement of the current detection model's robustness.
The empirical results reveal that the current detection models can be compromised in as little as 10 seconds, leading to the misclassification of machine-generated text as human-written content.
arXiv Detail & Related papers (2024-04-02T12:49:22Z)
- Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models [18.624280305864804]
Large Language Models (LLMs) have become a cornerstone in the field of Natural Language Processing (NLP).
This paper presents a comprehensive survey of the various forms of attacks targeting LLMs.
We delve into topics such as adversarial attacks that aim to manipulate model outputs, data poisoning that affects model training, and privacy concerns related to training data exploitation.
arXiv Detail & Related papers (2024-03-03T04:46:21Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- It Is All About Data: A Survey on the Effects of Data on Adversarial Robustness [4.1310970179750015]
Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to confuse the model into making a mistake (a minimal illustration appears after this list).
To address this problem, the area of adversarial robustness investigates mechanisms behind adversarial attacks and defenses against these attacks.
arXiv Detail & Related papers (2023-03-17T04:18:03Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- A Review of Adversarial Attack and Defense for Classification Methods [78.50824774203495]
This paper focuses on the generation and guarding of adversarial examples.
It is the hope of the authors that this paper will encourage more statisticians to work on this important and exciting field of generating and defending against adversarial examples.
arXiv Detail & Related papers (2021-11-18T22:13:43Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
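
Several of the related papers above revolve around the same mechanism: a small, deliberately crafted perturbation of the input changes the model's prediction. The sketch below illustrates that idea with a fast-gradient-sign-style perturbation applied to a randomly initialized PyTorch model; the model, input, label, and epsilon value are placeholders, and this is not the method of any particular paper listed here.

```python
# Minimal sketch of the "subtle perturbation" idea behind adversarial examples.
# The model, input, label, and epsilon are illustrative placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 32, requires_grad=True)   # stand-in for a real input
y = torch.tensor([3])                        # its (assumed) true label

# Gradient of the loss with respect to the input, then a small step in its sign.
loss = loss_fn(model(x), y)
loss.backward()
epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).detach()

print("original prediction:", model(x).argmax(dim=1).item())
print("perturbed prediction:", model(x_adv).argmax(dim=1).item())
```

The same gradient-sign step is also the usual building block of the adversarial-training defenses mentioned above, where such perturbed inputs are generated during training and fed back into the model.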