Attacks on Online Learners: a Teacher-Student Analysis
- URL: http://arxiv.org/abs/2305.11132v2
- Date: Sun, 29 Oct 2023 12:44:49 GMT
- Title: Attacks on Online Learners: a Teacher-Student Analysis
- Authors: Riccardo Giuseppe Margiotta, Sebastian Goldt, Guido Sanguinetti
- Abstract summary: We study the case of adversarial attacks on machine learning models in an online learning setting.
We prove that a discontinuous transition in the learner's accuracy occurs when the attack strength exceeds a critical threshold.
Our findings show that greedy attacks can be extremely efficient, especially when data stream in small batches.
- Score: 8.567831574941252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are famously vulnerable to adversarial attacks: small
ad-hoc perturbations of the data that can catastrophically alter the model
predictions. While a large literature has studied the case of test-time attacks
on pre-trained models, the important case of attacks in an online learning
setting has received little attention so far. In this work, we use a
control-theoretical perspective to study the scenario where an attacker may
perturb data labels to manipulate the learning dynamics of an online learner.
We perform a theoretical analysis of the problem in a teacher-student setup,
considering different attack strategies, and obtaining analytical results for
the steady state of simple linear learners. These results enable us to prove
that a discontinuous transition in the learner's accuracy occurs when the
attack strength exceeds a critical threshold. We then empirically study attacks
on learners with complex architectures using real data, confirming the insights
of our theoretical analysis. Our findings show that greedy attacks can be
extremely efficient, especially when data stream in small batches.
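The setting above can be made concrete with a short, self-contained sketch (not the authors' code): an online linear student learns from a linear teacher by SGD while a greedy attacker mixes each clean label with the label of a "nefarious" target. The Gaussian data stream, the negated-teacher target, the convex label-mixing rule, and all hyper-parameters are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch of label-perturbation attacks on an online linear learner in a
# teacher-student setup. All modelling choices here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, lr, steps = 20, 0.01, 5000

teacher = rng.normal(size=d) / np.sqrt(d)   # clean labels come from this vector
target = -teacher                           # attacker's "nefarious" target (assumed)

def run(strength):
    """Train an online linear student on perturbed labels; return its overlap with the teacher."""
    w = np.zeros(d)
    for _ in range(steps):
        x = rng.normal(size=d)              # one streaming example per step
        y_clean = x @ teacher
        y_adv = x @ target
        # Greedy attacker: mix clean and target labels with weight `strength`.
        y = (1.0 - strength) * y_clean + strength * y_adv
        # One SGD step on the squared loss of the (perturbed) label.
        w -= lr * x * (x @ w - y)
    return float(w @ teacher / (np.linalg.norm(w) * np.linalg.norm(teacher) + 1e-12))

# Sweep the attack strength: the student's alignment with the teacher flips
# abruptly once the strength crosses a critical value.
for a in np.linspace(0.0, 1.0, 11):
    print(f"attack strength {a:.1f} -> teacher overlap {run(a):+.3f}")
```

In this toy version the effective teacher seen by the student is (1 - 2a) times the true teacher, so the normalized overlap jumps from +1 to -1 as the attack strength a crosses 0.5, a caricature of the discontinuous transition derived analytically in the paper.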
Related papers
- Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm? [23.2883929808036]
We investigate the relationship between data importance and machine learning attacks by analyzing five distinct attack types.
For example, we observe that high importance data samples exhibit increased vulnerability in certain attacks, such as membership inference and model stealing.
These findings emphasize the urgent need for innovative defense mechanisms that strike a balance between maximizing utility and safeguarding valuable data.
arXiv Detail & Related papers (2024-09-05T17:54:26Z)
- Analyzing the Impact of Adversarial Examples on Explainable Machine Learning [0.31498833540989407]
Adversarial attacks are a type of attack on machine learning models where an attacker deliberately modifies the inputs to cause the model to make incorrect predictions.
Work on the vulnerability of deep learning models to adversarial attacks has shown that it is easy to craft samples that cause a model to produce predictions its designers did not intend.
In this work, we analyze the impact of adversarial attacks on model interpretability for text classification problems.
arXiv Detail & Related papers (2023-07-17T08:50:36Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection [12.726403718158082]
Social networks are polluted by rumors, which can be detected by machine learning models.
Certain vulnerabilities of these detectors stem from their dependence on the graph structure and on suspiciousness ranking.
Against a black-box detector, we design features that capture these dependencies, allowing a reinforcement learning agent to learn an effective and interpretable attack policy.
arXiv Detail & Related papers (2022-01-15T10:06:29Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of the privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of the privacy risk in machine learning models (a minimal loss-threshold illustration of membership inference is sketched after this list).
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
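As a toy illustration of the membership-inference idea referenced in the list above (Enhanced Membership Inference Attacks against Machine Learning Models), here is a minimal loss-threshold baseline, not the enhanced attacks of that paper: a model's per-example loss tends to be lower on its training members than on held-out points, so the negative loss can serve as a membership score. The dataset, model, and scoring choices are illustrative assumptions.

```python
# Minimal loss-threshold membership-inference sketch (a classic baseline, not the
# paper's enhanced attacks). Dataset and model are illustrative choices.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# Target model trained only on the "member" half of the data.
model = LogisticRegression(max_iter=5000).fit(X_in, y_in)

def per_example_loss(m, X, y):
    """Per-example cross-entropy of the target model."""
    p = np.clip(m.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    return -np.log(p)

# Membership score: lower loss -> more likely to be a training member.
scores = np.concatenate([-per_example_loss(model, X_in, y_in),
                         -per_example_loss(model, X_out, y_out)])
is_member = np.concatenate([np.ones(len(y_in)), np.zeros(len(y_out))])
print("membership-inference AUC:", round(roc_auc_score(is_member, scores), 3))
```

A well-regularized linear model like this barely overfits, so the resulting AUC is typically only slightly above 0.5; the point of the enhanced attacks summarized above is precisely to extract far more membership signal than this naive threshold.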
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.