Attacks on Online Learners: a Teacher-Student Analysis
- URL: http://arxiv.org/abs/2305.11132v2
- Date: Sun, 29 Oct 2023 12:44:49 GMT
- Title: Attacks on Online Learners: a Teacher-Student Analysis
- Authors: Riccardo Giuseppe Margiotta, Sebastian Goldt, Guido Sanguinetti
- Abstract summary: We study the case of adversarial attacks on machine learning models in an online learning setting.
We prove that a discontinuous transition in the learner's accuracy occurs when the attack strength exceeds a critical threshold.
Our findings show that greedy attacks can be extremely efficient, especially when data stream in small batches.
- Score: 8.567831574941252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are famously vulnerable to adversarial attacks: small
ad-hoc perturbations of the data that can catastrophically alter the model
predictions. While a large literature has studied the case of test-time attacks
on pre-trained models, the important case of attacks in an online learning
setting has received little attention so far. In this work, we use a
control-theoretical perspective to study the scenario where an attacker may
perturb data labels to manipulate the learning dynamics of an online learner.
We perform a theoretical analysis of the problem in a teacher-student setup,
considering different attack strategies, and obtaining analytical results for
the steady state of simple linear learners. These results enable us to prove
that a discontinuous transition in the learner's accuracy occurs when the
attack strength exceeds a critical threshold. We then empirically study attacks
on learners with complex architectures using real data, confirming the insights
of our theoretical analysis. Our findings show that greedy attacks can be
extremely efficient, especially when data stream in small batches.
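The setting above can be made concrete with a short, self-contained sketch (not the authors' code): an online linear student learns from a linear teacher by SGD while a greedy attacker mixes each clean label with the label of a "nefarious" target. The Gaussian data stream, the negated-teacher target, the convex label-mixing rule, and all hyper-parameters are illustrative assumptions rather than the paper's exact protocol.

```python
# Minimal sketch of label-perturbation attacks on an online linear learner in a
# teacher-student setup. All modelling choices here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, lr, steps = 20, 0.01, 5000

teacher = rng.normal(size=d) / np.sqrt(d)   # clean labels come from this vector
target = -teacher                           # attacker's "nefarious" target (assumed)

def run(strength):
    """Train an online linear student on perturbed labels; return its overlap with the teacher."""
    w = np.zeros(d)
    for _ in range(steps):
        x = rng.normal(size=d)              # one streaming example per step
        y_clean = x @ teacher
        y_adv = x @ target
        # Greedy attacker: mix clean and target labels with weight `strength`.
        y = (1.0 - strength) * y_clean + strength * y_adv
        # One SGD step on the squared loss of the (perturbed) label.
        w -= lr * x * (x @ w - y)
    return float(w @ teacher / (np.linalg.norm(w) * np.linalg.norm(teacher) + 1e-12))

# Sweep the attack strength: the student's alignment with the teacher flips
# abruptly once the strength crosses a critical value.
for a in np.linspace(0.0, 1.0, 11):
    print(f"attack strength {a:.1f} -> teacher overlap {run(a):+.3f}")
```

In this toy version the effective teacher seen by the student is (1 - 2a) times the true teacher, so the normalized overlap jumps from +1 to -1 as the attack strength a crosses 0.5, a caricature of the discontinuous transition derived analytically in the paper.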
Related papers
- Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm? [23.2883929808036]
We investigate the relationship between data importance and machine learning attacks by analyzing five distinct attack types.
For example, we observe that high importance data samples exhibit increased vulnerability in certain attacks, such as membership inference and model stealing.
These findings emphasize the urgent need for innovative defense mechanisms that strike a balance between maximizing utility and safeguarding valuable data.
arXiv Detail & Related papers (2024-09-05T17:54:26Z)
- Analyzing the Impact of Adversarial Examples on Explainable Machine Learning [0.31498833540989407]
Adversarial attacks are a type of attack on machine learning models where an attacker deliberately modifies the inputs to cause the model to make incorrect predictions.
Work on the vulnerability of deep learning models to adversarial attacks has shown that it is easy to craft samples that cause a model to produce predictions its designers did not intend.
In this work, we analyze the impact of adversarial attacks on model interpretability for text classification problems.
arXiv Detail & Related papers (2023-07-17T08:50:36Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Interpretable and Effective Reinforcement Learning for Attacking against Graph-based Rumor Detection [12.726403718158082]
Social networks are polluted by rumors, which can be detected by machine learning models.
Certain vulnerabilities of these detectors stem from their dependence on the graph structure and on suspiciousness ranking.
Against a black-box detector, we design features that capture these dependencies, allowing a reinforcement learning agent to learn an effective and interpretable attack policy.
arXiv Detail & Related papers (2022-01-15T10:06:29Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of the privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of the privacy risk in machine learning models (a minimal loss-threshold illustration of membership inference is sketched after this list).
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
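As a toy illustration of the membership-inference idea referenced in the list above (Enhanced Membership Inference Attacks against Machine Learning Models), here is a minimal loss-threshold baseline, not the enhanced attacks of that paper: a model's per-example loss tends to be lower on its training members than on held-out points, so the negative loss can serve as a membership score. The dataset, model, and scoring choices are illustrative assumptions.

```python
# Minimal loss-threshold membership-inference sketch (a classic baseline, not the
# paper's enhanced attacks). Dataset and model are illustrative choices.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# Target model trained only on the "member" half of the data.
model = LogisticRegression(max_iter=5000).fit(X_in, y_in)

def per_example_loss(m, X, y):
    """Per-example cross-entropy of the target model."""
    p = np.clip(m.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    return -np.log(p)

# Membership score: lower loss -> more likely to be a training member.
scores = np.concatenate([-per_example_loss(model, X_in, y_in),
                         -per_example_loss(model, X_out, y_out)])
is_member = np.concatenate([np.ones(len(y_in)), np.zeros(len(y_out))])
print("membership-inference AUC:", round(roc_auc_score(is_member, scores), 3))
```

A well-regularized linear model like this barely overfits, so the resulting AUC is typically only slightly above 0.5; the point of the enhanced attacks summarized above is precisely to extract far more membership signal than this naive threshold.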
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.