Classification under strategic adversary manipulation using pessimistic bilevel optimisation
- URL: http://arxiv.org/abs/2410.20284v1
- Date: Sat, 26 Oct 2024 22:27:21 GMT
- Title: Classification under strategic adversary manipulation using pessimistic bilevel optimisation
- Authors: David Benfield, Stefano Coniglio, Martin Kunc, Phan Tu Vuong, Alain Zemkoho
- Abstract summary: Adversarial machine learning concerns situations in which learners face attacks from active adversaries.
Such scenarios arise in applications such as spam email filtering, malware detection and fake-image generation.
We model these interactions between the learner and the adversary as a game and formulate the problem as a pessimistic bilevel optimisation problem.
- Score: 2.6505619784178047
- Abstract: Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as spam email filtering, malware detection and fake-image generation, where security methods must be actively updated to keep up with the ever-improving generation of malicious data. We model these interactions between the learner and the adversary as a game and formulate the problem as a pessimistic bilevel optimisation problem with the learner taking the role of the leader. The adversary, modelled as a stochastic data generator, takes the role of the follower, generating data in response to the classifier. While existing models rely on the assumption that the adversary will choose the least costly solution, leading to a convex lower-level problem with a unique solution, we present a novel model and solution method which do not make such assumptions. We compare these to the existing approach and see significant improvements in performance, suggesting that relaxing these assumptions leads to a more realistic model.
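For context, the pessimistic bilevel structure described in the abstract can be sketched generically as follows; the symbols below ($x$, $y$, $F$, $f$, $S$) are standard bilevel-optimisation placeholders, not the paper's own notation:

$$
\min_{x \in X} \ \max_{y \in S(x)} F(x, y)
\quad \text{where} \quad
S(x) = \operatorname*{arg\,min}_{y \in Y} f(x, y).
$$

Here $x$ is the leader's decision (the classifier's parameters), $F$ is the learner's loss, $f$ is the adversary's cost, and $S(x)$ is the follower's set of optimal responses. The pessimistic reading charges the leader with the worst optimal response of the follower; when $S(x)$ is not a singleton, which is precisely the situation this paper does not assume away, this differs from the optimistic formulation.

As a loose illustration of the leader-follower interaction only, not of the paper's solution method, the sketch below alternates between fitting a classifier (leader) and letting a stochastic adversary (follower) regenerate malicious data in response to it. All function names, data, and parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_classifier(X, y, lam=1e-2, iters=200, lr=0.1):
    """Leader: fit an L2-regularised logistic-regression weight vector."""
    w = np.zeros(X.shape[1])
    t = (y + 1) / 2  # map labels {-1, +1} -> {0, 1}
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * (X.T @ (p - t) / len(y) + lam * w)
    return w

def adversary_respond(w, X_mal, step=0.5, noise=0.1):
    """Follower: a stochastic data generator that shifts malicious points
    against the current decision boundary, plus random noise."""
    direction = -w / (np.linalg.norm(w) + 1e-12)
    return X_mal + step * direction + noise * rng.standard_normal(X_mal.shape)

# Toy data: benign points labelled -1, malicious points labelled +1.
X_ben = rng.normal(loc=[-2.0, -2.0], size=(100, 2))
X_mal = rng.normal(loc=[2.0, 2.0], size=(100, 2))
y = np.concatenate([-np.ones(100), np.ones(100)])

for rnd in range(5):
    X = np.vstack([X_ben, X_mal])
    w = train_classifier(X, y)               # leader commits to a classifier
    acc = np.mean(np.sign(X @ w) == y)
    print(f"round {rnd}: accuracy on current data = {acc:.2f}")
    X_mal = adversary_respond(w, X_mal)      # follower reacts to the leader
```

A pessimistic solution would instead have the learner optimise against the worst of the adversary's optimal responses up front; the loop above only makes the interaction structure concrete.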
Related papers
- Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning [2.9158689853305693]
We consider a model-based reinforcement learning algorithm that infers the system dynamics from the available data and performs policy optimization on imaginary model rollouts.
This approach is vulnerable to the exploitation of model errors, which can lead to catastrophic failures on the real system.
We show that better performance can be obtained with a single well-calibrated autoregressive model on the D4RL benchmark.
arXiv Detail & Related papers (2024-02-05T10:18:15Z) - SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z) - Segue: Side-information Guided Generative Unlearnable Examples for Facial Privacy Protection in Real World [64.4289385463226]
We propose Segue: Side-information guided generative unlearnable examples.
To improve transferability, we introduce side information such as true labels and pseudo labels.
It can resist JPEG compression, adversarial training, and some standard data augmentations.
arXiv Detail & Related papers (2023-10-24T06:22:37Z) - Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks [15.882687207499373]
No-box adversarial attacks are becoming more practical and challenging for AI systems.
This paper recasts adversarial attack as a downstream task by introducing foundation models as surrogate models.
arXiv Detail & Related papers (2023-07-13T08:10:48Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Graceful Degradation and Related Fields [0.0]
Graceful degradation refers to the optimisation of model performance as a model encounters out-of-distribution data.
This work presents a definition and discussion of graceful degradation and where it can be applied in deployed visual systems.
arXiv Detail & Related papers (2021-06-21T13:56:41Z) - Social NCE: Contrastive Learning of Socially-aware Motion Representations [87.82126838588279]
Experimental results show that the proposed method dramatically reduces the collision rates of recent trajectory forecasting, behavioral cloning and reinforcement learning algorithms.
Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to promote the robustness of neural motion models.
arXiv Detail & Related papers (2020-12-21T22:25:06Z) - Linear Classifiers that Encourage Constructive Adaptation [6.324366770332667]
We study the dynamics of prediction and adaptation as a two-stage game, and characterize optimal strategies for the model designer and its decision subjects.
In benchmarks on simulated and real-world datasets, we find that classifiers trained using our method maintain the accuracy of existing approaches while inducing higher levels of improvement and less manipulation.
arXiv Detail & Related papers (2020-10-31T20:35:32Z) - Query-Free Adversarial Transfer via Undertrained Surrogates [14.112444998191698]
We introduce a new method for improving the efficacy of adversarial attacks in a black-box setting by undertraining the surrogate model on which the attacks are generated.
We show that this method transfers well across architectures and outperforms state-of-the-art methods by a wide margin.
arXiv Detail & Related papers (2020-07-01T23:12:22Z) - Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)