Attackers Can Do Better: Over- and Understated Factors of Model Stealing Attacks
- URL: http://arxiv.org/abs/2503.06188v1
- Date: Sat, 08 Mar 2025 12:18:52 GMT
- Title: Attackers Can Do Better: Over- and Understated Factors of Model Stealing Attacks
- Authors: Daryna Oliynyk, Rudolf Mayer, Andreas Rauber,
- Abstract summary: Substitute model training is an all-encompassing attack applicable to any machine learning model.<n>We examine the ambivalent influence of different factors resulting from varying the attacker's capabilities and knowledge on a substitute training attack.<n>Our results often exceed or match the performance of previous attacks that assume a stronger attacker.
- Score: 0.294944680995069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models were shown to be vulnerable to model stealing attacks, which lead to intellectual property infringement. Among other methods, substitute model training is an all-encompassing attack applicable to any machine learning model whose behaviour can be approximated from input-output queries. Whereas prior works mainly focused on improving the performance of substitute models by, e.g. developing a new substitute training method, there have been only limited ablation studies on the impact the attacker's strength has on the substitute model's performance. As a result, different authors came to diverse, sometimes contradicting, conclusions. In this work, we exhaustively examine the ambivalent influence of different factors resulting from varying the attacker's capabilities and knowledge on a substitute training attack. Our findings suggest that some of the factors that have been considered important in the past are, in fact, not that influential; instead, we discover new correlations between attack conditions and success rate. In particular, we demonstrate that better-performing target models enable higher-fidelity attacks and explain the intuition behind this phenomenon. Further, we propose to shift the focus from the complexity of target models toward the complexity of their learning tasks. Therefore, for the substitute model, rather than aiming for a higher architecture complexity, we suggest focusing on getting data of higher complexity and an appropriate architecture. Finally, we demonstrate that even in the most limited data-free scenario, there is no need to overcompensate weak knowledge with millions of queries. Our results often exceed or match the performance of previous attacks that assume a stronger attacker, suggesting that these stronger attacks are likely endangering a model owner's intellectual property to a significantly higher degree than shown until now.
Related papers
- Adversarial Training for Defense Against Label Poisoning Attacks [53.893792844055106]
Label poisoning attacks pose significant risks to machine learning models.<n>We propose a novel adversarial training defense strategy based on support vector machines (SVMs) to counter these threats.<n>Our approach accommodates various model architectures and employs a projected gradient descent algorithm with kernel SVMs for adversarial training.
arXiv Detail & Related papers (2025-02-24T13:03:19Z) - Efficient Data-Free Model Stealing with Label Diversity [22.8804507954023]
Machine learning as a Service (ML) allows users to query the machine learning model in an API manner, which provides an opportunity for users to enjoy the benefits brought by the high-performance model trained on valuable data.
This interface boosts the proliferation of machine learning based applications, while on the other hand, it introduces the attack surface for model stealing attacks.
Existing model stealing attacks have relaxed their attack assumptions to the data-free setting, while keeping the effectiveness.
In this paper, we revisit the model stealing problem from a diversity perspective and demonstrate that keeping the generated data samples more diverse across all the classes is the critical point
arXiv Detail & Related papers (2024-03-29T18:52:33Z) - Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z) - The Space of Adversarial Strategies [6.295859509997257]
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade.
We propose a systematic approach to characterize worst-case (i.e., optimal) adversaries.
arXiv Detail & Related papers (2022-09-09T20:53:11Z) - Membership-Doctor: Comprehensive Assessment of Membership Inference
Against Machine Learning Models [11.842337448801066]
We present a large-scale measurement of different membership inference attacks and defenses.
We find that some assumptions of the threat model, such as same-architecture and same-distribution between shadow and target models, are unnecessary.
We are also the first to execute attacks on the real-world data collected from the Internet, instead of laboratory datasets.
arXiv Detail & Related papers (2022-08-22T17:00:53Z) - Recent improvements of ASR models in the face of adversarial attacks [28.934863462633636]
Speech Recognition models are vulnerable to adversarial attacks.
We show that the relative strengths of different attack algorithms vary considerably when changing the model architecture.
We release our source code as a package that should help future research in evaluating their attacks and defenses.
arXiv Detail & Related papers (2022-03-29T22:40:37Z) - Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z) - Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z) - ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine
Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z) - Learning to Attack: Towards Textual Adversarial Attacking in Real-world
Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.