PINCH: An Adversarial Extraction Attack Framework for Deep Learning Models
- URL: http://arxiv.org/abs/2209.06300v1
- Date: Tue, 13 Sep 2022 21:08:13 GMT
- Title: PINCH: An Adversarial Extraction Attack Framework for Deep Learning Models
- Authors: William Hackett, Stefan Trawicki, Zhengxin Yu, Neeraj Suri, Peter Garraghan
- Abstract summary: Deep Learning (DL) models increasingly power a diversity of applications.
This paper presents PINCH: an efficient and automated extraction attack framework capable of deploying and evaluating multiple DL models and attacks across heterogeneous hardware platforms.
- Score: 3.884583419548512
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep Learning (DL) models increasingly power a diversity of applications.
Unfortunately, this pervasiveness also makes them attractive targets for
extraction attacks which can steal the architecture, parameters, and
hyper-parameters of a targeted DL model. Existing extraction attack studies
have observed varying levels of attack success for different DL models and
datasets, yet the underlying cause(s) behind their susceptibility often remain
unclear. Ascertaining such root-cause weaknesses would help facilitate secure
DL systems, though this requires studying extraction attacks in a wide variety
of scenarios to identify commonalities across attack success and DL
characteristics. The overwhelmingly high technical effort and time required to
understand, implement, and evaluate even a single attack makes it infeasible to
explore the large number of unique extraction attack scenarios in existence,
with current frameworks typically designed to only operate for specific attack
types, datasets and hardware platforms. In this paper we present PINCH: an
efficient and automated extraction attack framework capable of deploying and
evaluating multiple DL models and attacks across heterogeneous hardware
platforms. We demonstrate the effectiveness of PINCH by empirically evaluating
a large number of previously unexplored extraction attack scenarios, as well as
secondary attack staging. Our key findings show that 1) multiple
characteristics affect extraction attack success, spanning DL model
architecture, dataset complexity, hardware, and attack type, and 2) partially
successful extraction attacks significantly enhance the success of further
adversarial attack staging.
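The staged workflow described above (query a deployed model, fit a substitute from its responses, then use that substitute to mount further attacks) can be made concrete with a short sketch. The following is not PINCH itself: the victim and surrogate architectures, the random query set, the query budget, and the single FGSM step are placeholder assumptions chosen only to illustrate the extraction-then-staging pattern in PyTorch.

```python
# Illustrative sketch only: query-based model extraction followed by secondary
# attack staging (adversarial-example transfer). Not the PINCH framework; all
# architectures, budgets, and the perturbation size are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Victim: a black box the attacker can only query for class probabilities.
victim = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))

def query_victim(x: torch.Tensor) -> torch.Tensor:
    """Attacker-visible API: inputs in, softmax probabilities out."""
    with torch.no_grad():
        return F.softmax(victim(x), dim=1)

# Step 1 -- extraction: query the victim and distil a surrogate from the
# collected input/output pairs.
queries = torch.randn(2048, 20)              # attacker's query set (stand-in)
stolen_labels = query_victim(queries)        # victim's responses

surrogate = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = F.kl_div(F.log_softmax(surrogate(queries), dim=1),
                    stolen_labels, reduction="batchmean")
    loss.backward()
    opt.step()

# Step 2 -- secondary attack staging: craft adversarial examples against the
# white-box surrogate and check whether they transfer to the victim.
x = torch.randn(256, 20, requires_grad=True)
clean_pred = query_victim(x).argmax(dim=1)   # victim's clean predictions

adv_loss = F.cross_entropy(surrogate(x), clean_pred)
adv_loss.backward()
x_adv = (x + 0.5 * x.grad.sign()).detach()   # one FGSM step on the surrogate

flipped = (query_victim(x_adv).argmax(dim=1) != clean_pred).float().mean()
print(f"victim predictions flipped by transferred examples: {flipped:.1%}")
```

In this toy setting the surrogate only approximates the victim, which mirrors the paper's observation that even a partially successful extraction can be enough to stage effective follow-on adversarial attacks.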
Related papers
- Large Language Models Merging for Enhancing the Link Stealing Attack on Graph Neural Networks [10.807912659961012]
Link stealing attacks on graph data pose a significant privacy threat.
We find that combining the data knowledge of multiple attackers yields a more effective attack model.
We propose a novel link stealing attack method that takes advantage of cross-dataset knowledge and Large Language Models.
arXiv Detail & Related papers (2024-12-08T06:37:05Z)
- Learning diverse attacks on large language models for robust red-teaming and safety tuning [126.32539952157083]
Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe deployment of large language models.
We show that even with explicit regularization to favor novelty and diversity, existing approaches suffer from mode collapse or fail to generate effective attacks.
We propose to use GFlowNet fine-tuning, followed by a secondary smoothing phase, to train the attacker model to generate diverse and effective attack prompts.
arXiv Detail & Related papers (2024-05-28T19:16:17Z)
- From Attack to Defense: Insights into Deep Learning Security Measures in Black-Box Settings [1.8006345220416338]
Adversarial samples pose a serious threat: they can cause a model to misbehave and compromise the performance of the applications built on it.
Addressing the robustness of Deep Learning models has become crucial to understanding and defending against adversarial attacks.
Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defensive mechanisms.
arXiv Detail & Related papers (2024-05-03T09:40:47Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Efficient Data-Free Model Stealing with Label Diversity [22.8804507954023]
Machine Learning as a Service (MLaaS) allows users to query a machine learning model through an API, giving them the benefits of a high-performance model trained on valuable data.
This interface boosts the proliferation of machine-learning-based applications, but it also introduces an attack surface for model stealing attacks.
Existing model stealing attacks have relaxed their attack assumptions to the data-free setting while maintaining their effectiveness.
In this paper, we revisit the model stealing problem from a diversity perspective and demonstrate that keeping the generated data samples diverse across all classes is the critical point.
arXiv Detail & Related papers (2024-03-29T18:52:33Z)
- BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z)
- Streamlining Attack Tree Generation: A Fragment-Based Approach [39.157069600312774]
We present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases.
We also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach.
arXiv Detail & Related papers (2023-10-01T12:41:38Z)
- Downlink Power Allocation in Massive MIMO via Deep Learning: Adversarial Attacks and Training [62.77129284830945]
This paper considers a regression problem in a wireless setting and shows that adversarial attacks can break the DL-based approach.
We also analyze the effectiveness of adversarial training as a defensive technique in adversarial settings and show that it significantly improves the robustness of the DL-based wireless system against attacks.
arXiv Detail & Related papers (2022-06-14T04:55:11Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, reusable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Adversarial Attacks for Multi-view Deep Models [39.07356013772198]
This paper proposes two multi-view attack strategies: the two-stage attack (TSA) and the end-to-end attack (ETEA).
The main idea of TSA is to attack the multi-view model with adversarial examples generated by attacking the associated single-view model.
The ETEA is applied to accomplish direct attacks on the target multi-view model.
arXiv Detail & Related papers (2020-06-19T08:07:09Z)