Investigating White-Box Attacks for On-Device Models
- URL: http://arxiv.org/abs/2402.05493v4
- Date: Fri, 1 Mar 2024 05:22:38 GMT
- Title: Investigating White-Box Attacks for On-Device Models
- Authors: Mingyi Zhou, Xiang Gao, Jing Wu, Kui Liu, Hailong Sun, Li Li
- Abstract summary: On-device models are vulnerable to attacks as they can be easily extracted from their corresponding mobile apps.
We propose a Reverse Engineering framework for On-device Models (REOM), which automatically reverses the compiled on-device TFLite model to the debuggable model.
Our results show that REOM enables attackers to achieve higher attack success rates with a hundred times smaller attack perturbations.
- Score: 21.329209501209665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous mobile apps have leveraged deep learning capabilities. However,
on-device models are vulnerable to attacks as they can be easily extracted from
their corresponding mobile apps. Existing on-device attacking approaches only
generate black-box attacks, which are far less effective and efficient than
white-box strategies. This is because mobile deep learning frameworks like
TFLite do not support gradient computing, which is necessary for white-box
attacking algorithms. Thus, we argue that existing findings may underestimate
the harmfulness of on-device attacks. To this end, we conduct a study to answer
this research question: Can on-device models be directly attacked via white-box
strategies? We first systematically analyze the difficulties of transforming
the on-device model to its debuggable version, and propose a Reverse
Engineering framework for On-device Models (REOM), which automatically reverses
the compiled on-device TFLite model to the debuggable model. Specifically, REOM
first transforms compiled on-device models into the Open Neural Network Exchange
(ONNX) format, then removes the non-debuggable parts, and converts them into a
debuggable DL model format that attackers can exploit in a white-box
setting. Our experimental results show that our approach is effective in
achieving automated transformation among 244 TFLite models. Compared with
previous attacks using surrogate models, REOM enables attackers to achieve
higher attack success rates with a hundred times smaller attack perturbations.
In addition, because the ONNX platform has plenty of tools for model format
exchanging, the proposed method based on the ONNX platform can be adapted to
other model formats. Our findings emphasize the need for developers to
carefully consider their model deployment strategies, and use white-box methods
to evaluate the vulnerability of on-device models.
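The conversion route sketched in the abstract (compiled TFLite model, then ONNX, then a debuggable model) can be approximated with off-the-shelf converters. The snippet below is a minimal sketch assuming tf2onnx and onnx2pytorch are installed and using hypothetical file names; it is not the authors' REOM tool, and generic converters can fail on exactly the compiled or custom TFLite operators that REOM is designed to handle.

```python
# Minimal sketch of the TFLite -> ONNX -> debuggable-model route, using
# off-the-shelf converters rather than the authors' REOM framework.
# Assumptions: tf2onnx and onnx2pytorch are installed, and the file names
# below are hypothetical placeholders for a model extracted from an app.
import onnx
import tf2onnx
from onnx2pytorch import ConvertModel

# Step 1: compiled on-device TFLite model -> ONNX graph.
tf2onnx.convert.from_tflite(
    "extracted_model.tflite",           # model pulled out of the mobile app
    output_path="extracted_model.onnx",
)

# Step 2: ONNX graph -> a PyTorch module that supports autograd, i.e. a
# "debuggable" copy on which white-box (gradient-based) attacks can run.
onnx_model = onnx.load("extracted_model.onnx")
torch_model = ConvertModel(onnx_model).eval()
```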
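Once a gradient-enabled copy of the model exists, any standard white-box attack applies to it directly, which is what makes the reversed model far more dangerous than a black-box surrogate. Below is a minimal projected gradient descent (PGD) sketch against the converted module; the perturbation budget and step size are illustrative values, not settings reported in the paper.

```python
# Minimal PGD (L-infinity) sketch against the converted, gradient-enabled
# model. Hyperparameters here are illustrative, not the paper's settings.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Projected gradient descent inside an L-infinity ball of radius eps."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()    # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)        # project to the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                   # keep a valid pixel range
    return x_adv.detach()

# Usage with hypothetical tensors:
# adv_images = pgd_attack(torch_model, images, labels)
```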
Related papers
- A Realistic Threat Model for Large Language Model Jailbreaks [87.64278063236847]
In this work, we propose a unified threat model for the principled comparison of jailbreak attacks.
Our threat model combines constraints on perplexity, measuring how far a jailbreak deviates from natural text.
We adapt popular attacks to this new, realistic threat model, with which we, for the first time, benchmark these attacks on equal footing.
arXiv Detail & Related papers (2024-10-21T17:27:01Z)
- DynaMO: Protecting Mobile DL Models through Coupling Obfuscated DL Operators [29.82616462226066]
Attackers can easily reverse-engineer mobile DL models in Apps to steal intellectual property or generate effective attacks.
Model Obfuscation has been proposed to defend against such reverse engineering.
We propose DynaMO, a Dynamic Model Obfuscation strategy similar to Homomorphic Encryption.
arXiv Detail & Related papers (2024-10-19T08:30:08Z)
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- ModelObfuscator: Obfuscating Model Information to Protect Deployed ML-based Systems [31.988501084337678]
We develop a prototype tool ModelObfuscator to automatically obfuscate on-device TFLite models.
Our experiments show that this proposed approach can dramatically improve model security.
arXiv Detail & Related papers (2023-06-01T05:24:00Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Scoring Black-Box Models for Adversarial Robustness [4.416484585765028]
The robustness of models to adversarial attacks has been analyzed.
We propose a simple scoring method for black-box models which indicates their robustness to adversarial input.
arXiv Detail & Related papers (2022-10-31T08:41:44Z)
- Smart App Attack: Hacking Deep Learning Models in Android Apps [16.663345577900813]
We introduce a grey-box adversarial attack framework to hack on-device models.
We evaluate the attack effectiveness and generality in terms of four different settings.
Among 53 apps adopting transfer learning, we find that 71.7% of them can be successfully attacked.
arXiv Detail & Related papers (2022-04-23T14:01:59Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models [9.93052896330371]
We develop an evolution-based algorithm, SparseEvo, for the problem and evaluate it against both convolutional deep neural networks and vision transformers.
SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks.
Importantly, the query efficient SparseEvo, along with decision-based attacks, in general raise new questions regarding the safety of deployed systems.
arXiv Detail & Related papers (2022-01-31T21:10:47Z)
- Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method.
Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task.
By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved.
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)