Mind Your Weight(s): A Large-scale Study on Insufficient Machine
Learning Model Protection in Mobile Apps
- URL: http://arxiv.org/abs/2002.07687v2
- Date: Mon, 14 Jun 2021 22:35:34 GMT
- Title: Mind Your Weight(s): A Large-scale Study on Insufficient Machine
Learning Model Protection in Mobile Apps
- Authors: Zhichuang Sun, Ruimin Sun, Long Lu, Alan Mislove
- Abstract summary: This paper presents the first empirical study of machine learning model protection on mobile devices.
We analyzed 46,753 popular apps collected from the US and Chinese app markets.
We found that, alarmingly, 41% of ML apps do not protect their models at all, which can be trivially stolen from app packages.
- Score: 17.421303987300902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: On-device machine learning (ML) is quickly gaining popularity among mobile
apps. It allows offline model inference while preserving user privacy. However,
ML models, considered core intellectual property of model owners, are now
stored on billions of untrusted devices and subject to potential theft. Leaked
models can cause both severe financial loss and security consequences. This
paper presents the first empirical study of ML model protection on mobile
devices. Our study aims to answer three open questions with quantitative
evidence: How widely is model protection used in apps? How robust are existing
model protection techniques? What impacts can (stolen) models incur? To that
end, we built a simple app analysis pipeline and analyzed 46,753 popular apps
collected from the US and Chinese app markets. We identified 1,468 ML apps
spanning all popular app categories. We found that, alarmingly, 41% of ML apps
do not protect their models at all, which can be trivially stolen from app
packages. Even for those apps that use model protection or encryption, we were
able to extract the models from 66% of them via unsophisticated dynamic
analysis techniques. The extracted models are mostly commercial products and
used for face recognition, liveness detection, ID/bank card recognition, and
malware detection. We quantitatively estimated the potential financial and
security impact of a leaked model, which can amount to millions of dollars for
different stakeholders. Our study reveals that on-device models are currently
at high risk of being leaked; attackers are highly motivated to steal such
models. Drawing on our large-scale study, we report our insights into this
emerging security problem and discuss the technical challenges, hoping to
inspire future research on robust and practical model protection for mobile
devices.
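To make the first research question concrete, below is a minimal sketch of the kind of static check such an app analysis pipeline could run: unpack an APK, look for files with well-known on-device model extensions, and use byte entropy as a rough signal for whether a model ships in plaintext or encrypted form. The extension list, entropy threshold, and function names are illustrative assumptions of this sketch, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's pipeline): scan an APK for likely on-device
# model files and guess whether each is stored in plaintext or encrypted,
# using file extension and byte entropy as rough signals.
import math
import sys
import zipfile
from collections import Counter

# Illustrative extension list; a real pipeline would match many more frameworks.
MODEL_EXTENSIONS = (".tflite", ".pb", ".caffemodel", ".onnx", ".mnn", ".param", ".pt")

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (close to 8.0 suggests encrypted/compressed)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def scan_apk(apk_path: str, entropy_threshold: float = 7.9):
    """Yield (name, size, entropy, looks_encrypted) for candidate model files.

    The threshold is an assumed cutoff; plaintext float weights can also score high.
    """
    with zipfile.ZipFile(apk_path) as apk:  # APKs are ZIP archives
        for name in apk.namelist():
            if name.lower().endswith(MODEL_EXTENSIONS):
                data = apk.read(name)
                ent = byte_entropy(data)
                yield name, len(data), ent, ent >= entropy_threshold

if __name__ == "__main__":
    for name, size, ent, enc in scan_apk(sys.argv[1]):
        label = "likely encrypted" if enc else "plaintext (unprotected)"
        print(f"{name}\t{size} bytes\tentropy={ent:.2f}\t{label}")
```

Plaintext float weights can themselves score fairly high on entropy, so a realistic pipeline would combine this heuristic with format parsing (for example, checking for a valid framework file header) and, for protected apps, the kind of lightweight dynamic analysis the paper describes for recovering models at run time.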
Related papers
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack [24.569156952823068]
We propose a method to generate or transform mobile malware by hiding the malicious payloads inside the parameters of deep learning models.
We can run malware in DL mobile applications covertly with little impact on the model performance (an illustrative sketch of this weight-embedding idea appears after this list).
arXiv Detail & Related papers (2024-01-05T06:35:24Z)
- SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z)
- Beyond Labeling Oracles: What does it mean to steal ML models? [52.63413852460003]
Model extraction attacks are designed to steal trained models with only query access.
We investigate factors influencing the success of model extraction attacks.
Our findings urge the community to redefine the adversarial goals of model extraction attacks.
arXiv Detail & Related papers (2023-10-03T11:10:21Z)
- Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps [3.2307366446033945]
We introduce a data processing-based attack against real-world deep learning (DL) apps.
Our attack could influence the performance and latency of the model without affecting the operation of a DL app.
Among 320 apps utilizing MLkit, we find that 81.56% of them can be successfully attacked.
arXiv Detail & Related papers (2023-05-06T07:35:39Z)
- Publishing Efficient On-device Models Increases Adversarial Vulnerability [58.6975494957865]
In this paper, we study the security considerations of publishing on-device variants of large-scale models.
We first show that an adversary can exploit on-device models to make attacking the large models easier.
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient variant increases.
arXiv Detail & Related papers (2022-12-28T05:05:58Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Machine Learning Security against Data Poisoning: Are We There Yet? [23.809841593870757]
This article reviews data poisoning attacks that compromise the training data used to learn machine learning models.
We discuss how to mitigate these attacks using basic security principles, or by deploying ML-oriented defensive mechanisms.
arXiv Detail & Related papers (2022-04-12T17:52:09Z)
- Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z)
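The MalModel entry above hides a payload inside model parameters. As a generic illustration of that idea (not the MalModel method itself), the sketch below writes payload bytes into the least significant mantissa byte of float32 weights, which perturbs each value only marginally; the function names and the little-endian layout are assumptions of this example.

```python
# Illustrative sketch: LSB-style embedding of a payload in float32 weights.
import numpy as np

def embed_payload(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide `payload` in the least significant byte of each float32 weight.

    Assumes little-endian layout, so byte 0 is the low mantissa byte.
    """
    flat = weights.astype(np.float32).ravel().copy()
    if len(payload) > flat.size:
        raise ValueError("payload too large for this tensor")
    raw = flat.view(np.uint8).reshape(-1, 4)  # 4 bytes per float32
    raw[: len(payload), 0] = np.frombuffer(payload, dtype=np.uint8)
    return flat.reshape(weights.shape)

def extract_payload(weights: np.ndarray, length: int) -> bytes:
    """Recover `length` payload bytes from the low byte of each weight."""
    raw = weights.astype(np.float32).ravel().view(np.uint8).reshape(-1, 4)
    return raw[:length, 0].tobytes()

if __name__ == "__main__":
    w = np.random.randn(1024).astype(np.float32)
    stego = embed_payload(w, b"hello")
    assert extract_payload(stego, 5) == b"hello"
    print("max weight perturbation:", np.abs(stego - w).max())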