Towards Scalable and Robust Model Versioning
- URL: http://arxiv.org/abs/2401.09574v2
- Date: Mon, 11 Mar 2024 00:50:45 GMT
- Title: Towards Scalable and Robust Model Versioning
- Authors: Wenxin Ding, Arjun Nitin Bhagoji, Ben Y. Zhao, Haitao Zheng
- Abstract summary: Malicious incursions aimed at gaining access to deep learning models are on the rise.
We show how to generate multiple versions of a model that possess different attack properties.
We show theoretically that this can be accomplished by incorporating parameterized hidden distributions into the model training data.
- Score: 30.249607205048125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the deployment of deep learning models continues to expand across
industries, the threat of malicious incursions aimed at gaining access to these
deployed models is on the rise. Should an attacker gain access to a deployed
model, whether through server breaches, insider attacks, or model inversion
techniques, they can then construct white-box adversarial attacks to manipulate
the model's classification outcomes, thereby posing significant risks to
organizations that rely on these models for critical tasks. Model owners need
mechanisms to protect themselves against such losses without the necessity of
acquiring fresh training data - a process that typically demands substantial
investments in time and capital.
In this paper, we explore the feasibility of generating multiple versions of
a model that possess different attack properties, without acquiring new
training data or changing model architecture. The model owner can deploy one
version at a time and replace a leaked version immediately with a new version.
The newly deployed model version can resist adversarial attacks generated
leveraging white-box access to one or all previously leaked versions. We show
theoretically that this can be accomplished by incorporating parameterized
hidden distributions into the model training data, forcing the model to learn
task-irrelevant features uniquely defined by the chosen data. Additionally,
optimal choices of hidden distributions can produce a sequence of model
versions capable of resisting compound transferability attacks over time.
Leveraging our analytical insights, we design and implement a practical model
versioning method for DNN classifiers, which leads to significant robustness
improvements over existing methods. We believe our work presents a promising
direction for safeguarding DNN services beyond their initial deployment.
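The abstract describes realizing version differences by folding a parameterized hidden distribution into the training data, so that each version learns task-irrelevant features unique to that version. Below is a minimal sketch of what such version-specific training could look like, assuming the hidden distribution is realized as a version-seeded additive pattern blended into a fraction of each training batch; the class and function names, hyperparameters, and the Gaussian parameterization are illustrative assumptions, not the authors' construction.

```python
# Minimal sketch (not the authors' implementation): training model versions
# whose data is augmented with a version-specific "hidden distribution".
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader


class HiddenDistribution:
    """Parameterized, version-seeded distribution over input-shaped patterns."""

    def __init__(self, version: int, shape, scale: float = 0.1):
        g = torch.Generator().manual_seed(version)
        # The distribution's parameters are derived from the version seed.
        self.mean = torch.randn(shape, generator=g) * scale
        self.std = scale

    def sample(self, batch_size: int) -> torch.Tensor:
        noise = torch.randn((batch_size, *self.mean.shape)) * self.std
        return self.mean.unsqueeze(0) + noise


def train_version(model: nn.Module, loader: DataLoader, version: int,
                  blend: float = 0.2, epochs: int = 10, device: str = "cpu"):
    """Train one model version. A fraction of each batch is blended with
    samples from that version's hidden distribution (labels unchanged),
    nudging the model toward version-specific, task-irrelevant features."""
    x0, _ = next(iter(loader))
    hidden = HiddenDistribution(version, shape=x0.shape[1:])
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            pattern = hidden.sample(x.size(0)).to(device)
            # Blend the pattern into a random subset of the batch (inputs in [0, 1]).
            mask = (torch.rand(x.size(0), device=device) < blend).float()
            mask = mask.view(-1, *([1] * (x.dim() - 1)))
            x_aug = torch.clamp(x + mask * pattern, 0.0, 1.0)
            loss = F.cross_entropy(model(x_aug), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


# Hypothetical usage: when version k leaks, deploy version k + 1 trained with a
# fresh seed -- the task data and architecture never change.
# model_next = train_version(make_resnet(), train_loader, version=k + 1)
```

Under this reading, replacing a leaked version amounts to re-running train_version with a fresh seed; the task data and model architecture stay fixed, matching the constraint stated in the abstract.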
Related papers
- Mitigating Downstream Model Risks via Model Provenance [30.083382916838623]
We propose a machine-readable model specification format to simplify the creation of model records.
Our solution explicitly traces relationships between upstream and downstream models, enhancing transparency and traceability.
This proof of concept aims to set a new standard for managing foundation models, bridging the gap between innovation and responsible model management.
arXiv Detail & Related papers (2024-10-03T05:52:15Z)
- Multi-Model based Federated Learning Against Model Poisoning Attack: A Deep Learning Based Model Selection for MEC Systems [11.564289367348334]
Federated Learning (FL) enables training of a global model from distributed data, while preserving data privacy.
This paper proposes multi-model based FL as a proactive mechanism to improve the chances of mitigating model poisoning attacks.
For a DDoS attack detection scenario, results show that accuracy under a poisoning attack remains competitive with the attack-free scenario, along with a potential improvement in recognition time.
arXiv Detail & Related papers (2024-09-12T17:36:26Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Careful What You Wish For: on the Extraction of Adversarially Trained Models [2.707154152696381]
Recent attacks on Machine Learning (ML) models pose several security and privacy threats.
We propose a framework to assess extraction attacks on adversarially trained models.
We show that adversarially trained models are more vulnerable to extraction attacks than models obtained under natural training circumstances.
arXiv Detail & Related papers (2022-07-21T16:04:37Z)
- DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method that encourages the substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structural information learning constraint to improve the quality of the generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z)
- Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack [13.28881502612207]
In some scenarios, AI models are trained proprietarily, where neither pre-trained models nor sufficient in-distribution data is publicly available.
We find that the effectiveness of existing techniques is significantly affected by the absence of pre-trained models.
We formulate model extraction attacks into an adaptive framework that captures these factors with deep reinforcement learning.
arXiv Detail & Related papers (2021-04-13T03:46:59Z)
- DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
arXiv Detail & Related papers (2020-03-28T04:28:13Z)
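Several of the related papers above (DaST, DST, and the model extraction work) revolve around building substitute models from black-box query access, which is exactly the kind of leakage path that model versioning aims to survive. The sketch below illustrates the general data-free substitute-training recipe described in the DaST entry under simplified assumptions: a toy fully connected generator, a label-only query interface named target_query, and a plain adversarial objective. It is not the DaST architecture or loss.

```python
# Minimal sketch of data-free substitute training in the spirit of DaST:
# a generator synthesizes queries, the black-box target labels them, and the
# substitute learns to imitate those labels. Architectures, losses, and
# hyperparameters here are simplified assumptions, not the DaST implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyGenerator(nn.Module):
    """Maps latent noise to synthetic query inputs."""

    def __init__(self, z_dim: int = 100, img_shape=(1, 28, 28)):
        super().__init__()
        self.img_shape = img_shape
        out_dim = img_shape[0] * img_shape[1] * img_shape[2]
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(z.size(0), *self.img_shape)


def train_substitute(target_query, substitute, generator,
                     steps=1000, batch_size=64, z_dim=100, device="cpu"):
    """target_query(x) returns hard labels from the black-box target model."""
    substitute.to(device).train()
    generator.to(device).train()
    opt_s = torch.optim.Adam(substitute.parameters(), lr=1e-3)
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
    for _ in range(steps):
        z = torch.randn(batch_size, z_dim, device=device)

        # 1) Substitute step: imitate the target on generated queries.
        x = generator(z).detach()
        with torch.no_grad():
            y_target = target_query(x)
        loss_s = F.cross_entropy(substitute(x), y_target)
        opt_s.zero_grad()
        loss_s.backward()
        opt_s.step()

        # 2) Generator step: push toward inputs where the substitute still
        #    disagrees with the target (adversarial data generation).
        x = generator(z)
        with torch.no_grad():
            y_target = target_query(x)
        loss_g = -F.cross_entropy(substitute(x), y_target)
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()
    return substitute


# Hypothetical usage, with target and SmallCNN as placeholder models:
# sub = train_substitute(lambda x: target(x).argmax(dim=1), SmallCNN(), ToyGenerator())
```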