NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
- URL: http://arxiv.org/abs/2305.00097v3
- Date: Wed, 16 Aug 2023 21:25:10 GMT
- Title: NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
- Authors: Tong Zhou, Yukui Luo, Shaolei Ren, Xiaolin Xu
- Abstract summary: Deep neural network (DNN) models have been protected by techniques like watermarking.
In this work, we propose an active model IP protection scheme, namely NNSplitter.
NNSplitter actively protects the model by splitting it into two parts: the obfuscated model that performs poorly due to weight obfuscation, and the model secrets.
- Score: 26.12728348169104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a type of valuable intellectual property (IP), deep neural network (DNN)
models have been protected by techniques like watermarking. However, such
passive model protection cannot fully prevent model abuse. In this work, we
propose an active model IP protection scheme, namely NNSplitter, which actively
protects the model by splitting it into two parts: the obfuscated model that
performs poorly due to weight obfuscation, and the model secrets consisting of
the indexes and original values of the obfuscated weights, which can only be
accessed by authorized users with the support of the trusted execution
environment. Experimental results demonstrate the effectiveness of NNSplitter,
e.g., by only modifying 275 out of over 11 million (i.e., 0.002%) weights, the
accuracy of the obfuscated ResNet-18 model on CIFAR-10 can drop to 10%.
Moreover, NNSplitter is stealthy and resilient against norm clipping and
fine-tuning attacks, making it an appealing solution for DNN model protection.
The code is available at: https://github.com/Tongzhou0101/NNSplitter.
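For intuition, the split described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under simplifying assumptions, not the released NNSplitter code: the weights are picked at random here (the actual scheme selects them automatically so that a tiny fraction of changed weights collapses accuracy), the helper names are hypothetical, and the TEE-backed restoration path is reduced to an ordinary function.

```python
# Minimal sketch of weight obfuscation and secret-based restoration
# (illustration only; see the repository above for the actual NNSplitter code).
import torch
import torchvision

def obfuscate(model, num_weights=275, offset=5.0, seed=0):
    """Perturb a small set of weights and return the model secrets:
    (layer index, flat weight index, original value) triples."""
    gen = torch.Generator().manual_seed(seed)
    params = [p for p in model.parameters() if p.dim() > 1]  # conv/linear weights
    secrets = []
    for _ in range(num_weights):
        layer = int(torch.randint(len(params), (1,), generator=gen))
        flat = params[layer].data.view(-1)           # shares storage with the weight
        idx = int(torch.randint(flat.numel(), (1,), generator=gen))
        secrets.append((layer, idx, flat[idx].item()))
        flat[idx] += offset                          # obfuscating modification
    return secrets

def restore(model, secrets):
    """Authorized path (performed inside the TEE in the real design)."""
    params = [p for p in model.parameters() if p.dim() > 1]
    for layer, idx, original in secrets:
        params[layer].data.view(-1)[idx] = original

model = torchvision.models.resnet18(num_classes=10)  # stand-in for the CIFAR-10 model
secrets = obfuscate(model)   # distribute the model in this obfuscated form
restore(model, secrets)      # only authorized users can apply the secrets
```

The key point is that only a few hundred (index, original value) pairs separate an unusable obfuscated model from a fully functional one, so the secrets are small enough to live inside the trusted execution environment.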
Related papers
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions, which harms benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Are aligned neural networks adversarially aligned? [93.91072860401856]
Adversarial users can construct inputs that circumvent attempts at alignment.
We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models, although multimodal models can already be forced into unaligned behavior via adversarial images.
We conjecture that improved NLP attacks may demonstrate the same level of adversarial control over text-only models.
arXiv Detail & Related papers (2023-06-26T17:18:44Z)
- Publishing Efficient On-device Models Increases Adversarial Vulnerability [58.6975494957865]
In this paper, we study the security considerations of publishing on-device variants of large-scale models.
We first show that an adversary can exploit on-device models to make attacking the large models easier.
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient variant increases.
arXiv Detail & Related papers (2022-12-28T05:05:58Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- PCPT and ACPT: Copyright Protection and Traceability Scheme for DNN Models [13.043683635373213]
Deep neural networks (DNNs) have achieved tremendous success in artificial intelligence (AI) fields.
However, DNN models can easily be illegally copied, redistributed, or abused by criminals.
arXiv Detail & Related papers (2022-06-06T12:12:47Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
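The zeroth-order (ZO) ingredient mentioned in this entry can be illustrated with the standard two-point randomized gradient estimator, which needs nothing but input queries and output feedback. This is a generic sketch of ZO estimation, not the paper's ZO-AE-DS pipeline, and the function names are made up for the example.

```python
# Generic two-point zeroth-order gradient estimate: approximate the gradient
# of a black-box objective using only function evaluations.
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=200, seed=0):
    """Average (f(x + mu*u) - f(x)) / mu * u over random Gaussian directions u."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    fx = f(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / num_dirs

# Sanity check on a known smooth function: f(x) = ||x||^2, true gradient 2*x.
x = np.ones(5)
print(zo_gradient(lambda v: float(np.sum(v ** 2)), x))  # roughly [2, 2, 2, 2, 2]
```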
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection [26.593268413299228]
Federated Learning (FL) allows multiple clients to collaboratively train a Neural Network (NN) model on their private data without revealing the data.
DeepSight is a novel model filtering approach for mitigating backdoor attacks.
We show that it can mitigate state-of-the-art backdoor attacks with a negligible impact on the model's performance on benign data.
arXiv Detail & Related papers (2022-01-03T17:10:07Z)
- DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories [26.067920958354]
One of the major threats to the privacy of Deep Neural Networks (DNNs) is model extraction attacks.
Recent studies show that hardware-based side-channel attacks can reveal internal knowledge about DNN models (e.g., model architectures).
We propose an advanced model extraction attack framework, DeepSteal, that effectively steals DNN weights with the aid of a memory side-channel attack.
arXiv Detail & Related papers (2021-11-08T16:55:45Z)
- AdvParams: An Active DNN Intellectual Property Protection Technique via Adversarial Perturbation Based Parameter Encryption [10.223780756303196]
We propose an effective framework to actively protect the DNN IP from infringement.
Specifically, we encrypt the DNN model's parameters by perturbing them with well-crafted adversarial perturbations.
After the encryption, the positions of encrypted parameters and the values of the added adversarial perturbations form a secret key.
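Conceptually this mirrors NNSplitter's model secrets, except that the key stores the added perturbations rather than the original values. A tiny illustrative sketch follows (hypothetical helper names; plain additive noise stands in for the paper's adversarially optimized perturbations).

```python
# Illustrative parameter encryption/decryption with a (position, perturbation) key.
import numpy as np

def encrypt_params(weights, positions, perturbations):
    """Add crafted perturbations at selected positions; the pairs form the key."""
    w = weights.copy()
    for pos, delta in zip(positions, perturbations):
        w[pos] += delta
    return w, list(zip(positions, perturbations))

def decrypt_params(weights, key):
    """Authorized users subtract the stored perturbations to recover the model."""
    w = weights.copy()
    for pos, delta in key:
        w[pos] -= delta
    return w

weights = np.random.default_rng(0).standard_normal(1_000)     # flattened model weights
encrypted, key = encrypt_params(weights, [3, 42, 97], [2.0, -1.5, 3.1])
assert np.allclose(decrypt_params(encrypted, key), weights)   # exact recovery
```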
arXiv Detail & Related papers (2021-05-28T09:42:35Z)
- HufuNet: Embedding the Left Piece as Watermark and Keeping the Right Piece for Ownership Verification in Deep Neural Networks [16.388046449021466]
We propose a novel solution for watermarking deep neural networks (DNNs).
HufuNet is highly robust against model fine-tuning/pruning, kernels cutoff/supplement, functionality-equivalent attack, and fraudulent ownership claims.
arXiv Detail & Related papers (2021-03-25T06:55:22Z)
- Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises [87.53808756910452]
A cooling-shrinking attack method is proposed to deceive state-of-the-art SiameseRPN-based trackers.
Our method has good transferability and is able to deceive other top-performance trackers such as DaSiamRPN, DaSiamRPN-UpdateNet, and DiMP.
arXiv Detail & Related papers (2020-03-21T07:13:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences.