NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
- URL: http://arxiv.org/abs/2305.00097v3
- Date: Wed, 16 Aug 2023 21:25:10 GMT
- Title: NNSplitter: An Active Defense Solution for DNN Model via Automated Weight Obfuscation
- Authors: Tong Zhou, Yukui Luo, Shaolei Ren, Xiaolin Xu
- Abstract summary: Deep neural network (DNN) models have been protected by techniques like watermarking.
In this work, we propose an active model IP protection scheme, namely NNSplitter.
NNSplitter actively protects the model by splitting it into two parts: the obfuscated model that performs poorly due to weight obfuscation, and the model secrets.
- Score: 26.12728348169104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a type of valuable intellectual property (IP), deep neural network (DNN)
models have been protected by techniques like watermarking. However, such
passive model protection cannot fully prevent model abuse. In this work, we
propose an active model IP protection scheme, namely NNSplitter, which actively
protects the model by splitting it into two parts: the obfuscated model that
performs poorly due to weight obfuscation, and the model secrets consisting of
the indexes and original values of the obfuscated weights, which can only be
accessed by authorized users with the support of the trusted execution
environment. Experimental results demonstrate the effectiveness of NNSplitter,
e.g., by only modifying 275 out of over 11 million (i.e., 0.002%) weights, the
accuracy of the obfuscated ResNet-18 model on CIFAR-10 can drop to 10%.
Moreover, NNSplitter is stealthy and resilient against norm clipping and
fine-tuning attacks, making it an appealing solution for DNN model protection.
The code is available at: https://github.com/Tongzhou0101/NNSplitter.
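For intuition, the split described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under simplifying assumptions, not the released NNSplitter code: the weights are picked at random here (the actual scheme selects them automatically so that a tiny fraction of changed weights collapses accuracy), the helper names are hypothetical, and the TEE-backed restoration path is reduced to an ordinary function.

```python
# Minimal sketch of weight obfuscation and secret-based restoration
# (illustration only; see the repository above for the actual NNSplitter code).
import torch
import torchvision

def obfuscate(model, num_weights=275, offset=5.0, seed=0):
    """Perturb a small set of weights and return the model secrets:
    (layer index, flat weight index, original value) triples."""
    gen = torch.Generator().manual_seed(seed)
    params = [p for p in model.parameters() if p.dim() > 1]  # conv/linear weights
    secrets = []
    for _ in range(num_weights):
        layer = int(torch.randint(len(params), (1,), generator=gen))
        flat = params[layer].data.view(-1)           # shares storage with the weight
        idx = int(torch.randint(flat.numel(), (1,), generator=gen))
        secrets.append((layer, idx, flat[idx].item()))
        flat[idx] += offset                          # obfuscating modification
    return secrets

def restore(model, secrets):
    """Authorized path (performed inside the TEE in the real design)."""
    params = [p for p in model.parameters() if p.dim() > 1]
    for layer, idx, original in secrets:
        params[layer].data.view(-1)[idx] = original

model = torchvision.models.resnet18(num_classes=10)  # stand-in for the CIFAR-10 model
secrets = obfuscate(model)   # distribute the model in this obfuscated form
restore(model, secrets)      # only authorized users can apply the secrets
```

The key point is that only a few hundred (index, original value) pairs separate an unusable obfuscated model from a fully functional one, so the secrets are small enough to live inside the trusted execution environment.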
Related papers
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions, which harms benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Are aligned neural networks adversarially aligned? [93.91072860401856]
Adversarial users can construct inputs that circumvent attempts at alignment.
We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models, although multimodal models can already be forced into unaligned behavior via adversarial images.
We conjecture that improved NLP attacks may demonstrate the same level of adversarial control over text-only models.
arXiv Detail & Related papers (2023-06-26T17:18:44Z)
- Publishing Efficient On-device Models Increases Adversarial Vulnerability [58.6975494957865]
In this paper, we study the security considerations of publishing on-device variants of large-scale models.
We first show that an adversary can exploit on-device models to make attacking the large models easier.
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient variant increases.
arXiv Detail & Related papers (2022-12-28T05:05:58Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- PCPT and ACPT: Copyright Protection and Traceability Scheme for DNN Models [13.043683635373213]
Deep neural networks (DNNs) have achieved tremendous success in artificial intelligence (AI) fields.
However, DNN models can easily be illegally copied, redistributed, or abused by criminals.
arXiv Detail & Related papers (2022-06-06T12:12:47Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
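The zeroth-order (ZO) ingredient mentioned in this entry can be illustrated with the standard two-point randomized gradient estimator, which needs nothing but input queries and output feedback. This is a generic sketch of ZO estimation, not the paper's ZO-AE-DS pipeline, and the function names are made up for the example.

```python
# Generic two-point zeroth-order gradient estimate: approximate the gradient
# of a black-box objective using only function evaluations.
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=200, seed=0):
    """Average (f(x + mu*u) - f(x)) / mu * u over random Gaussian directions u."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    fx = f(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / num_dirs

# Sanity check on a known smooth function: f(x) = ||x||^2, true gradient 2*x.
x = np.ones(5)
print(zo_gradient(lambda v: float(np.sum(v ** 2)), x))  # roughly [2, 2, 2, 2, 2]
```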
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection [26.593268413299228]
Federated Learning (FL) allows multiple clients to collaboratively train a Neural Network (NN) model on their private data without revealing the data.
DeepSight is a novel model filtering approach for mitigating backdoor attacks.
We show that it can mitigate state-of-the-art backdoor attacks with a negligible impact on the model's performance on benign data.
arXiv Detail & Related papers (2022-01-03T17:10:07Z)
- DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories [26.067920958354]
One of the major threats to the privacy of Deep Neural Networks (DNNs) is model extraction attacks.
Recent studies show that hardware-based side-channel attacks can reveal internal knowledge about DNN models (e.g., model architectures).
We propose an advanced model extraction attack framework, DeepSteal, that effectively steals DNN weights with the aid of a memory side-channel attack.
arXiv Detail & Related papers (2021-11-08T16:55:45Z)
- AdvParams: An Active DNN Intellectual Property Protection Technique via Adversarial Perturbation Based Parameter Encryption [10.223780756303196]
We propose an effective framework to actively protect the DNN IP from infringement.
Specifically, we encrypt the DNN model's parameters by perturbing them with well-crafted adversarial perturbations.
After the encryption, the positions of encrypted parameters and the values of the added adversarial perturbations form a secret key.
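Conceptually this mirrors NNSplitter's model secrets, except that the key stores the added perturbations rather than the original values. A tiny illustrative sketch follows (hypothetical helper names; plain additive noise stands in for the paper's adversarially optimized perturbations).

```python
# Illustrative parameter encryption/decryption with a (position, perturbation) key.
import numpy as np

def encrypt_params(weights, positions, perturbations):
    """Add crafted perturbations at selected positions; the pairs form the key."""
    w = weights.copy()
    for pos, delta in zip(positions, perturbations):
        w[pos] += delta
    return w, list(zip(positions, perturbations))

def decrypt_params(weights, key):
    """Authorized users subtract the stored perturbations to recover the model."""
    w = weights.copy()
    for pos, delta in key:
        w[pos] -= delta
    return w

weights = np.random.default_rng(0).standard_normal(1_000)     # flattened model weights
encrypted, key = encrypt_params(weights, [3, 42, 97], [2.0, -1.5, 3.1])
assert np.allclose(decrypt_params(encrypted, key), weights)   # exact recovery
```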
arXiv Detail & Related papers (2021-05-28T09:42:35Z)
- HufuNet: Embedding the Left Piece as Watermark and Keeping the Right Piece for Ownership Verification in Deep Neural Networks [16.388046449021466]
We propose a novel solution for watermarking deep neural networks (DNNs).
HufuNet is highly robust against model fine-tuning/pruning, kernels cutoff/supplement, functionality-equivalent attack, and fraudulent ownership claims.
arXiv Detail & Related papers (2021-03-25T06:55:22Z)
- Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises [87.53808756910452]
A cooling-shrinking attack method is proposed to deceive state-of-the-art SiameseRPN-based trackers.
Our method has good transferability and is able to deceive other top-performance trackers such as DaSiamRPN, DaSiamRPN-UpdateNet, and DiMP.
arXiv Detail & Related papers (2020-03-21T07:13:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences.