Verifying Integrity of Deep Ensemble Models by Lossless Black-box
Watermarking with Sensitive Samples
- URL: http://arxiv.org/abs/2205.04145v2
- Date: Tue, 10 May 2022 02:03:56 GMT
- Title: Verifying Integrity of Deep Ensemble Models by Lossless Black-box
Watermarking with Sensitive Samples
- Authors: Lina Lin and Hanzhou Wu
- Abstract summary: We propose a novel black-box watermarking method for deep ensemble models (DEMs).
In the proposed method, a certain number of sensitive samples are carefully selected through mimicking real-world DEM attacks.
By analyzing the prediction results of the target DEM on these carefully crafted sensitive samples, we are able to verify the integrity of the target DEM.
- Score: 17.881686153284267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the widespread use of deep neural networks (DNNs) in many areas, more
and more studies focus on protecting DNN models from intellectual property (IP)
infringement. Many existing methods apply digital watermarking to protect the
DNN models. The majority of them either embed a watermark directly into the
internal network structure/parameters or insert a zero-bit watermark by
fine-tuning a model to be protected with a set of so-called trigger samples.
Though these methods work well, they were designed for individual DNN models and
cannot be directly applied to deep ensemble models (DEMs), which combine multiple
DNN models to make the final decision. This motivates us to propose in this paper
a novel black-box watermarking method for DEMs, which can be used to verify the
integrity of DEMs. In the proposed method, a certain number of sensitive samples
are carefully selected by mimicking real-world DEM attacks and analyzing the
prediction results of the sub-models of the non-attacked DEM and the attacked DEM
on a carefully crafted dataset. By analyzing the prediction results of the target
DEM on these sensitive samples, we are able to verify the integrity of the target
DEM.
Different from many previous methods, the proposed method does not modify the
original DEM to be protected, meaning that the proposed method is lossless.
Experimental results show that DEM integrity can be reliably verified even if
only one sub-model has been attacked, which indicates good potential in practice.
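To make the verification protocol concrete, here is a minimal PyTorch-style sketch of the idea, written under stated assumptions rather than as the authors' exact construction: the DEM is assumed to be a list of sub-models that vote by averaging softmax outputs, a real-world attack is mimicked by lightly pruning a copy of one sub-model, candidate inputs whose ensemble prediction flips between the intact and the tampered DEM are kept as sensitive samples, and integrity is later verified by checking whether the target DEM still reproduces the recorded reference labels. All function names (dem_predict, mimic_attack, select_sensitive_samples, verify_integrity) are hypothetical.

```python
import copy
import torch
import torch.nn.functional as F

def dem_predict(sub_models, x):
    """Deep ensemble prediction: average the softmax outputs of all sub-models."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=1) for m in sub_models])
    return probs.mean(dim=0).argmax(dim=1)  # Top-1 labels of the ensemble

def mimic_attack(sub_model, prune_ratio=0.05):
    """Mimic real-world tampering by zeroing a small fraction of each weight tensor."""
    attacked = copy.deepcopy(sub_model)
    for p in attacked.parameters():
        p.data[torch.rand_like(p) < prune_ratio] = 0.0
    return attacked

def select_sensitive_samples(sub_models, candidates, num_samples=100):
    """Keep candidates whose ensemble prediction flips when one sub-model is tampered with."""
    attacked_dem = [mimic_attack(sub_models[0])] + list(sub_models[1:])
    clean_labels = dem_predict(sub_models, candidates)
    attacked_labels = dem_predict(attacked_dem, candidates)
    idx = (clean_labels != attacked_labels).nonzero().flatten()[:num_samples]
    return candidates[idx], clean_labels[idx]  # sensitive samples + reference labels

def verify_integrity(target_dem, sensitive, reference_labels):
    """Black-box check: the DEM is deemed intact only if all reference labels are reproduced."""
    return bool((dem_predict(target_dem, sensitive) == reference_labels).all())
```

Because the original sub-models are never modified, only the sensitive samples and their reference labels need to be recorded at protection time, which is what makes the scheme lossless.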
Related papers
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- Fragile Model Watermark for integrity protection: leveraging boundary volatility and sensitive sample-pairing [34.86809796164664]
Fragile model watermarks aim to prevent unexpected tampering that could lead models to make incorrect decisions.
Our approach employs a sample-pairing technique that places the model's decision boundary between pairs of samples while simultaneously maximizing the logits.
This ensures that the model's predictions on the sensitive samples change as much as possible and that the Top-1 labels flip easily regardless of the direction in which the boundary moves.
arXiv Detail & Related papers (2024-04-11T09:01:52Z)
- Evaluating Similitude and Robustness of Deep Image Denoising Models via Adversarial Attack [60.40356882897116]
Deep neural networks (DNNs) have shown superior performance compared to traditional image denoising algorithms.
In this paper, we propose an adversarial attack method named denoising-PGD which can successfully attack all the current deep denoising models.
arXiv Detail & Related papers (2023-06-28T09:30:59Z)
- Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking [57.96787187733302]
Reversible data hiding (RDH) methods offer a potential solution, but existing approaches suffer from weaknesses in terms of usability, capacity, and fidelity.
We propose a novel RDH-based static DNN watermarking scheme using quantization index modulation (QIM).
Our scheme incorporates a novel approach based on a one-dimensional quantizer for watermark embedding; a generic QIM embedding/extraction sketch is given after this list.
arXiv Detail & Related papers (2023-05-29T04:39:17Z)
- Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- InFIP: An Explainable DNN Intellectual Property Protection Method based on Intrinsic Features [12.037142903022891]
We propose an interpretable intellectual property protection method for Deep Neural Networks (DNNs) based on explainable artificial intelligence.
The proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable.
Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model.
arXiv Detail & Related papers (2022-10-14T03:12:36Z)
- A Mask-Based Adversarial Defense Scheme [3.759725391906588]
Adversarial attacks hamper the functionality and accuracy of Deep Neural Networks (DNNs).
We propose a new Mask-based Adversarial Defense scheme (MAD) for DNNs to mitigate the negative effect from adversarial attacks.
arXiv Detail & Related papers (2022-04-21T12:55:27Z)
- AdvParams: An Active DNN Intellectual Property Protection Technique via Adversarial Perturbation Based Parameter Encryption [10.223780756303196]
We propose an effective framework to actively protect the DNN IP from infringement.
Specifically, we encrypt the DNN model's parameters by perturbing them with well-crafted adversarial perturbations.
After the encryption, the positions of the encrypted parameters and the values of the added adversarial perturbations form a secret key; a schematic encrypt/decrypt sketch is given after this list.
arXiv Detail & Related papers (2021-05-28T09:42:35Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
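Regarding the reversible QIM watermarking entry above, the following is a generic one-dimensional quantization index modulation sketch rather than the cited paper's exact scheme (in particular, plain QIM as written here is not reversible; the paper builds a reversible-data-hiding construction on top). Each selected host weight is quantized onto one of two interleaved lattices chosen by the watermark bit, and the bit is read back by checking which lattice the weight lies closest to. The step size delta and the made-up host weights are assumptions for illustration.

```python
import numpy as np

def qim_embed(weights, bits, delta=0.01):
    """Embed one bit per weight: quantize onto the lattice selected by the bit.
    Bit 0 -> multiples of delta; bit 1 -> multiples of delta shifted by delta/2."""
    w = np.asarray(weights, dtype=np.float64)
    d = np.where(np.asarray(bits) == 0, 0.0, delta / 2.0)  # per-bit dither
    return np.round((w - d) / delta) * delta + d

def qim_extract(watermarked, delta=0.01):
    """Recover each bit by checking which lattice the weight lies closer to."""
    w = np.asarray(watermarked, dtype=np.float64)
    dist0 = np.abs(w - np.round(w / delta) * delta)  # distance to bit-0 lattice
    dist1 = np.abs(w - (np.round((w - delta / 2) / delta) * delta + delta / 2))
    return (dist1 < dist0).astype(int)

# Tiny usage example with made-up host weights.
host = np.array([0.1234, -0.0567, 0.0031, 0.0456])
marked = qim_embed(host, bits=[1, 0, 1, 0])
assert list(qim_extract(marked)) == [1, 0, 1, 0]
```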
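Likewise, for the AdvParams entry, the snippet below is only a schematic sketch of active parameter encryption, not the cited method: a few entries of each weight tensor are perturbed, the (parameter name, position, perturbation) triples are kept as the secret key, and subtracting the recorded perturbations restores the original values (up to floating-point round-off). In the actual paper the perturbations are adversarially crafted to maximize the accuracy drop; here they are abstracted as random noise, and the function names are hypothetical.

```python
import torch

def encrypt_params(model, num_positions=16, scale=0.5, seed=0):
    """Perturb a few randomly chosen entries of each weight tensor in place.
    Returns the secret key: (parameter name, flat index, perturbation) triples."""
    g = torch.Generator().manual_seed(seed)
    secret_key = []
    for name, p in model.named_parameters():
        flat = p.data.view(-1)
        idx = torch.randperm(flat.numel(), generator=g)[:num_positions]
        noise = scale * torch.randn(len(idx), generator=g)  # stand-in for adversarial perturbations
        flat[idx] += noise
        secret_key.extend((name, int(i), float(n)) for i, n in zip(idx, noise))
    return secret_key

def decrypt_params(model, secret_key):
    """Restore the original parameters by subtracting the recorded perturbations."""
    params = dict(model.named_parameters())
    for name, i, noise in secret_key:
        params[name].data.view(-1)[i] -= noise
```

Without the secret key, an attacker cannot tell which positions were perturbed, so the model stays unusable; with it, decryption is a single pass over the recorded triples.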