Detect and remove watermark in deep neural networks via generative
adversarial networks
- URL: http://arxiv.org/abs/2106.08104v1
- Date: Tue, 15 Jun 2021 12:45:22 GMT
- Title: Detect and remove watermark in deep neural networks via generative
adversarial networks
- Authors: Haoqi Wang, Mingfu Xue, Shichang Sun, Yushu Zhang, Jian Wang, Weiqiang
Liu
- Abstract summary: We propose a scheme to detect and remove watermarks in deep neural networks via generative adversarial networks (GAN).
In the first phase, we use the GAN and a few clean images to detect and reverse the watermark in the DNN model.
In the second phase, we fine-tune the watermarked DNN based on the reversed backdoor images.
- Score: 10.058070050660104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNN) have achieved remarkable performance in various
fields. However, training a DNN model from scratch requires a lot of computing
resources and training data. It is difficult for most individual users to
obtain such computing resources and training data. Model copyright infringement
has become an emerging problem in recent years. For instance, pre-trained models
may be stolen or abused by adversaries without the authorization of the model owner.
Recently, many works on protecting the intellectual property of DNN models have
been proposed. In these works, embedding watermarks into DNN based on backdoor
is one of the widely used methods. However, when the DNN model is stolen, the
backdoor-based watermark may face the risk of being detected and removed by an
adversary. In this paper, we propose a scheme to detect and remove watermarks in
deep neural networks via generative adversarial networks (GAN). We demonstrate
that the backdoor-based DNN watermarks are vulnerable to the proposed GAN-based
watermark removal attack. The proposed attack method includes two phases. In
the first phase, we use the GAN and a few clean images to detect and reverse the
watermark in the DNN model. In the second phase, we fine-tune the watermarked
DNN based on the reversed backdoor images. Experimental evaluations on the
MNIST and CIFAR10 datasets demonstrate that the proposed method can
effectively remove about 98% of the watermark in DNN models, as the watermark
retention rate drops from 100% to less than 2% after applying the proposed
attack. At the same time, the attack hardly affects the model's
performance. The test accuracy of the watermarked DNN on the MNIST and the
CIFAR10 datasets drops by less than 1% and 3%, respectively.
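The two-phase attack lends itself to a compact sketch. The following is a minimal, hypothetical PyTorch rendering, not the authors' code: phase one trains a small generator so that clean images plus its output flip to a suspected target label (standing in for the paper's GAN-based trigger reversal, with the discriminator omitted for brevity), and phase two fine-tunes the watermarked model on the reversed backdoor images with their true labels. Every name, architecture, and hyperparameter here is an illustrative assumption.

```python
# Hypothetical sketch of the two-phase attack (PyTorch assumed).
# Not the authors' code; the GAN discriminator is omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TriggerGenerator(nn.Module):
    """Phase 1: maps a clean image to a candidate backdoor trigger.
    channels=3 suits CIFAR10; use channels=1 for MNIST."""

    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)


def reverse_watermark(model, gen, clean_loader, target_label,
                      epochs=10, lr=1e-3, sparsity=0.01):
    """Optimize the generator so that clean images plus its output are
    classified as the suspected watermark label; an L1 penalty keeps the
    reversed trigger small (a stand-in for the full GAN objective)."""
    model.eval()
    for p in model.parameters():      # freeze the watermarked model
        p.requires_grad_(False)
    opt = torch.optim.Adam(gen.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in clean_loader:
            delta = gen(x)
            x_bd = torch.clamp(x + delta, 0.0, 1.0)
            target = torch.full((x.size(0),), target_label, dtype=torch.long)
            loss = (F.cross_entropy(model(x_bd), target)
                    + sparsity * delta.abs().mean())
            opt.zero_grad()
            loss.backward()
            opt.step()
    return gen


def unlearn_watermark(model, gen, clean_loader, epochs=5, lr=1e-4):
    """Phase 2: fine-tune on reversed backdoor images relabeled with their
    true labels, overwriting the backdoor mapping; the clean-image term
    preserves normal accuracy."""
    for p in model.parameters():      # unfreeze for fine-tuning
        p.requires_grad_(True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            with torch.no_grad():
                x_bd = torch.clamp(x + gen(x), 0.0, 1.0)
            loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_bd), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


def watermark_retention(model, trigger_loader, target_label):
    """Fraction of trigger-stamped images still classified as the
    watermark label: the metric the abstract reports dropping from
    100% to under 2%."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x, _ in trigger_loader:
            hits += (model(x).argmax(dim=1) == target_label).sum().item()
            total += x.size(0)
    return hits / total
```

On this reading, the "watermark retention rate" in the abstract is simply the backdoor success rate on the owner's trigger set, measured before and after the two phases.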
Related papers
- FreeMark: A Non-Invasive White-Box Watermarking for Deep Neural Networks [5.937758152593733]
FreeMark is a novel framework for watermarking deep neural networks (DNNs).
Unlike traditional watermarking methods, FreeMark innovatively generates secret keys from a pre-generated watermark vector and the host model using gradient descent.
Experiments demonstrate that FreeMark effectively resists various watermark removal attacks while maintaining high watermark capacity.
arXiv Detail & Related papers (2024-09-16T05:05:03Z)
- A self-supervised CNN for image watermark removal [102.94929746450902]
We propose a self-supervised convolutional neural network (CNN) for image watermark removal (SWCNN).
SWCNN constructs reference watermarked images in a self-supervised way according to the watermark distribution, rather than relying on paired training samples.
To take texture information into account, a mixed loss is exploited to improve the visual quality of watermark removal.
arXiv Detail & Related papers (2024-03-09T05:59:48Z)
- DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that differ significantly from existing white-box watermark removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has become popular recently, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation [24.07604618918671]
Copyright protection for deep neural networks (DNNs) is an urgent need for AI corporations.
White-box watermarking is believed to be accurate, credible and secure against most known watermark removal attacks.
We present the first systematic study on how the mainstream white-box watermarks are commonly vulnerable to neural structural obfuscation with dummy neurons.
arXiv Detail & Related papers (2023-03-17T02:21:41Z)
- On Function-Coupled Watermarks for Deep Neural Networks [15.478746926391146]
We propose a novel DNN watermarking solution that can effectively defend against watermark removal attacks.
Our key insight is to enhance the coupling of the watermark and model functionalities.
Results show a 100% watermark authentication success rate under aggressive watermark removal attacks.
arXiv Detail & Related papers (2023-02-08T05:55:16Z)
- Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection [69.59980270078067]
We explore the untargeted backdoor watermarking scheme, where the abnormal model behaviors are not deterministic.
We also discuss how to use the proposed untargeted backdoor watermark for dataset ownership verification.
arXiv Detail & Related papers (2022-09-27T12:56:56Z)
- Watermarking Graph Neural Networks based on Backdoor Attacks [10.844454900508566]
We present a watermarking framework for Graph Neural Networks (GNNs) for both graph and node classification tasks.
Our framework can verify the ownership of GNN models with a very high probability (around 100%) for both tasks.
arXiv Detail & Related papers (2021-10-21T09:59:59Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
Embedding the reversible watermark influences the classification performance by less than 0.5%.
At the same time, the integrity of the model can be verified by applying the reversible watermarking.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- HufuNet: Embedding the Left Piece as Watermark and Keeping the Right Piece for Ownership Verification in Deep Neural Networks [16.388046449021466]
We propose a novel solution, HufuNet, for watermarking deep neural networks (DNNs).
HufuNet is highly robust against model fine-tuning/pruning, kernels cutoff/supplement, functionality-equivalent attack, and fraudulent ownership claims.
arXiv Detail & Related papers (2021-03-25T06:55:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.