Detect and remove watermark in deep neural networks via generative
adversarial networks
- URL: http://arxiv.org/abs/2106.08104v1
- Date: Tue, 15 Jun 2021 12:45:22 GMT
- Title: Detect and remove watermark in deep neural networks via generative
adversarial networks
- Authors: Haoqi Wang, Mingfu Xue, Shichang Sun, Yushu Zhang, Jian Wang, Weiqiang
Liu
- Abstract summary: We propose a scheme to detect and remove watermarks in deep neural networks via generative adversarial networks (GAN).
In the first phase, we use the GAN and a few clean images to detect and reverse the watermark in the DNN model.
In the second phase, we fine-tune the watermarked DNN based on the reversed backdoor images.
- Score: 10.058070050660104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNN) have achieved remarkable performance in various
fields. However, training a DNN model from scratch requires a lot of computing
resources and training data. It is difficult for most individual users to
obtain such computing resources and training data. Model copyright infringement
has become an emerging problem in recent years. For instance, pre-trained models
may be stolen or abused by adversaries without the authorization of the model owner.
Recently, many works on protecting the intellectual property of DNN models have
been proposed. In these works, embedding watermarks into DNN based on backdoor
is one of the widely used methods. However, when the DNN model is stolen, the
backdoor-based watermark may face the risk of being detected and removed by an
adversary. In this paper, we propose a scheme to detect and remove watermarks in
deep neural networks via generative adversarial networks (GAN). We demonstrate
that the backdoor-based DNN watermarks are vulnerable to the proposed GAN-based
watermark removal attack. The proposed attack method includes two phases. In
the first phase, we use the GAN and a few clean images to detect and reverse the
watermark in the DNN model. In the second phase, we fine-tune the watermarked
DNN based on the reversed backdoor images. Experimental evaluations on the
MNIST and CIFAR10 datasets demonstrate that the proposed method can
effectively remove about 98% of the watermark in DNN models, as the watermark
retention rate drops from 100% to less than 2% after applying the proposed
attack. At the same time, the attack hardly affects the model's
performance. The test accuracy of the watermarked DNN on the MNIST and the
CIFAR10 datasets drops by less than 1% and 3%, respectively.
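The two-phase attack lends itself to a compact sketch. The following is a minimal, hypothetical PyTorch rendering, not the authors' code: phase one trains a small generator so that clean images plus its output flip to a suspected target label (standing in for the paper's GAN-based trigger reversal, with the discriminator omitted for brevity), and phase two fine-tunes the watermarked model on the reversed backdoor images with their true labels. Every name, architecture, and hyperparameter here is an illustrative assumption.

```python
# Hypothetical sketch of the two-phase attack (PyTorch assumed).
# Not the authors' code; the GAN discriminator is omitted for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TriggerGenerator(nn.Module):
    """Phase 1: maps a clean image to a candidate backdoor trigger.
    channels=3 suits CIFAR10; use channels=1 for MNIST."""

    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)


def reverse_watermark(model, gen, clean_loader, target_label,
                      epochs=10, lr=1e-3, sparsity=0.01):
    """Optimize the generator so that clean images plus its output are
    classified as the suspected watermark label; an L1 penalty keeps the
    reversed trigger small (a stand-in for the full GAN objective)."""
    model.eval()
    for p in model.parameters():      # freeze the watermarked model
        p.requires_grad_(False)
    opt = torch.optim.Adam(gen.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in clean_loader:
            delta = gen(x)
            x_bd = torch.clamp(x + delta, 0.0, 1.0)
            target = torch.full((x.size(0),), target_label, dtype=torch.long)
            loss = (F.cross_entropy(model(x_bd), target)
                    + sparsity * delta.abs().mean())
            opt.zero_grad()
            loss.backward()
            opt.step()
    return gen


def unlearn_watermark(model, gen, clean_loader, epochs=5, lr=1e-4):
    """Phase 2: fine-tune on reversed backdoor images relabeled with their
    true labels, overwriting the backdoor mapping; the clean-image term
    preserves normal accuracy."""
    for p in model.parameters():      # unfreeze for fine-tuning
        p.requires_grad_(True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            with torch.no_grad():
                x_bd = torch.clamp(x + gen(x), 0.0, 1.0)
            loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_bd), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


def watermark_retention(model, trigger_loader, target_label):
    """Fraction of trigger-stamped images still classified as the
    watermark label: the metric the abstract reports dropping from
    100% to under 2%."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x, _ in trigger_loader:
            hits += (model(x).argmax(dim=1) == target_label).sum().item()
            total += x.size(0)
    return hits / total
```

On this reading, the "watermark retention rate" in the abstract is simply the backdoor success rate on the owner's trigger set, measured before and after the two phases.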
Related papers
- FreeMark: A Non-Invasive White-Box Watermarking for Deep Neural Networks [5.937758152593733]
FreeMark is a novel framework for watermarking deep neural networks (DNNs).
Unlike traditional watermarking methods, FreeMark innovatively generates secret keys from a pre-generated watermark vector and the host model using gradient descent.
Experiments demonstrate that FreeMark effectively resists various watermark removal attacks while maintaining high watermark capacity.
arXiv Detail & Related papers (2024-09-16T05:05:03Z)
- A self-supervised CNN for image watermark removal [102.94929746450902]
We propose a self-supervised convolutional neural network (CNN) for image watermark removal (SWCNN).
SWCNN constructs reference watermarked images in a self-supervised way according to the watermark distribution, rather than relying on paired training samples.
To take texture information into account, a mixed loss is exploited to improve the visual quality of watermark removal.
arXiv Detail & Related papers (2024-03-09T05:59:48Z)
- DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that differ significantly from existing white-box watermark removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has become popular recently, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation [24.07604618918671]
Copyright protection for deep neural networks (DNNs) is an urgent need for AI corporations.
White-box watermarking is believed to be accurate, credible and secure against most known watermark removal attacks.
We present the first systematic study on how the mainstream white-box watermarks are commonly vulnerable to neural structural obfuscation with dummy neurons.
arXiv Detail & Related papers (2023-03-17T02:21:41Z)
- On Function-Coupled Watermarks for Deep Neural Networks [15.478746926391146]
We propose a novel DNN watermarking solution that can effectively defend against watermark removal attacks.
Our key insight is to enhance the coupling of the watermark and model functionalities.
Results show a 100% watermark authentication success rate under aggressive watermark removal attacks.
arXiv Detail & Related papers (2023-02-08T05:55:16Z)
- Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection [69.59980270078067]
We explore the untargeted backdoor watermarking scheme, where the abnormal model behaviors are not deterministic.
We also discuss how to use the proposed untargeted backdoor watermark for dataset ownership verification.
arXiv Detail & Related papers (2022-09-27T12:56:56Z)
- Watermarking Graph Neural Networks based on Backdoor Attacks [10.844454900508566]
We present a watermarking framework for Graph Neural Networks (GNNs) for both graph and node classification tasks.
Our framework can verify the ownership of GNN models with a very high probability (around 100%) for both tasks.
arXiv Detail & Related papers (2021-10-21T09:59:59Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
Embedding the reversible watermark influences the classification performance by less than 0.5%.
At the same time, the integrity of the model can be verified by applying the reversible watermarking.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- HufuNet: Embedding the Left Piece as Watermark and Keeping the Right Piece for Ownership Verification in Deep Neural Networks [16.388046449021466]
We propose a novel solution, HufuNet, for watermarking deep neural networks (DNNs).
HufuNet is highly robust against model fine-tuning/pruning, kernels cutoff/supplement, functionality-equivalent attack, and fraudulent ownership claims.
arXiv Detail & Related papers (2021-03-25T06:55:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.