Removing Backdoor-Based Watermarks in Neural Networks with Limited Data
- URL: http://arxiv.org/abs/2008.00407v2
- Date: Sat, 8 Aug 2020 03:31:14 GMT
- Title: Removing Backdoor-Based Watermarks in Neural Networks with Limited Data
- Authors: Xuankai Liu, Fengting Li, Bihan Wen, Qi Li
- Abstract summary: Trading trained deep models is in high demand and lucrative nowadays.
Naive trading schemes, however, typically involve potential risks related to copyright and trustworthiness issues.
We propose a novel backdoor-based watermark removal framework using limited data, dubbed WILD.
- Score: 26.050649487499626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been widely applied and achieved great success in
various fields. As training deep models usually consumes massive data and
computational resources, trading trained deep models is in high demand and
lucrative nowadays. Unfortunately, naive trading schemes typically involve
potential risks related to copyright and trustworthiness issues; for example, a
sold model can be illegally resold to others without further authorization to
reap huge profits. To tackle this problem, various watermarking techniques have
been proposed to protect model intellectual property, among which
backdoor-based watermarking is the most commonly used. However, the robustness
of these watermarking approaches is not well evaluated under realistic
settings, such as limited availability of in-distribution data and no knowledge
of the watermarking patterns. In this paper, we benchmark the robustness of
watermarking and propose a novel backdoor-based watermark removal framework
using limited data, dubbed WILD. The proposed WILD removes the watermarks of
deep models with only a small portion of the training data, and the output
model performs on par with models trained from scratch without watermarks
injected. In particular, a novel data augmentation method is utilized to mimic
the behavior of watermark triggers. Combined with distribution alignment
between the normal and perturbed (e.g., occluded) data in the feature space,
our approach generalizes well to all typical types of trigger content.
Experimental results demonstrate that our approach can effectively remove the
watermarks without compromising the deep model's performance on the original
task, with only limited access to the training data.
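The abstract points to two mechanisms: an augmentation that mimics unknown trigger behavior and a feature-space distribution alignment between normal and perturbed data. Below is a minimal PyTorch sketch of one fine-tuning step in that spirit; the random-occlusion augmentation, the moment-matching alignment loss, and the `feature_extractor`/`classifier` split are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def random_occlusion(x, patch=8):
    """Occlude a random square patch of each image (NCHW) with noise.

    Hypothetical stand-in for trigger-mimicking augmentation: the real
    trigger pattern is unknown, so random local regions are perturbed.
    """
    x = x.clone()
    n, _, h, w = x.shape
    for i in range(n):
        top = torch.randint(0, h - patch + 1, (1,)).item()
        left = torch.randint(0, w - patch + 1, (1,)).item()
        x[i, :, top:top + patch, left:left + patch] = torch.rand_like(
            x[i, :, top:top + patch, left:left + patch])
    return x


def alignment_loss(f_clean, f_aug):
    """Match first and second moments of clean vs. perturbed features."""
    mean_gap = (f_clean.mean(dim=0) - f_aug.mean(dim=0)).pow(2).sum()
    var_gap = (f_clean.var(dim=0) - f_aug.var(dim=0)).pow(2).sum()
    return mean_gap + var_gap


def removal_step(feature_extractor, classifier, x, y, optimizer, lam=0.1):
    """One fine-tuning step on a small clean batch (x, y)."""
    x_aug = random_occlusion(x)
    f_clean = feature_extractor(x)
    f_aug = feature_extractor(x_aug)
    logits_clean = classifier(f_clean)
    logits_aug = classifier(f_aug)
    # Keep the original task accurate on clean data, force trigger-like
    # inputs to predict the same labels, and pull the two feature
    # distributions together so the backdoor shortcut is unlearned.
    loss = (F.cross_entropy(logits_clean, y)
            + F.cross_entropy(logits_aug, y)
            + lam * alignment_loss(f_clean, f_aug))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In use, `removal_step` would be run over the limited clean subset for a few epochs; the alignment term discourages the network from keeping a separate feature cluster for trigger-like inputs.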
Related papers
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has become popular recently; it lets the model owner watermark the model and later verify ownership on a secret trigger set (a minimal verification sketch appears after this list).
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Neural Dehydration: Effective Erasure of Black-box Watermarks from DNNs with Limited Data [23.90041044463682]
We propose a watermark-agnostic removal attack called Neural Dehydration (abbrev. Dehydra).
Our attack pipeline exploits the internals of the protected model to recover and unlearn the watermark message.
We achieve strong removal effectiveness across all the covered watermarks, preserving at least 90% of the stolen model utility.
arXiv Detail & Related papers (2023-09-07T03:16:03Z)
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding publicly available data.
By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders.
This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
- On Function-Coupled Watermarks for Deep Neural Networks [15.478746926391146]
We propose a novel DNN watermarking solution that can effectively defend against watermark removal attacks.
Our key insight is to enhance the coupling of the watermark and model functionalities.
Results show a 100% watermark authentication success rate under aggressive watermark removal attacks.
arXiv Detail & Related papers (2023-02-08T05:55:16Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
Embedding the reversible watermark affects classification performance by less than 0.5%.
At the same time, the integrity of the model can be verified by applying the reversible watermarking.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
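Several of the papers above rely on the same backdoor-based ownership check: the owner keeps a secret trigger set and claims ownership if a suspect model assigns the predefined target labels to those triggers at a high rate. A minimal sketch of that verification step, assuming a PyTorch classifier; the function name and the 0.8 decision threshold are illustrative assumptions, not values from the papers.

```python
import torch


@torch.no_grad()
def watermark_verified(model, trigger_inputs, target_labels, threshold=0.8):
    """Backdoor-based ownership check: does the suspect model predict the
    owner's target labels on the secret trigger set often enough?

    The threshold is an illustrative choice, not a value from the papers.
    """
    model.eval()
    preds = model(trigger_inputs).argmax(dim=1)
    match_rate = (preds == target_labels).float().mean().item()
    return match_rate >= threshold, match_rate
```

A removal attack such as WILD succeeds when this match rate drops to chance level while the model's accuracy on the original task is preserved.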