SoK: How Robust is Image Classification Deep Neural Network
Watermarking? (Extended Version)
- URL: http://arxiv.org/abs/2108.04974v1
- Date: Wed, 11 Aug 2021 00:23:33 GMT
- Title: SoK: How Robust is Image Classification Deep Neural Network
Watermarking? (Extended Version)
- Authors: Nils Lukas, Edward Jiang, Xinda Li, Florian Kerschbaum
- Abstract summary: We evaluate whether recently proposed watermarking schemes that claim robustness are robust against a large set of removal attacks.
None of the surveyed watermarking schemes is robust in practice.
We show that watermarking schemes need to be evaluated against a more extensive set of removal attacks with a more realistic adversary model.
- Score: 16.708069984516964
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Network (DNN) watermarking is a method for provenance
verification of DNN models. Watermarking should be robust against watermark
removal attacks that derive a surrogate model that evades provenance
verification. Many watermarking schemes that claim robustness have been
proposed, but their robustness is only validated in isolation against a
relatively small set of attacks. There is no systematic, empirical evaluation
of these claims against a common, comprehensive set of removal attacks. This
uncertainty about a watermarking scheme's robustness makes it difficult to trust
its deployment in practice. In this paper, we evaluate whether recently
proposed watermarking schemes that claim robustness are robust against a large
set of removal attacks. We survey methods from the literature that (i) are
known removal attacks, (ii) derive surrogate models but have not been evaluated
as removal attacks, or (iii) are novel removal attacks. Weight shifting and smooth
retraining are novel removal attacks adapted to the DNN watermarking schemes
surveyed in this paper. We propose taxonomies for watermarking schemes and
removal attacks. Our empirical evaluation includes an ablation study over sets
of parameters for each attack and watermarking scheme on the CIFAR-10 and
ImageNet datasets. Surprisingly, none of the surveyed watermarking schemes is
robust in practice. We find that schemes fail to withstand adaptive attacks and
known methods for deriving surrogate models that have not been evaluated as
removal attacks. This points to intrinsic flaws in how robustness is currently
evaluated. We show that watermarking schemes need to be evaluated against a
more extensive set of removal attacks with a more realistic adversary model.
Our source code and a complete dataset of evaluation results are publicly
available, allowing independent verification of our conclusions.
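The abstract names weight shifting only at a high level. Below is a minimal sketch of a weight-shifting-style removal attack in PyTorch, under the assumption that the attack perturbs each layer's weights with small random noise and then fine-tunes to recover test accuracy; the noise scale `shift_strength`, the helper names, and the fine-tuning loop are illustrative, not the authors' exact procedure (which is in their released source code).

```python
import torch
import torch.nn.functional as F

def weight_shift(model: torch.nn.Module, shift_strength: float = 0.01) -> None:
    """Perturb every trainable parameter with noise proportional to its
    own magnitude. Hypothetical reading of 'weight shifting'; the exact
    formulation is in the paper's released code."""
    with torch.no_grad():
        for param in model.parameters():
            noise = torch.randn_like(param) * shift_strength * param.abs().mean()
            param.add_(noise)

def recover_accuracy(model, loader, epochs: int = 5, lr: float = 1e-4):
    """Short fine-tuning pass on attacker-controlled data to restore test
    accuracy after the shift; this recovery step is what makes the
    perturbation a removal attack rather than mere damage."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()

# Usage: given a watermarked `model` and an attacker-controlled `loader`:
#   weight_shift(model); recover_accuracy(model, loader)
# then re-run the scheme's verification to check whether the watermark survives.
```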
Related papers
- Robustness of Watermarking on Text-to-Image Diffusion Models [9.277492743469235]
We investigate the robustness of generative watermarking, in which watermark embedding is integrated into the text-to-image generation process.
We find that generative watermarking methods are robust to direct evasion attacks, such as discriminator-based attacks and edge-prediction-based manipulation of edge information, but vulnerable to malicious fine-tuning.
arXiv Detail & Related papers (2024-08-04T13:59:09Z)
- Certifiably Robust Image Watermark [57.546016845801134]
Generative AI raises many societal concerns such as boosting disinformation and propaganda campaigns.
Watermarking AI-generated content is a key technology to address these concerns.
We propose the first image watermarks with certified robustness guarantees against removal and forgery attacks.
arXiv Detail & Related papers (2024-07-04T17:56:04Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular; it lets the model owner embed a watermark into the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks (a sketch of this idea follows this list).
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily stolen by surrogate model attacks.
We propose a new watermarking methodology, "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Evaluating the Robustness of Trigger Set-Based Watermarks Embedded in Deep Neural Networks [22.614495877481144]
State-of-the-art trigger set-based watermarking algorithms do not achieve their designed goal of proving ownership.
We propose novel adaptive attacks that harness the adversary's knowledge of the underlying watermarking algorithm of a target model.
arXiv Detail & Related papers (2021-06-18T14:23:55Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
Embedding the reversible watermark degrades classification performance by less than 0.5%.
At the same time, the integrity of the model can be verified by applying the reversible watermarking.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
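The "Safe and Robust Watermark Injection with a Single OoD Image" entry above describes perturbing model parameters while embedding the watermark, so that the watermark survives the kinds of weight changes removal attacks induce. The following is a minimal sketch of that idea for a backdoor-style trigger set; the noise scale `sigma`, the loader name, and the training schedule are assumptions, and the paper's single-OoD-image data construction is simplified away.

```python
import torch
import torch.nn.functional as F

def inject_watermark_with_perturbation(model, trigger_loader,
                                       epochs: int = 10, lr: float = 1e-4,
                                       sigma: float = 0.005):
    """Embed a backdoor-style watermark while repeatedly jolting the
    parameters with Gaussian noise, so the trigger behavior is learned
    in a way that tolerates weight changes. Illustrative sketch, not the
    authors' exact algorithm."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y_trigger in trigger_loader:  # trigger inputs + target labels
            # Simulate a removal attack: perturb weights before each step.
            with torch.no_grad():
                for p in model.parameters():
                    p.add_(torch.randn_like(p) * sigma)
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y_trigger)
            loss.backward()
            opt.step()
            # In practice this would be interleaved with clean-task batches
            # to preserve accuracy on the original classification task.
    return model
```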