"And Then There Were None": Cracking White-box DNN Watermarks via
Invariant Neuron Transforms
- URL: http://arxiv.org/abs/2205.00199v1
- Date: Sat, 30 Apr 2022 08:33:32 GMT
- Authors: Yifan Yan, Xudong Pan, Yining Wang, Mi Zhang, Min Yang
- Abstract summary: We present the first effective removal attack, which cracks almost all existing white-box watermarking schemes.
Our attack requires no prior knowledge of the training data distribution or the adopted watermark algorithms, and leaves model functionality intact.
- Score: 29.76685892624105
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, how to protect the intellectual property (IP) of deep neural
networks (DNNs) has become a major concern for the AI industry. To combat
potential model piracy, recent works explore various watermarking strategies to
embed secret identity messages into the prediction behaviors or the internals
(e.g., weights and neuron activations) of the target model. Because it
sacrifices less functionality, at the cost of requiring more knowledge about
the target model, the latter branch of watermarking schemes (i.e., white-box
model watermarking) is claimed to be accurate, credible and secure against most
known watermark removal attacks, with emerging research efforts and
applications in the industry.
In this paper, we present the first effective removal attack, which cracks
almost all existing white-box watermarking schemes with provably no performance
overhead and no required prior knowledge. By analyzing these IP protection
mechanisms at the granularity of neurons, we discover for the first time their
common dependence on a set of fragile features of a local neuron group, all of
which can be arbitrarily tampered with by our proposed chain of invariant
neuron transforms. On 9 state-of-the-art white-box watermarking schemes and a
broad set of industry-level DNN architectures, our attack for the first time
reduces the embedded identity message in the protected models to almost random
noise. Meanwhile, unlike known removal attacks, our attack requires no prior
knowledge of the training data distribution or the adopted watermark
algorithms, and leaves model functionality intact.
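To make the attack's core primitive concrete, below is a minimal NumPy sketch (our own illustration under simplifying assumptions, not the authors' code) of two classic function-preserving invariances that such transform chains build on: permuting the hidden neurons of a toy two-layer ReLU network and rescaling each by a positive factor, while compensating in the next layer. The network's outputs are unchanged, yet the per-neuron weight features a white-box verifier would read are arbitrarily altered.
```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-layer ReLU MLP: f(x) = W2 @ relu(W1 @ x + b1) + b2
d_in, d_hid, d_out = 8, 16, 4
W1, b1 = rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), rng.normal(size=d_out)

def forward(W1, b1, W2, b2, x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Invariant transform: permute the hidden neurons (P) and rescale each by a
# positive factor (D). ReLU commutes with both, so compensating the next
# layer with P^T D^{-1} leaves the network function exactly unchanged, while
# the weight statistics a white-box watermark reads are scrambled.
perm = rng.permutation(d_hid)                # P
scale = rng.uniform(0.1, 10.0, size=d_hid)   # diag(D), all > 0

W1_t = W1[perm] * scale[:, None]             # W1' = D P W1
b1_t = b1[perm] * scale                      # b1' = D P b1
W2_t = W2[:, perm] / scale[None, :]          # W2' = W2 P^T D^{-1}

x = rng.normal(size=d_in)
assert np.allclose(forward(W1, b1, W2, b2, x),
                   forward(W1_t, b1_t, W2_t, b2, x))
```
Since ReLU is positively homogeneous and permutation-equivariant, chaining such transforms across many layers can scramble weight statistics throughout the model at zero accuracy cost.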
Related papers
- On the Weaknesses of Backdoor-based Model Watermarking: An Information-theoretic Perspective [39.676548104635096]
Safeguarding the intellectual property of machine learning models has emerged as a pressing concern in AI security.
Model watermarking is a powerful technique for protecting ownership of machine learning models.
We propose a novel model watermarking scheme, In-distribution Watermark Embedding (IWE), to overcome the limitations of existing methods.
arXiv Detail & Related papers (2024-09-10T00:55:21Z)
- DeepiSign-G: Generic Watermark to Stamp Hidden DNN Parameters for Self-contained Tracking [15.394110881491773]
DeepiSign-G is a versatile watermarking approach designed for comprehensive verification of leading DNN architectures, including CNNs and RNNs.
Unlike traditional hashing techniques, DeepiSign-G allows substantial metadata incorporation directly within the model, enabling detailed, self-contained tracking and verification.
We demonstrate DeepiSign-G's applicability across various architectures, including CNN models (VGG, ResNets, DenseNet) and RNNs (text sentiment classifiers).
arXiv Detail & Related papers (2024-07-01T13:15:38Z)
- DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that differ significantly from existing white-box watermark removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z) - Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z) - Rethinking White-Box Watermarks on Deep Learning Models under Neural
Structural Obfuscation [24.07604618918671]
Copyright protection for deep neural networks (DNNs) is an urgent need for AI corporations.
White-box watermarking is believed to be accurate, credible and secure against most known watermark removal attacks.
We present the first systematic study of how mainstream white-box watermarks are commonly vulnerable to neural structural obfuscation with dummy neurons; a minimal sketch of this idea appears after the list below.
arXiv Detail & Related papers (2023-03-17T02:21:41Z) - Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z) - Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, an attacker with full knowledge of it can easily steal the model by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z)
- Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
- Neural Network Laundering: Removing Black-Box Backdoor Watermarks from Deep Neural Networks [17.720400846604907]
We propose a neural network "laundering" algorithm to remove black-box backdoor watermarks from neural networks.
For all backdoor watermarking methods addressed in this paper, we find that the robustness of the watermark is significantly weaker than the original claims.
arXiv Detail & Related papers (2020-04-22T19:02:47Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
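As a companion to the dummy-neuron entry above ("Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation"), here is a minimal NumPy sketch, again our own illustration rather than that paper's code, of structural obfuscation: appending hidden neurons whose outgoing weights are zero leaves the network function intact while changing the layer width and weight statistics that a white-box verifier expects.
```python
import numpy as np

rng = np.random.default_rng(1)

# The same toy two-layer ReLU MLP as in the earlier sketch.
d_in, d_hid, d_out = 8, 16, 4
W1, b1 = rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), rng.normal(size=d_out)

def forward(W1, b1, W2, b2, x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Structural obfuscation: append "dummy" hidden neurons with arbitrary
# incoming weights but all-zero outgoing weights. They contribute nothing
# to the output, yet the layer shape and weight distribution no longer
# match what a white-box watermark extractor was fitted to.
n_dummy = 8
W1_t = np.vstack([W1, rng.normal(size=(n_dummy, d_in))])
b1_t = np.concatenate([b1, rng.normal(size=n_dummy)])
W2_t = np.hstack([W2, np.zeros((d_out, n_dummy))])

x = rng.normal(size=d_in)
assert np.allclose(forward(W1, b1, W2, b2, x),
                   forward(W1_t, b1_t, W2_t, b2, x))
```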
This list was automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.