DeepEclipse: How to Break White-Box DNN-Watermarking Schemes
- URL: http://arxiv.org/abs/2403.03590v1
- Date: Wed, 6 Mar 2024 10:24:47 GMT
- Title: DeepEclipse: How to Break White-Box DNN-Watermarking Schemes
- Authors: Alessandro Pegoraro, Carlotta Segna, Kavita Kumari, Ahmad-Reza Sadeghi
- Abstract summary: We present obfuscation techniques that significantly differ from the existing white-box watermarking removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
- Score: 60.472676088146436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning (DL) models have become crucial in digital transformation, thus
raising concerns about their intellectual property rights. Different
watermarking techniques have been developed to protect Deep Neural Networks
(DNNs) from IP infringement, creating a competitive field for DNN watermarking
and removal methods. The predominant watermarking schemes use white-box
techniques, which involve modifying weights by adding a unique signature to
specific DNN layers. On the other hand, existing attacks on white-box
watermarking usually require knowledge of the specific deployed watermarking
scheme or access to the underlying data for further training and fine-tuning.
We propose DeepEclipse, a novel and unified framework designed to remove
white-box watermarks. We present obfuscation techniques that significantly
differ from the existing white-box watermarking removal schemes. DeepEclipse
can evade watermark detection without prior knowledge of the underlying
watermarking scheme, additional data, or training and fine-tuning. Our
evaluation reveals that DeepEclipse excels in breaking multiple white-box
watermarking schemes, reducing watermark detection to random guessing while
maintaining model accuracy similar to that of the original. Our framework
showcases a promising solution to address the ongoing DNN watermark protection
and removal challenges.
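To make the abstract's description of white-box watermarking concrete (a signature added to the weights of a specific layer), here is a minimal sketch. It assumes a generic projection-based scheme in which a secret matrix maps a layer's weights to signature bits. It is not DeepEclipse, and it is not claimed to match any particular scheme evaluated in the paper. The names `make_key`, `embed`, `extract`, and `bit_error_rate` are hypothetical, and the task loss that a real scheme would train alongside the watermark term is omitted.

```python
import math
import random


def make_key(num_bits, num_weights, seed):
    """Secret projection matrix (hypothetical key format): one row per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_weights)] for _ in range(num_bits)]


def project(key, weights):
    """Project the flattened layer weights onto each row of the secret key."""
    return [sum(k * w for k, w in zip(row, weights)) for row in key]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def embed(weights, key, bits, lr=0.005, steps=100):
    """Nudge the weights by gradient descent on the binary cross-entropy
    between sigmoid(key @ weights) and the signature bits.
    Assumption: only the watermark term is optimized here (no task loss)."""
    w = list(weights)
    for _ in range(steps):
        errors = [sigmoid(z) - b for z, b in zip(project(key, w), bits)]
        for j in range(len(w)):
            grad = sum(e * row[j] for e, row in zip(errors, key))
            w[j] -= lr * grad
    return w


def extract(weights, key):
    """White-box extraction: read one bit from the sign of each projection."""
    return [1 if z > 0 else 0 for z in project(key, weights)]


def bit_error_rate(extracted, expected):
    """Fraction of signature bits that differ."""
    return sum(x != y for x, y in zip(extracted, expected)) / len(expected)


if __name__ == "__main__":
    rng = random.Random(0)
    key = make_key(num_bits=64, num_weights=256, seed=1)
    signature = [rng.randint(0, 1) for _ in range(64)]
    layer = [rng.gauss(0.0, 0.05) for _ in range(256)]

    marked = embed(layer, key, signature)
    unrelated = [rng.gauss(0.0, 0.05) for _ in range(256)]

    print("BER, watermarked layer:", bit_error_rate(extract(marked, key), signature))
    print("BER, unrelated layer:  ", bit_error_rate(extract(unrelated, key), signature))
```

In this toy setting, the owner verifies ownership when the bit error rate on the suspect layer is low. A layer that never carried the signature yields a bit error rate near 0.5, which is what "random guessing" means for the detector. The unrelated layer above only illustrates that baseline; it does not reproduce the obfuscation techniques the paper proposes.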
Related papers
- ESpeW: Robust Copyright Protection for LLM-based EaaS via Embedding-Specific Watermark [50.08021440235581]
Embeddings as a Service (EaaS) is emerging as a crucial component in AI applications.
EaaS is vulnerable to model extraction attacks, highlighting the urgent need for copyright protection.
We propose a novel embedding-specific watermarking (ESpeW) mechanism to offer robust copyright protection for EaaS.
arXiv Detail & Related papers (2024-10-23T04:34:49Z)
- FreeMark: A Non-Invasive White-Box Watermarking for Deep Neural Networks [5.937758152593733]
FreeMark is a novel framework for watermarking deep neural networks (DNNs).
Unlike traditional watermarking methods, FreeMark innovatively generates secret keys from a pre-generated watermark vector and the host model using gradient descent.
Experiments demonstrate that FreeMark effectively resists various watermark removal attacks while maintaining high watermark capacity.
arXiv Detail & Related papers (2024-09-16T05:05:03Z)
- DLOVE: A new Security Evaluation Tool for Deep Learning Based Watermarking Techniques [1.8416014644193066]
Recent developments in Deep Neural Network (DNN) based watermarking techniques have shown remarkable performance.
In this paper, we performed a detailed security analysis of different DNN-based watermarking techniques.
We propose a new class of attack called the Deep Learning-based OVErwriting (DLOVE) attack.
arXiv Detail & Related papers (2024-07-09T05:18:14Z)
- A self-supervised CNN for image watermark removal [102.94929746450902]
We propose a self-supervised convolutional neural network (CNN) for image watermark removal (SWCNN).
SWCNN constructs reference watermarked images in a self-supervised way, according to the watermark distribution, rather than relying on given paired training samples.
Taking texture information into account, a mixed loss is exploited to improve the visual quality of image watermark removal.
arXiv Detail & Related papers (2024-03-09T05:59:48Z)
- Neural Dehydration: Effective Erasure of Black-box Watermarks from DNNs with Limited Data [23.90041044463682]
We propose a watermark-agnostic removal attack called Neural Dehydration (abbrev. Dehydra).
Our attack pipeline exploits the internals of the protected model to recover and unlearn the watermark message.
We achieve strong removal effectiveness across all the covered watermarks, preserving at least 90% of the stolen model utility.
arXiv Detail & Related papers (2023-09-07T03:16:03Z)
- Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
arXiv Detail & Related papers (2022-07-16T16:06:59Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by a surrogate model attack.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation with Secret Key [15.483078145498085]
The proposed method embeds a watermark pattern in a model by using learnable transformed images.
It is piracy-resistant, so the original watermark cannot be overwritten by a pirated watermark.
The results show that it was resilient against fine-tuning and pruning attacks while maintaining a high watermark-detection accuracy.
arXiv Detail & Related papers (2021-04-09T08:21:53Z)
- Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.