SEW: Strengthening Robustness of Black-box DNN Watermarking via Specificity Enhancement
- URL: http://arxiv.org/abs/2602.03377v1
- Date: Tue, 03 Feb 2026 10:55:27 GMT
- Title: SEW: Strengthening Robustness of Black-box DNN Watermarking via Specificity Enhancement
- Authors: Huming Qiu, Mi Zhang, Junjie Sun, Peiyi Chen, Xiaohan Zhang, Min Yang
- Abstract summary: We introduce Specificity-Enhanced Watermarking (SEW), a new method that improves specificity by reducing the association between the watermark and approximate keys. SEW effectively defends against six state-of-the-art removal attacks, while maintaining model usability and watermark verification performance.
- Score: 19.10516412427928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To ensure the responsible distribution and use of open-source deep neural networks (DNNs), DNN watermarking has become a crucial technique to trace and verify unauthorized model replication or misuse. In practice, black-box watermarks manifest as specific predictive behaviors for specially crafted samples. However, due to the generalization nature of DNNs, the keys to extracting the watermark message are not unique, which would provide attackers with more opportunities. Advanced attack techniques can reverse-engineer approximate replacements for the original watermark keys, enabling subsequent watermark removal. In this paper, we explore black-box DNN watermarking specificity, which refers to the accuracy of a watermark's response to a key. Using this concept, we introduce Specificity-Enhanced Watermarking (SEW), a new method that improves specificity by reducing the association between the watermark and approximate keys. Through extensive evaluation using three popular watermarking benchmarks, we validate that enhancing specificity significantly contributes to strengthening robustness against removal attacks. SEW effectively defends against six state-of-the-art removal attacks, while maintaining model usability and watermark verification performance.
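To make the black-box setting concrete: verification amounts to querying the suspect model on secret trigger keys and checking how often its predictions match the owner-chosen labels; specificity then measures how tightly that response is bound to the exact keys rather than to approximate reconstructions. A minimal sketch of such a verification query follows — the toy model, trigger keys, and threshold are illustrative assumptions, not SEW itself:

```python
import numpy as np

def verify_watermark(model_predict, trigger_keys, expected_labels, threshold=0.9):
    """Black-box verification: query the suspect model on the secret trigger
    keys and check whether its predictions match the owner-chosen labels."""
    preds = np.array([model_predict(x) for x in trigger_keys])
    match_rate = float(np.mean(preds == np.array(expected_labels)))
    return match_rate, match_rate >= threshold

# Toy "model": predicts the parity of the sum of the input vector.
toy_model = lambda x: int(np.sum(x)) % 2

# Hypothetical trigger set whose labels this toy model happens to reproduce.
keys = [np.array([1, 0]), np.array([1, 1]), np.array([0, 0])]
labels = [1, 0, 0]

rate, verified = verify_watermark(toy_model, keys, labels)
```

An attacker who reverse-engineers approximate keys can exploit the same query interface, which is why reducing the watermark's response to anything but the exact keys strengthens robustness.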
Related papers
- ChainMarks: Securing DNN Watermark with Cryptographic Chain [11.692176144467513]
Watermarking is used to protect the intellectual property of deep neural network (DNN) model owners. Recent studies have shown that existing watermarking schemes are vulnerable to watermark removal and ambiguity attacks. We propose ChainMarks, which generates secure and robust watermarks by introducing a cryptographic chain into the trigger inputs.
arXiv Detail & Related papers (2025-05-08T06:30:46Z) - Robust and Minimally Invasive Watermarking for EaaS [50.08021440235581]
Embedding as a Service (EaaS) plays an increasingly crucial role in AI applications. EaaS is vulnerable to model extraction attacks, highlighting the need for copyright protection. We propose a novel embedding-specific watermarking (ESpeW) mechanism to offer robust copyright protection for EaaS.
arXiv Detail & Related papers (2024-10-23T04:34:49Z) - DeepEclipse: How to Break White-Box DNN-Watermarking Schemes [60.472676088146436]
We present obfuscation techniques that significantly differ from the existing white-box watermarking removal schemes.
DeepEclipse can evade watermark detection without prior knowledge of the underlying watermarking scheme.
Our evaluation reveals that DeepEclipse excels in breaking multiple white-box watermarking schemes.
arXiv Detail & Related papers (2024-03-06T10:24:47Z) - Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has become popular recently, in which the model owner can watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
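The mini-max idea summarized in this entry can be sketched on a toy linear model: an inner maximization searches for a small parametric perturbation that removes the watermark response, and an outer minimization re-embeds the watermark at that worst case. Everything below (the linear model, squared-error watermark loss, and step sizes) is an illustrative assumption, not the paper's implementation:

```python
import numpy as np

def watermark_loss(w, x_trig, y_trig):
    # Squared error of a toy linear model on one trigger sample.
    return float((w @ x_trig - y_trig) ** 2)

def minimax_step(w, x_trig, y_trig, eps=0.1, lr=0.1, inner_steps=5):
    """One mini-max update: the inner loop ascends the watermark loss over a
    perturbation delta (||delta|| <= eps) to emulate a removal attack, then
    the outer step descends on w evaluated at the perturbed point w + delta,
    so the watermark survives parametric changes."""
    delta = np.zeros_like(w)
    for _ in range(inner_steps):
        grad_d = 2.0 * ((w + delta) @ x_trig - y_trig) * x_trig  # d(loss)/d(delta)
        delta = delta + lr * grad_d                               # ascent: try to remove
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta *= eps / norm                                   # project onto the eps-ball
    grad_w = 2.0 * ((w + delta) @ x_trig - y_trig) * x_trig       # d(loss)/d(w) at w+delta
    return w - lr * grad_w                                        # descent: re-embed

# Toy usage: embed the watermark response y=1 for trigger x=[1, 0].
w = np.array([0.0, 0.0])
x, y = np.array([1.0, 0.0]), 1.0
for _ in range(100):
    w = minimax_step(w, x, y)
```

After training against the worst-case perturbation, the watermark loss stays low even at nearby parameter settings, which is the robustness property the entry describes.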
arXiv Detail & Related papers (2023-09-09T12:46:08Z) - Neural Dehydration: Effective Erasure of Black-box Watermarks from DNNs with Limited Data [23.90041044463682]
We propose a watermark-agnostic removal attack called Neural Dehydration (abbrev. Dehydra).
Our attack pipeline exploits the internals of the protected model to recover and unlearn the watermark message.
We achieve strong removal effectiveness across all the covered watermarks, preserving at least 90% of the stolen model utility.
arXiv Detail & Related papers (2023-09-07T03:16:03Z) - DICTION: DynamIC robusT whIte bOx watermarkiNg scheme for deep neural networks [2.8648861222787882]
Deep neural network (DNN) watermarking is a suitable method for protecting the ownership of deep learning (DL) models. In this paper, we first provide a unified framework for white-box DNN watermarking schemes. Next, we introduce DICTION, a new white-box dynamic robust watermarking scheme.
arXiv Detail & Related papers (2022-10-27T19:48:26Z) - Certified Neural Network Watermarks with Randomized Smoothing [64.86178395240469]
We propose a certifiable watermarking method for deep learning models.
We show that our watermark is guaranteed to be unremovable unless the model parameters are changed by more than a certain l2 threshold.
Our watermark is also empirically more robust compared to previous watermarking methods.
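The certificate summarized above rests on a smoothed model: if watermark verification still succeeds under random parameter noise with a high enough vote rate, removing the watermark provably requires moving the parameters beyond an l2 threshold. A hedged Monte-Carlo sketch of the vote a randomized-smoothing certificate would bound — the linear "model" and all names are illustrative assumptions, not the paper's construction:

```python
import numpy as np

def smoothed_verify(weights, key, label, sigma=0.5, n_samples=200, seed=0):
    """Monte-Carlo sketch of smoothed watermark verification: repeatedly add
    Gaussian noise to the model parameters and count how often the trigger
    key still receives the watermark label (a majority-vote estimate)."""
    rng = np.random.default_rng(seed)
    votes = 0
    for _ in range(n_samples):
        noisy = weights + rng.normal(0.0, sigma, size=weights.shape)
        pred = int(noisy @ key > 0)          # toy linear classifier on the key
        votes += int(pred == label)
    return votes / n_samples

# Watermarked toy model: responds with label 1 on the trigger key with a
# large margin, so the vote survives the parameter noise.
rate = smoothed_verify(np.array([5.0, 0.0]), np.array([1.0, 0.0]), 1)
```

Intuitively, the larger the margin of the watermark response relative to sigma, the larger the certified l2 radius within which no parameter change can erase it.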
arXiv Detail & Related papers (2022-07-16T16:06:59Z) - Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z) - Piracy-Resistant DNN Watermarking by Block-Wise Image Transformation with Secret Key [15.483078145498085]
The proposed method embeds a watermark pattern in a model by using learnable transformed images.
It is piracy-resistant, so the original watermark cannot be overwritten by a pirated watermark.
The results show that it was resilient against fine-tuning and pruning attacks while maintaining a high watermark-detection accuracy.
arXiv Detail & Related papers (2021-04-09T08:21:53Z) - Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models [72.9364216776529]
We propose a novel watermark removal attack from a different perspective.
We design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations.
Our attack can bypass state-of-the-art watermarking solutions with very high success rates.
arXiv Detail & Related papers (2020-09-18T09:14:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.