Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking
- URL: http://arxiv.org/abs/2507.11137v1
- Date: Tue, 15 Jul 2025 09:38:11 GMT
- Title: Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking
- Authors: Yuan Yao, Jin Song, Jian Jin
- Abstract summary: NeuralMark is a robust method built around a hashed watermark filter. It defends against both forging and overwriting attacks and can be seamlessly integrated into various neural network architectures.
- Score: 8.475609885922006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As valuable digital assets, deep neural networks necessitate robust ownership protection, positioning neural network watermarking (NNW) as a promising solution. Among various NNW approaches, weight-based methods are favored for their simplicity and practicality; however, they remain vulnerable to forging and overwriting attacks. To address these challenges, we propose NeuralMark, a robust method built around a hashed watermark filter. Specifically, we utilize a hash function to generate an irreversible binary watermark from a secret key, which is then used as a filter to select the model parameters for embedding. This design intertwines the embedding parameters with the hashed watermark, providing a robust defense against both forging and overwriting attacks. An average pooling mechanism is also incorporated to resist fine-tuning and pruning attacks. Furthermore, NeuralMark can be seamlessly integrated into various neural network architectures, ensuring broad applicability. Theoretically, we analyze its security boundary. Empirically, we verify its effectiveness and robustness across 13 distinct convolutional and Transformer architectures, covering five image classification tasks and one text generation task. The source code is available at https://github.com/AIResearch-Group/NeuralMark.
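To make the mechanism concrete, here is a minimal Python sketch of the hashed-watermark-as-filter idea. It is illustrative rather than the paper's exact algorithm: the counter-mode SHA-256 derivation, the permutation-based parameter-selection rule, and the pooling width are all assumptions made for this sketch.

```python
import hashlib
import numpy as np

def hashed_watermark(secret_key: bytes, n_bits: int = 256) -> np.ndarray:
    """Derive an irreversible binary watermark from a secret key
    (SHA-256 in counter mode; an assumed construction)."""
    digest, counter = b"", 0
    while len(digest) * 8 < n_bits:
        digest += hashlib.sha256(secret_key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return np.unpackbits(np.frombuffer(digest, dtype=np.uint8))[:n_bits]

def select_and_pool(weights: np.ndarray, watermark: np.ndarray, pool: int = 4) -> np.ndarray:
    """Use the hashed watermark as a filter to pick which parameters carry the
    mark, then average-pool each selected group (pooling is what resists
    fine-tuning and pruning). The watermark-seeded permutation below is a
    hypothetical stand-in for the paper's selection rule. Assumes the layer
    has at least watermark.size * pool parameters."""
    flat = weights.ravel()
    seed = int.from_bytes(hashlib.sha256(watermark.tobytes()).digest()[:8], "big")
    idx = np.random.default_rng(seed).choice(flat.size, size=watermark.size * pool, replace=False)
    return flat[idx].reshape(watermark.size, pool).mean(axis=1)

# During training, a regularizer would push the sign of each pooled value to
# match its watermark bit; verification re-derives the bits from the key and
# compares them with the signs of the pooled parameters.
```

Intuitively, because the watermark is a one-way hash of the key, an adversary cannot work backwards from observed embedded bits to a consistent key (blocking forging), and embedding with a different key targets a different parameter subset rather than displacing the original mark (blunting overwriting).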
Related papers
- BlockDoor: Blocking Backdoor Based Watermarks in Deep Neural Networks [3.1858340237924776]
BlockDoor is a wrapper that blocks all three kinds of trigger samples used in the literature to embed watermarks in trained neural networks as backdoors. It can reduce the watermark validation accuracy of the trigger set by up to 98% without compromising functionality.
arXiv Detail & Related papers (2024-12-14T06:38:01Z)
- DeepiSign-G: Generic Watermark to Stamp Hidden DNN Parameters for Self-contained Tracking [15.394110881491773]
DeepiSign-G is a versatile watermarking approach designed for comprehensive verification of leading DNN architectures, including CNNs and RNNs.
Unlike traditional hashing techniques, DeepiSign-G allows substantial metadata incorporation directly within the model, enabling detailed, self-contained tracking and verification.
We demonstrate DeepiSign-G's applicability across various architectures, including CNN models (VGG, ResNets, DenseNet) and RNNs (text sentiment classifiers).
arXiv Detail & Related papers (2024-07-01T13:15:38Z)
- An Unforgeable Publicly Verifiable Watermark for Large Language Models [84.2805275589553]
Current watermark detection algorithms require the secret key used in the watermark generation process, making them susceptible to security breaches and counterfeiting during public detection.
We propose an unforgeable publicly verifiable watermark algorithm named UPV that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages.
arXiv Detail & Related papers (2023-07-30T13:43:27Z)
- Reversible Quantization Index Modulation for Static Deep Neural Network Watermarking [57.96787187733302]
Reversible data hiding (RDH) methods offer a potential solution, but existing approaches suffer from weaknesses in terms of usability, capacity, and fidelity.
We propose a novel RDH-based static DNN watermarking scheme using quantization index modulation (QIM).
Our scheme incorporates a novel approach based on a one-dimensional quantizer for watermark embedding; a basic QIM round-trip is sketched after this list.
arXiv Detail & Related papers (2023-05-29T04:39:17Z)
- On Function-Coupled Watermarks for Deep Neural Networks [15.478746926391146]
We propose a novel DNN watermarking solution that can effectively defend against watermark removal attacks.
Our key insight is to enhance the coupling of the watermark and model functionalities.
Results show a 100% watermark authentication success rate under aggressive watermark removal attacks.
arXiv Detail & Related papers (2023-02-08T05:55:16Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
Embedding the reversible watermark degrades classification performance by less than 0.5%.
At the same time, the model's integrity can be verified by applying the reversible watermark.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, if an attacker has full knowledge of it, the model can easily be stolen by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z)
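As referenced in the QIM entry above, quantization index modulation embeds one bit per host value by snapping it onto one of two interleaved lattices. Below is a minimal, non-reversible QIM sketch; the step size `delta` is an assumed hyperparameter, and that paper additionally builds reversibility (RDH) on top of this basic operation.

```python
import numpy as np

def qim_embed(x: np.ndarray, bits: np.ndarray, delta: float = 0.01) -> np.ndarray:
    """Embed one bit per value: bit 0 snaps x onto the lattice {k*delta},
    bit 1 onto the offset lattice {k*delta + delta/2}."""
    offset = bits * (delta / 2.0)
    return np.round((x - offset) / delta) * delta + offset

def qim_extract(y: np.ndarray, delta: float = 0.01) -> np.ndarray:
    """Decode each value as the bit of whichever lattice it lies closer to."""
    d0 = np.abs(y - np.round(y / delta) * delta)
    y1 = np.round((y - delta / 2.0) / delta) * delta + delta / 2.0
    return (np.abs(y - y1) < d0).astype(np.uint8)

# Round-trip check on random host values (e.g., model weights):
w = np.random.randn(8)
b = np.random.randint(0, 2, size=8)
assert np.array_equal(qim_extract(qim_embed(w, b)), b)
```

Since each embedded value lands exactly on its bit's lattice, extraction is error-free in the absence of perturbation, and robustness degrades gracefully as noise approaches delta/4.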