OVLA: Neural Network Ownership Verification using Latent Watermarks
- URL: http://arxiv.org/abs/2306.13215v2
- Date: Mon, 26 Jun 2023 02:24:19 GMT
- Title: OVLA: Neural Network Ownership Verification using Latent Watermarks
- Authors: Feisi Fu, Wenchao Li
- Abstract summary: We present a novel methodology for neural network ownership verification based on latent watermarks.
We show that our approach offers strong defense against backdoor detection, backdoor removal and surrogate model attacks.
- Score: 7.661766773170363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ownership verification for neural networks is important for protecting these
models from illegal copying, free-riding, re-distribution and other
intellectual property misuse. We present a novel methodology for neural network
ownership verification based on the notion of latent watermarks. Existing
ownership verification methods either modify, or introduce constraints on, the
neural network parameters, which are accessible to an attacker in a white-box
attack and can be harmful to the network's normal operation, or they train the
network to respond to specific watermarks in the inputs, similarly to data
poisoning-based backdoor attacks, which are susceptible to backdoor removal
techniques. In this paper, we address these problems by decoupling a network's
normal operation from its responses to watermarked inputs during ownership
verification. The key idea is to train the network such that the watermarks
remain dormant unless the owner's secret key is applied to activate them. The
secret key is realized as a specific perturbation to the network's parameters
that is known only to the owner. We show that our approach offers a strong
defense against backdoor detection, backdoor removal and surrogate model
attacks. In addition, our method provides protection against ambiguity attacks
where the
attacker either tries to guess the secret weight key or uses fine-tuning to
embed their own watermarks with a different key into a pre-trained neural
network. Experimental results demonstrate the advantages and effectiveness of
our proposed approach.
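To make the mechanism concrete, the following is a minimal PyTorch-style sketch of how a secret weight key could activate a dormant watermark during ownership verification. It is an illustrative sketch under assumptions, not the paper's implementation; the names `apply_weight_key`, `verify_ownership`, `weight_key`, `watermark_inputs`, and `target_label` are hypothetical.

```python
# Illustrative sketch (assumed API, not the authors' code): the model behaves
# normally by default; adding the owner's secret perturbation ("weight key")
# to selected parameters activates the watermark response.
import torch
import torch.nn as nn


def apply_weight_key(model: nn.Module, weight_key: dict, sign: float = 1.0) -> None:
    """Add (sign=+1) or remove (sign=-1) the secret parameter perturbation in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in weight_key:
                param.add_(sign * weight_key[name])


def verify_ownership(model: nn.Module, weight_key: dict,
                     watermark_inputs: torch.Tensor, target_label: int) -> float:
    """Fraction of watermarked inputs mapped to the target label once the key is applied."""
    apply_weight_key(model, weight_key, sign=+1.0)   # activate the latent watermark
    try:
        model.eval()
        with torch.no_grad():
            preds = model(watermark_inputs).argmax(dim=1)
        return (preds == target_label).float().mean().item()
    finally:
        apply_weight_key(model, weight_key, sign=-1.0)  # restore normal operation
```

Without the key, the same watermarked inputs are expected to be classified normally, which is what keeps the watermark dormant against backdoor detection, backdoor removal and surrogate model attacks.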
Related papers
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases [50.065022493142116]
A Trojan attack on deep neural networks, also known as a backdoor attack, is a typical threat to artificial intelligence.
FreeEagle is the first data-free backdoor detection method that can effectively detect complex backdoor attacks.
arXiv Detail & Related papers (2023-02-28T11:31:29Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection [69.59980270078067]
We explore the untargeted backdoor watermarking scheme, where the abnormal model behaviors are not deterministic.
We also discuss how to use the proposed untargeted backdoor watermark for dataset ownership verification.
arXiv Detail & Related papers (2022-09-27T12:56:56Z)
- An anomaly detection approach for backdoored neural networks: face recognition as a case study [77.92020418343022]
We propose a novel backdoored network detection method based on the principle of anomaly detection.
We test our method on a novel dataset of backdoored networks and report detectability results with perfect scores.
arXiv Detail & Related papers (2022-08-22T12:14:13Z)
- Verifying Neural Networks Against Backdoor Attacks [7.5033553032683855]
We propose an approach to verify, with a certain level of success rate, whether a given neural network is free of backdoors.
Experiment results show that our approach effectively verifies the absence of backdoor or generates backdoor triggers.
arXiv Detail & Related papers (2022-05-14T07:25:54Z)
- Knowledge-Free Black-Box Watermark and Ownership Proof for Image Classification Neural Networks [9.117248639119529]
We propose a knowledge-free black-box watermarking scheme for image classification neural networks.
A delicate encoding and verification protocol is designed to ensure the scheme's security against knowledgeable adversaries.
Experimental results demonstrate the functionality-preserving capability and security of the proposed watermarking scheme.
arXiv Detail & Related papers (2022-04-09T18:09:02Z)
- Reversible Watermarking in Deep Convolutional Neural Networks for Integrity Authentication [78.165255859254]
We propose a reversible watermarking algorithm for integrity authentication.
The impact of embedding the reversible watermark on classification performance is less than 0.5%.
At the same time, the integrity of the model can be verified by applying the reversible watermarking.
arXiv Detail & Related papers (2021-04-09T09:32:21Z)
- Neural Network Laundering: Removing Black-Box Backdoor Watermarks from Deep Neural Networks [17.720400846604907]
We propose a neural network "laundering" algorithm to remove black-box backdoor watermarks from neural networks.
For all backdoor watermarking methods addressed in this paper, we find that the robustness of the watermark is significantly weaker than the original claims.
arXiv Detail & Related papers (2020-04-22T19:02:47Z)