Probabilistically Robust Watermarking of Neural Networks
- URL: http://arxiv.org/abs/2401.08261v1
- Date: Tue, 16 Jan 2024 10:32:13 GMT
- Title: Probabilistically Robust Watermarking of Neural Networks
- Authors: Mikhail Pautov, Nikita Bogdanov, Stanislav Pyatkin, Oleg Rogov, Ivan
Oseledets
- Abstract summary: We introduce a novel trigger set-based watermarking approach that demonstrates resilience against functionality stealing attacks.
Our approach does not require additional model training and can be applied to any model architecture.
- Score: 4.64804216027185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As deep learning (DL) models are widely and effectively used in Machine
Learning as a Service (MLaaS) platforms, there is a rapidly growing interest in
DL watermarking techniques that can be used to confirm the ownership of a
particular model. Unfortunately, these methods usually produce watermarks
susceptible to model stealing attacks. In our research, we introduce a novel
trigger set-based watermarking approach that demonstrates resilience against
functionality stealing attacks, particularly those involving extraction and
distillation. Our approach does not require additional model training and can
be applied to any model architecture. The key idea of our method is to compute
the trigger set, which is transferable between the source model and the set of
proxy models with a high probability. In our experimental study, we show that
if the probability of the set being transferable is reasonably high, it can be
effectively used for ownership verification of the stolen model. We evaluate
our method on multiple benchmarks and show that our approach outperforms
current state-of-the-art watermarking techniques in all considered experimental
setups.
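
The procedure described in the abstract lends itself to a short sketch. Below is a minimal, hypothetical PyTorch illustration (not the authors' code): candidate inputs are kept only if the source model's label transfers to every proxy model, and a suspect model is flagged as stolen when its agreement rate on the resulting trigger set exceeds a threshold. The function names, the unanimous-agreement criterion, and the 0.9 threshold are all assumptions for illustration.

```python
import torch

@torch.no_grad()
def build_trigger_set(source, proxies, candidates, min_agreement=1.0):
    # Keep candidates whose source-model label is reproduced by the proxies.
    xs, ys = [], []
    for x in candidates:                       # x: (1, C, H, W) input tensor
        y = source(x).argmax(dim=1)            # source model's label
        agree = sum(int(p(x).argmax(dim=1) == y) for p in proxies)
        if agree / len(proxies) >= min_agreement:
            xs.append(x)
            ys.append(y)
    return torch.cat(xs), torch.cat(ys)

@torch.no_grad()
def verify_ownership(suspect, trigger_x, trigger_y, threshold=0.9):
    # Flag the suspect model if it reproduces the trigger labels often enough.
    preds = suspect(trigger_x).argmax(dim=1)
    return (preds == trigger_y).float().mean().item() >= threshold
```

In this reading, the proxy models stand in for plausible stolen copies (e.g., distilled or extracted surrogates), so a trigger set that transfers to all of them with high empirical probability is likely to survive a real stealing attack as well.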
Related papers
- Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data [10.564634073196117]
We introduce a novel watermarking technique based on Multi-view dATa, called MAT, for efficiently embedding watermarks within DNNs.
We validate our method across various benchmarks and demonstrate its efficacy in defending against model extraction attacks.
arXiv Detail & Related papers (2024-03-15T20:12:41Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, in which the model owner watermarks the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of model watermarks against parametric changes and numerous watermark-removal attacks.
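Read this way, the mini-max formulation admits a compact sketch. The following hypothetical PyTorch step (our interpretation of the summary, not the paper's code) alternates an inner ascent step that perturbs the weights toward removing the watermark with an outer descent step that re-embeds the watermark at the perturbed point; the learning rates and the single-step inner loop are illustrative assumptions.

```python
import torch

def minimax_watermark_step(model, wm_x, wm_y, criterion,
                           inner_lr=1e-2, outer_lr=1e-3):
    # Inner maximization: nudge the weights toward removing the watermark.
    loss = criterion(model(wm_x), wm_y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(inner_lr * g)               # ascent on the watermark loss
    # Outer minimization: recover watermark behavior at the perturbed weights.
    loss = criterion(model(wm_x), wm_y)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p.sub_(outer_lr * p.grad)          # descent to re-embed the watermark
    return loss.item()
```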
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Trojan Model Detection Using Activation Optimization [15.032071953322594]
Training machine learning models can be very expensive or even unaffordable.
Pre-trained models can be infected with Trojan attacks.
We present a novel method for detecting Trojan models.
arXiv Detail & Related papers (2023-06-08T02:17:29Z)
- Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models [44.80560808267494]
We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks and (2) extracted models that have been further modified (e.g., by weight pruning).
arXiv Detail & Related papers (2022-11-24T14:48:40Z) - Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
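The "unified pattern" summary suggests a simple picture: a single learned perturbation added to every input so that a frozen classifier's confidence-based OOD score separates in-distribution data more cleanly. A speculative Python sketch follows; the objective (maximum softmax probability), optimizer settings, and names are our assumptions, not the paper's code.

```python
import torch

def learn_watermark_pattern(model, in_dist_loader, shape, steps=100, lr=0.1):
    # One static pattern, shared by all inputs, learned against a frozen model.
    pattern = torch.zeros(shape, requires_grad=True)
    opt = torch.optim.SGD([pattern], lr=lr)
    model.eval()
    for _, (x, _) in zip(range(steps), in_dist_loader):
        probs = torch.softmax(model(x + pattern), dim=1)
        loss = -probs.max(dim=1).values.mean()  # raise confidence on in-dist data
        opt.zero_grad()
        loss.backward()
        opt.step()
    return pattern.detach()
```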
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- DynaMarks: Defending Against Deep Learning Model Extraction Using Dynamic Watermarking [3.282282297279473]
The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
arXiv Detail & Related papers (2022-07-27T06:49:39Z)
- Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z)
- Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, if the attacker has full knowledge of it, the model can easily be stolen by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z)
- Don't Forget to Sign the Gradients! [60.98885980669777]
We present GradSigns, a novel watermarking framework for deep neural networks (DNNs).
arXiv Detail & Related papers (2021-03-05T14:24:32Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)