Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision
Models
- URL: http://arxiv.org/abs/2211.13644v1
- Date: Thu, 24 Nov 2022 14:48:40 GMT
- Title: Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision
Models
- Authors: Jacob Shams, Ben Nassi, Ikuya Morikawa, Toshiya Shimizu, Asaf Shabtai,
Yuval Elovici
- Abstract summary: We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks, and (2) extracted models which undergo a blurring method (e.g., weight pruning).
- Score: 44.80560808267494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, various watermarking methods have been suggested to detect
computer vision models obtained illegitimately from their owners; however, they
fail to demonstrate satisfactory robustness against model extraction attacks.
In this paper, we present an adaptive framework to watermark a protected model,
leveraging the unique behavior present in the model due to the unique random seed
used to initialize its training. This watermark is used to detect
extracted models, which have the same unique behavior, indicating an
unauthorized usage of the protected model's intellectual property (IP). First,
we show how an initial seed for random number generation during model training
produces distinct characteristics in the model's decision boundaries, which are
inherited by extracted models and present in their decision boundaries, but are
absent from non-extracted models trained on the same dataset with a different
seed. Based on our findings, we suggest the Robust
Adaptive Watermarking (RAW) Framework, which utilizes the unique behavior
present in the protected and extracted models to generate a watermark key-set
and verification model. We show that the framework is robust to (1) unseen
model extraction attacks, and (2) extracted models which undergo a blurring
method (e.g., weight pruning). We evaluate the framework's robustness against a
naive attacker (unaware that the model is watermarked), and an informed
attacker (who employs blurring strategies to remove watermarked behavior from
an extracted model), and achieve outstanding (i.e., >0.9) AUC values. Finally,
we show that the framework is robust to model extraction attacks that use a
different structure and/or architecture than the protected model.
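The seed-fingerprint intuition can be illustrated with a small, self-contained sketch (not the authors' code): two models trained on the same data with different seeds settle on slightly different decision boundaries, a surrogate distilled from the protected model's labels inherits the protected model's boundary, and an independently seeded model does not. The toy dataset, tiny MLP, hard-label extraction step, low-confidence key-set heuristic, and 30% pruning rate below are illustrative assumptions, not the RAW framework's actual key-set generation or verification model.
```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def make_data(n=2000, seed=123):
    """Two overlapping Gaussian blobs; the same data is reused for every model."""
    g = torch.Generator().manual_seed(seed)
    x0 = torch.randn(n // 2, 2, generator=g) + torch.tensor([2.0, 0.0])
    x1 = torch.randn(n // 2, 2, generator=g) + torch.tensor([-2.0, 0.0])
    return torch.cat([x0, x1]), torch.cat([torch.zeros(n // 2), torch.ones(n // 2)])

def train(X, y, seed, epochs=300):
    """The seed controls weight initialization, which shapes the decision boundary."""
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X).squeeze(1), y).backward()
        opt.step()
    return model

def agreement(a, b, pts):
    """Fraction of points on which two models predict the same label."""
    with torch.no_grad():
        la = torch.sigmoid(a(pts)).squeeze(1) > 0.5
        lb = torch.sigmoid(b(pts)).squeeze(1) > 0.5
    return (la == lb).float().mean().item()

X, y = make_data()
protected = train(X, y, seed=0)     # victim model
independent = train(X, y, seed=1)   # same data, different seed -> different boundary

# "Extract" the protected model: train a surrogate only on its hard labels.
with torch.no_grad():
    stolen_labels = (torch.sigmoid(protected(X)).squeeze(1) > 0.5).float()
extracted = train(X, stolen_labels, seed=2)

# Watermark key-set: random queries where the protected model is least confident,
# i.e. points close to its seed-specific decision boundary.
queries = torch.rand(20000, 2) * 12 - 6
with torch.no_grad():
    conf = torch.sigmoid(protected(queries)).squeeze(1)
key_set = queries[(conf - 0.5).abs() < 0.1]

print("extracted vs protected   :", agreement(extracted, protected, key_set))
print("independent vs protected :", agreement(independent, protected, key_set))

# Blurring attack: prune 30% of the surrogate's weights and verify again.
for m in extracted:
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.3)
print("pruned extracted vs prot.:", agreement(extracted, protected, key_set))
```
On this toy setup the surrogate, even after pruning, should agree with the protected model on the key-set noticeably more often than the independently seeded model does; RAW's actual key-set generation and verification model are more involved than this near-boundary heuristic.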
Related papers
- Training Data Attribution: Was Your Model Secretly Trained On Data Created By Mine? [17.714589429503675]
We propose an injection-free training data attribution method for text-to-image models.
Our approach involves developing algorithms to uncover distinct samples and using them as inherent watermarks.
Our experiments demonstrate that our method achieves an accuracy of over 80% in identifying the source of a suspicious model's training data.
arXiv Detail & Related papers (2024-09-24T06:23:43Z) - ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in the generated content.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z) - MEA-Defender: A Robust Watermark against Model Extraction Attack [19.421741149364017]
We propose MEA-Defender, a novel watermark to protect the IP of DNN models against model extraction.
We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained based on supervised learning and self-supervised learning algorithms.
The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks, and various watermark removal/detection approaches.
arXiv Detail & Related papers (2024-01-26T23:12:53Z) - Probabilistically Robust Watermarking of Neural Networks [4.332441337407564]
We introduce a novel trigger set-based watermarking approach that demonstrates resilience against functionality stealing attacks.
Our approach does not require additional model training and can be applied to any model architecture.
arXiv Detail & Related papers (2024-01-16T10:32:13Z) - Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep
Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC); a rough sketch of the general sample-correlation idea appears after this list.
SAC successfully defends against various model stealing attacks, even against ones that involve adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking [3.282282297279473]
The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
arXiv Detail & Related papers (2022-07-27T06:49:39Z) - Defending against Model Stealing via Verifying Embedded External
Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)