Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision
Models
- URL: http://arxiv.org/abs/2211.13644v1
- Date: Thu, 24 Nov 2022 14:48:40 GMT
- Title: Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision
Models
- Authors: Jacob Shams, Ben Nassi, Ikuya Morikawa, Toshiya Shimizu, Asaf Shabtai,
Yuval Elovici
- Abstract summary: We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks, and (2) extracted models which undergo a blurring method (e.g., weight pruning).
- Score: 44.80560808267494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, various watermarking methods have been suggested to detect
computer vision models obtained illegitimately from their owners; however, they
fail to demonstrate satisfactory robustness against model extraction attacks.
In this paper, we present an adaptive framework to watermark a protected model,
leveraging the unique behavior present in the model due to the unique random seed
used to initialize its training. This watermark is used to detect
extracted models, which have the same unique behavior, indicating an
unauthorized usage of the protected model's intellectual property (IP). First,
we show how an initial seed for random number generation during model training
produces distinct characteristics in the model's decision boundaries, which are
inherited by extracted models and present in their decision boundaries, but are
absent from non-extracted models trained on the same dataset with a different
seed. Based on our findings, we suggest the Robust
Adaptive Watermarking (RAW) Framework, which utilizes the unique behavior
present in the protected and extracted models to generate a watermark key-set
and verification model. We show that the framework is robust to (1) unseen
model extraction attacks, and (2) extracted models which undergo a blurring
method (e.g., weight pruning). We evaluate the framework's robustness against a
naive attacker (unaware that the model is watermarked), and an informed
attacker (who employs blurring strategies to remove watermarked behavior from
an extracted model), and achieve outstanding (i.e., >0.9) AUC values. Finally,
we show that the framework is robust to model extraction attacks that use a
different structure and/or architecture than the protected model.
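The seed-fingerprint intuition can be illustrated with a small, self-contained sketch (not the authors' code): two models trained on the same data with different seeds settle on slightly different decision boundaries, a surrogate distilled from the protected model's labels inherits the protected model's boundary, and an independently seeded model does not. The toy dataset, tiny MLP, hard-label extraction step, low-confidence key-set heuristic, and 30% pruning rate below are illustrative assumptions, not the RAW framework's actual key-set generation or verification model.
```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def make_data(n=2000, seed=123):
    """Two overlapping Gaussian blobs; the same data is reused for every model."""
    g = torch.Generator().manual_seed(seed)
    x0 = torch.randn(n // 2, 2, generator=g) + torch.tensor([2.0, 0.0])
    x1 = torch.randn(n // 2, 2, generator=g) + torch.tensor([-2.0, 0.0])
    return torch.cat([x0, x1]), torch.cat([torch.zeros(n // 2), torch.ones(n // 2)])

def train(X, y, seed, epochs=300):
    """The seed controls weight initialization, which shapes the decision boundary."""
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X).squeeze(1), y).backward()
        opt.step()
    return model

def agreement(a, b, pts):
    """Fraction of points on which two models predict the same label."""
    with torch.no_grad():
        la = torch.sigmoid(a(pts)).squeeze(1) > 0.5
        lb = torch.sigmoid(b(pts)).squeeze(1) > 0.5
    return (la == lb).float().mean().item()

X, y = make_data()
protected = train(X, y, seed=0)     # victim model
independent = train(X, y, seed=1)   # same data, different seed -> different boundary

# "Extract" the protected model: train a surrogate only on its hard labels.
with torch.no_grad():
    stolen_labels = (torch.sigmoid(protected(X)).squeeze(1) > 0.5).float()
extracted = train(X, stolen_labels, seed=2)

# Watermark key-set: random queries where the protected model is least confident,
# i.e. points close to its seed-specific decision boundary.
queries = torch.rand(20000, 2) * 12 - 6
with torch.no_grad():
    conf = torch.sigmoid(protected(queries)).squeeze(1)
key_set = queries[(conf - 0.5).abs() < 0.1]

print("extracted vs protected   :", agreement(extracted, protected, key_set))
print("independent vs protected :", agreement(independent, protected, key_set))

# Blurring attack: prune 30% of the surrogate's weights and verify again.
for m in extracted:
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.3)
print("pruned extracted vs prot.:", agreement(extracted, protected, key_set))
```
On this toy setup the surrogate, even after pruning, should agree with the protected model on the key-set noticeably more often than the independently seeded model does; RAW's actual key-set generation and verification model are more involved than this near-boundary heuristic.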
Related papers
- Training Data Attribution: Was Your Model Secretly Trained On Data Created By Mine? [17.714589429503675]
We propose an injection-free training data attribution method for text-to-image models.
Our approach involves developing algorithms to uncover distinct samples and using them as inherent watermarks.
Our experiments demonstrate that our method achieves an accuracy of over 80% in identifying the source of a suspicious model's training data.
arXiv Detail & Related papers (2024-09-24T06:23:43Z) - ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in the generated content.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z) - MEA-Defender: A Robust Watermark against Model Extraction Attack [19.421741149364017]
We propose MEA-Defender, a novel watermark to protect the IP of DNN models against model extraction.
We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained based on supervised learning and self-supervised learning algorithms.
The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks, and various watermark removal/detection approaches.
arXiv Detail & Related papers (2024-01-26T23:12:53Z) - Probabilistically Robust Watermarking of Neural Networks [4.332441337407564]
We introduce a novel trigger set-based watermarking approach that demonstrates resilience against functionality stealing attacks.
Our approach does not require additional model training and can be applied to any model architecture.
arXiv Detail & Related papers (2024-01-16T10:32:13Z) - Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep
Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC); a rough sketch of the general sample-correlation idea appears after this list.
SAC successfully defends against various model stealing attacks, even against ones that involve adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking [3.282282297279473]
The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
arXiv Detail & Related papers (2022-07-27T06:49:39Z) - Defending against Model Stealing via Verifying Embedded External
Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)