DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking
- URL: http://arxiv.org/abs/2207.13321v1
- Date: Wed, 27 Jul 2022 06:49:39 GMT
- Title: DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking
- Authors: Abhishek Chakraborty, Daniel Xing, Yuntao Liu, and Ankur Srivastava
- Abstract summary: The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
- Score: 3.282282297279473
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The functionality of a deep learning (DL) model can be stolen via model
extraction where an attacker obtains a surrogate model by utilizing the
responses from a prediction API of the original model. In this work, we propose
a novel watermarking technique called DynaMarks to protect the intellectual
property (IP) of DL models against such model extraction attacks in a black-box
setting. Unlike existing approaches, DynaMarks does not alter the training
process of the original model; instead, it embeds a watermark into a surrogate
model by dynamically changing the output responses from the original model's
prediction API, based on certain secret parameters, at inference runtime. The
experimental outcomes on the Fashion MNIST, CIFAR-10, and ImageNet datasets
demonstrate the efficacy of the DynaMarks scheme in watermarking surrogate
models while preserving the accuracies of the original models deployed on edge
devices. In addition, we perform experiments to evaluate the robustness of
DynaMarks against various watermark removal strategies, thus allowing a DL
model owner to reliably prove model ownership.
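To make the mechanism described above concrete, here is a minimal sketch (not the authors' actual algorithm) of a prediction API wrapper that perturbs the returned class probabilities with a keyed, per-query rule; the names watermarked_response, secret_key, and perturb_scale, as well as the specific perturbation, are illustrative assumptions.

```python
# Hypothetical sketch of a dynamic-watermarking prediction API wrapper.
# The perturbation rule, secret_key, and perturb_scale are assumptions,
# not the scheme from the paper.
import hashlib
import numpy as np

def watermarked_response(probs, query_id, secret_key, perturb_scale=0.05):
    """Return dynamically perturbed class probabilities for one query.

    probs: 1-D numpy array of softmax outputs from the original model.
    query_id: identifier for the incoming query (e.g., a hash of the input).
    secret_key: owner-held secret that seeds the perturbation.
    """
    # Derive a per-query seed from the secret key, so responses vary
    # deterministically for the owner but look dynamic to an attacker.
    seed = int.from_bytes(
        hashlib.sha256(f"{secret_key}:{query_id}".encode()).digest()[:4], "big"
    )
    rng = np.random.default_rng(seed)

    # Small keyed perturbation of the output distribution; for small
    # perturb_scale the top-1 prediction is usually unchanged, which
    # approximates the accuracy-preserving property the abstract describes.
    noise = rng.normal(0.0, perturb_scale, size=probs.shape)
    noise[np.argmax(probs)] = 0.0
    perturbed = np.clip(probs + noise, 1e-6, None)
    return perturbed / perturbed.sum()
```

A surrogate model trained on such responses would inherit the keyed statistical signature of the perturbations, which the owner could later test for; the paper's own secret parameters and verification procedure may differ from this sketch.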
Related papers
- WAPITI: A Watermark for Finetuned Open-Source LLMs [42.1087852764299]
WAPITI is a new method that transfers watermarking from base models to fine-tuned models through parameter integration.
We show that our method can successfully inject watermarks and is highly compatible with fine-tuned models.
arXiv Detail & Related papers (2024-10-09T01:41:14Z)
- ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in the generated content.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z)
- MEA-Defender: A Robust Watermark against Model Extraction Attack [19.421741149364017]
We propose a novel watermark to protect IP of DNN models against model extraction, named MEA-Defender.
We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained based on supervised learning and self-supervised learning algorithms.
The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks, and various watermark removal/detection approaches.
arXiv Detail & Related papers (2024-01-26T23:12:53Z)
- Probabilistically Robust Watermarking of Neural Networks [4.332441337407564]
We introduce a novel trigger set-based watermarking approach that demonstrates resilience against functionality stealing attacks.
Our approach does not require additional model training and can be applied to any model architecture; a generic sketch of this style of trigger-set verification appears after this list.
arXiv Detail & Related papers (2024-01-16T10:32:13Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, allowing the model owner to watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models [44.80560808267494]
We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks, and (2) extracted models which undergo a modification technique (e.g., weight pruning).
arXiv Detail & Related papers (2022-11-24T14:48:40Z)
- Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
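Several of the entries above, including the trigger set-based watermarking approach and the backdoor-based ownership verification work, rely on the same basic check: the owner queries a suspect model on a small secret set of inputs and tests whether it reproduces the pre-assigned labels far more often than an independent model would. The sketch below is a generic, hypothetical illustration of that verification step, not the procedure from any specific paper; the trigger set, threshold, and model interface are assumptions.

```python
# Hypothetical trigger-set ownership check; threshold and interface are assumptions.
from typing import Callable, List, Tuple

def ownership_score(predict: Callable[[object], int],
                    trigger_set: List[Tuple[object, int]]) -> float:
    """Fraction of secret trigger inputs on which the suspect model
    returns the owner-assigned target label."""
    hits = sum(1 for x, target in trigger_set if predict(x) == target)
    return hits / len(trigger_set)

def claim_ownership(predict, trigger_set, threshold=0.8) -> bool:
    # An independent model should match the arbitrary trigger labels only
    # rarely, so a high match rate supports an ownership claim; in practice
    # the threshold would come from a statistical test (e.g., a binomial bound).
    return ownership_score(predict, trigger_set) >= threshold
```

In the extraction setting, the same check is applied to a suspected surrogate model: if the watermark survives extraction, the surrogate's match rate on the secret trigger set stays well above that of an independently trained model.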