DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking
- URL: http://arxiv.org/abs/2207.13321v1
- Date: Wed, 27 Jul 2022 06:49:39 GMT
- Title: DynaMarks: Defending Against Deep Learning Model Extraction Using
Dynamic Watermarking
- Authors: Abhishek Chakraborty, Daniel Xing, Yuntao Liu, and Ankur Srivastava
- Abstract summary: The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
- Score: 3.282282297279473
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The functionality of a deep learning (DL) model can be stolen via model
extraction where an attacker obtains a surrogate model by utilizing the
responses from a prediction API of the original model. In this work, we propose
a novel watermarking technique called DynaMarks to protect the intellectual
property (IP) of DL models against such model extraction attacks in a black-box
setting. Unlike existing approaches, DynaMarks does not alter the training
process of the original model; instead, it embeds a watermark into a surrogate
model by dynamically changing the output responses from the original model's
prediction API, based on certain secret parameters, at inference runtime. The
experimental outcomes on the Fashion MNIST, CIFAR-10, and ImageNet datasets
demonstrate the efficacy of the DynaMarks scheme in watermarking surrogate
models while preserving the accuracies of the original models deployed on edge
devices. In addition, we perform experiments to evaluate the robustness of
DynaMarks against various watermark removal strategies, thus allowing a DL
model owner to reliably prove model ownership.
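To make the mechanism described above concrete, here is a minimal sketch (not the authors' actual algorithm) of a prediction API wrapper that perturbs the returned class probabilities with a keyed, per-query rule; the names watermarked_response, secret_key, and perturb_scale, as well as the specific perturbation, are illustrative assumptions.

```python
# Hypothetical sketch of a dynamic-watermarking prediction API wrapper.
# The perturbation rule, secret_key, and perturb_scale are assumptions,
# not the scheme from the paper.
import hashlib
import numpy as np

def watermarked_response(probs, query_id, secret_key, perturb_scale=0.05):
    """Return dynamically perturbed class probabilities for one query.

    probs: 1-D numpy array of softmax outputs from the original model.
    query_id: identifier for the incoming query (e.g., a hash of the input).
    secret_key: owner-held secret that seeds the perturbation.
    """
    # Derive a per-query seed from the secret key, so responses vary
    # deterministically for the owner but look dynamic to an attacker.
    seed = int.from_bytes(
        hashlib.sha256(f"{secret_key}:{query_id}".encode()).digest()[:4], "big"
    )
    rng = np.random.default_rng(seed)

    # Small keyed perturbation of the output distribution; for small
    # perturb_scale the top-1 prediction is usually unchanged, which
    # approximates the accuracy-preserving property the abstract describes.
    noise = rng.normal(0.0, perturb_scale, size=probs.shape)
    noise[np.argmax(probs)] = 0.0
    perturbed = np.clip(probs + noise, 1e-6, None)
    return perturbed / perturbed.sum()
```

A surrogate model trained on such responses would inherit the keyed statistical signature of the perturbations, which the owner could later test for; the paper's own secret parameters and verification procedure may differ from this sketch.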
Related papers
- WAPITI: A Watermark for Finetuned Open-Source LLMs [42.1087852764299]
WAPITI is a new method that transfers watermarking from base models to fine-tuned models through parameter integration.
We show that our method can successfully inject watermarks and is highly compatible with fine-tuned models.
arXiv Detail & Related papers (2024-10-09T01:41:14Z)
- ModelShield: Adaptive and Robust Watermark against Model Extraction Attack [58.46326901858431]
Large language models (LLMs) demonstrate general intelligence across a variety of machine learning tasks.
However, adversaries can still use model extraction attacks to steal the model intelligence encoded in the generated content.
Watermarking technology offers a promising solution for defending against such attacks by embedding unique identifiers into the model-generated content.
arXiv Detail & Related papers (2024-05-03T06:41:48Z)
- MEA-Defender: A Robust Watermark against Model Extraction Attack [19.421741149364017]
We propose a novel watermark to protect IP of DNN models against model extraction, named MEA-Defender.
We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained based on supervised learning and self-supervised learning algorithms.
The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks, and various watermark removal/detection approaches.
arXiv Detail & Related papers (2024-01-26T23:12:53Z)
- Probabilistically Robust Watermarking of Neural Networks [4.332441337407564]
We introduce a novel trigger set-based watermarking approach that demonstrates resilience against functionality stealing attacks.
Our approach does not require additional model training and can be applied to any model architecture; a generic sketch of this style of trigger-set verification appears after this list.
arXiv Detail & Related papers (2024-01-16T10:32:13Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular, allowing the model owner to watermark the model.
We propose a mini-max formulation to find these watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models [44.80560808267494]
We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks, and (2) extracted models which undergo a modification technique (e.g., weight pruning).
arXiv Detail & Related papers (2022-11-24T14:48:40Z)
- Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
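Several of the entries above, including the trigger set-based watermarking approach and the backdoor-based ownership verification work, rely on the same basic check: the owner queries a suspect model on a small secret set of inputs and tests whether it reproduces the pre-assigned labels far more often than an independent model would. The sketch below is a generic, hypothetical illustration of that verification step, not the procedure from any specific paper; the trigger set, threshold, and model interface are assumptions.

```python
# Hypothetical trigger-set ownership check; threshold and interface are assumptions.
from typing import Callable, List, Tuple

def ownership_score(predict: Callable[[object], int],
                    trigger_set: List[Tuple[object, int]]) -> float:
    """Fraction of secret trigger inputs on which the suspect model
    returns the owner-assigned target label."""
    hits = sum(1 for x, target in trigger_set if predict(x) == target)
    return hits / len(trigger_set)

def claim_ownership(predict, trigger_set, threshold=0.8) -> bool:
    # An independent model should match the arbitrary trigger labels only
    # rarely, so a high match rate supports an ownership claim; in practice
    # the threshold would come from a statistical test (e.g., a binomial bound).
    return ownership_score(predict, trigger_set) >= threshold
```

In the extraction setting, the same check is applied to a suspected surrogate model: if the watermark survives extraction, the surrogate's match rate on the secret trigger set stays well above that of an independently trained model.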