MEA-Defender: A Robust Watermark against Model Extraction Attack
- URL: http://arxiv.org/abs/2401.15239v1
- Date: Fri, 26 Jan 2024 23:12:53 GMT
- Title: MEA-Defender: A Robust Watermark against Model Extraction Attack
- Authors: Peizhuo Lv, Hualong Ma, Kai Chen, Jiachen Zhou, Shengzhi Zhang,
Ruigang Liang, Shenchen Zhu, Pan Li, and Yingjun Zhang
- Abstract summary: We propose MEA-Defender, a novel watermark that protects the IP of DNN models against model extraction.
We conduct extensive experiments on four model extraction attacks, using five datasets and six models trained with supervised and self-supervised learning algorithms.
The experimental results demonstrate that MEA-Defender is highly robust against different model extraction attacks and various watermark removal/detection approaches.
- Score: 19.421741149364017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, numerous highly valuable Deep Neural Networks (DNNs) have been
trained using deep learning algorithms. To protect the Intellectual Property
(IP) of the original owners of such DNN models, backdoor-based watermarks have
been extensively studied. However, most such watermarks fail under model
extraction attacks, which query the target model with input samples, obtain the
corresponding outputs, and then train a substitute model on these input-output
pairs. In this paper, we propose MEA-Defender, a novel watermark that protects
the IP of DNN models against model extraction. In particular, we obtain the
watermark by combining two samples from two source classes in the input domain,
and we design a watermark loss function that keeps the output domain of the
watermark within that of the main-task samples. Since both the input domain and
the output domain of our watermark are indispensable parts of those of the
main-task samples, the watermark is extracted into the stolen model along with
the main task during model extraction. We conduct extensive experiments on four
model extraction attacks, using five datasets and six models trained with
supervised and self-supervised learning algorithms. The experimental results
demonstrate that MEA-Defender is highly robust against different model
extraction attacks and various watermark removal/detection approaches.
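To make the construction described in the abstract concrete, the following is a minimal PyTorch-style sketch of embedding such a watermark during training. The half-and-half combination of two source-class samples, the watermark label wm_label, and the loss weight lam are illustrative assumptions, not the paper's exact construction or loss.

import torch
import torch.nn.functional as F

def make_watermark_batch(x_a, x_b):
    # Combine two source-class batches into watermark inputs: left half from
    # class A, right half from class B (the exact combination scheme here is
    # an illustrative assumption).
    width = x_a.size(-1)
    wm = x_a.clone()
    wm[..., width // 2:] = x_b[..., width // 2:]
    return wm

def training_step(model, optimizer, x, y, x_a, x_b, wm_label, lam=0.1):
    # One joint training step: main-task loss plus watermark loss, so the
    # watermark's input and output domains stay inside those of the main task.
    optimizer.zero_grad()
    loss_main = F.cross_entropy(model(x), y)

    wm_inputs = make_watermark_batch(x_a, x_b)
    wm_targets = torch.full((wm_inputs.size(0),), wm_label,
                            dtype=torch.long, device=wm_inputs.device)
    loss_wm = F.cross_entropy(model(wm_inputs), wm_targets)

    loss = loss_main + lam * loss_wm
    loss.backward()
    optimizer.step()
    return loss.item()

In this style of scheme, ownership verification would then amount to querying a suspect model with freshly combined watermark samples and checking whether they are predicted as wm_label significantly more often than chance.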
Related papers
- ClearMark: Intuitive and Robust Model Watermarking via Transposed Model
Training [50.77001916246691]
This paper introduces ClearMark, the first DNN watermarking method designed for intuitive human assessment.
ClearMark embeds visible watermarks, enabling human decision-making without rigid value thresholds.
It shows an 8,544-bit watermark capacity comparable to the strongest existing work.
arXiv Detail & Related papers (2023-10-25T08:16:55Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has become popular recently, allowing the model owner to watermark the model.
We propose a mini-max formulation to find watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
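As a rough illustration of the shape such a mini-max objective can take (the perturbation set, bound \epsilon, and weight \lambda are assumptions here, not the paper's exact formulation):

\min_{\theta}\; \mathbb{E}_{(x,y)}\big[\mathcal{L}_{\mathrm{task}}(f_{\theta}(x), y)\big] \;+\; \lambda \max_{\|\delta\| \le \epsilon}\; \mathbb{E}_{(x_w, y_w)}\big[\mathcal{L}_{\mathrm{wm}}(f_{\theta+\delta}(x_w), y_w)\big]

where the inner maximization searches the parameter neighborhood for watermark-removed counterparts and the outer minimization forces the watermark behavior to survive such perturbations.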
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Safe and Robust Watermark Injection with a Single OoD Image [90.71804273115585]
Training a high-performance deep neural network requires large amounts of data and computational resources.
We propose a safe and robust backdoor-based watermark injection technique.
We induce random perturbation of model parameters during watermark injection to defend against common watermark removal attacks.
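A minimal PyTorch sketch of that idea, assuming a standard fine-tuning loop; the Gaussian noise scale sigma and the point at which the noise is applied are illustrative assumptions, not the paper's procedure.

import torch
import torch.nn.functional as F

def watermark_injection_step(model, optimizer, x_wm, y_wm, sigma=0.01):
    # Temporarily perturb the weights with Gaussian noise, compute the
    # watermark loss at the perturbed point, then restore the weights before
    # updating, so the watermark is trained to survive small parameter changes.
    noise = []
    with torch.no_grad():
        for p in model.parameters():
            n = sigma * torch.randn_like(p)
            p.add_(n)
            noise.append(n)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_wm), y_wm)
    loss.backward()

    with torch.no_grad():
        for p, n in zip(model.parameters(), noise):
            p.sub_(n)

    optimizer.step()
    return loss.item()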
arXiv Detail & Related papers (2023-09-04T19:58:35Z)
- On Function-Coupled Watermarks for Deep Neural Networks [15.478746926391146]
We propose a novel DNN watermarking solution that can effectively defend against watermark removal attacks.
Our key insight is to enhance the coupling of the watermark and model functionalities.
Results show a 100% watermark authentication success rate under aggressive watermark removal attacks.
arXiv Detail & Related papers (2023-02-08T05:55:16Z)
- Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models [44.80560808267494]
We present an adaptive framework to watermark a protected model, leveraging the unique behavior present in the model.
This watermark is used to detect extracted models, which have the same unique behavior, indicating an unauthorized usage of the protected model's IP.
We show that the framework is robust to (1) unseen model extraction attacks and (2) extracted models that subsequently undergo a modification method (e.g., weight pruning).
arXiv Detail & Related papers (2022-11-24T14:48:40Z)
- DynaMarks: Defending Against Deep Learning Model Extraction Using Dynamic Watermarking [3.282282297279473]
The functionality of a deep learning (DL) model can be stolen via model extraction.
We propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models.
arXiv Detail & Related papers (2022-07-27T06:49:39Z)
- Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z)
- Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, an attacker with full knowledge of it can easily steal the model by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)