SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained
Encoders
- URL: http://arxiv.org/abs/2201.11692v1
- Date: Thu, 27 Jan 2022 17:41:54 GMT
- Title: SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained
Encoders
- Authors: Tianshuo Cong and Xinlei He and Yang Zhang
- Abstract summary: We propose SSLGuard, the first watermarking algorithm for pre-trained encoders.
SSLGuard is effective in watermark injection and verification, and is robust against model stealing and other watermark removal attacks.
- Score: 9.070481370120905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning is an emerging machine learning (ML) paradigm.
Compared to supervised learning that leverages high-quality labeled datasets to
achieve good performance, self-supervised learning relies on unlabeled datasets
to pre-train powerful encoders which can then be treated as feature extractors
for various downstream tasks. The large amounts of data and computation
consumed during pre-training make the encoders themselves valuable intellectual
property of the model owner. Recent research has shown that an ML model's
copyright is threatened by model stealing attacks, which aim to train a
surrogate model that mimics the behavior of a given model. We empirically show
that pre-trained encoders are highly vulnerable to model stealing attacks.
However, most current copyright protection efforts, such as fingerprinting and
watermarking, concentrate on classifiers. Meanwhile, the intrinsic challenges
of protecting the copyright of pre-trained encoders remain largely unstudied.
We fill this gap by proposing SSLGuard, the first watermarking algorithm for
pre-trained encoders. Given a clean pre-trained
encoder, SSLGuard embeds a watermark into it and outputs a watermarked version.
The shadow training technique is also applied to preserve the watermark under
potential model stealing attacks. Our extensive evaluation shows that SSLGuard
is effective in watermark injection and verification, and is robust against
model stealing and other watermark removal attacks such as pruning and
finetuning.
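To make the injection/verification workflow more concrete, below is a minimal,
hypothetical sketch of how an encoder watermark could be checked: the defender
keeps a private verification dataset, a secret key vector, and a small decoder
that maps the suspect encoder's embeddings of the verification samples into the
key space; ownership is claimed when the average cosine similarity to the
secret key exceeds a threshold. The toy encoder and decoder, the verification
set, and the 0.5 threshold are all illustrative assumptions, not SSLGuard's
actual components or hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, KEY_DIM = 128, 32

# Toy stand-ins for the real components. In SSLGuard-style schemes the decoder
# and the secret key are produced during watermark injection; the actual
# training objectives are described in the paper and not reproduced here.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, EMB_DIM))  # suspect encoder
decoder = nn.Linear(EMB_DIM, KEY_DIM)                                   # defender's private decoder
secret_key = F.normalize(torch.randn(KEY_DIM), dim=0)                   # defender's secret key vector
verification_set = torch.rand(64, 3, 32, 32)                            # defender's private verification samples

def verify_watermark(encoder, decoder, samples, key, threshold=0.5):
    """Claim ownership if decoded embeddings of the verification samples
    align with the secret key (average cosine similarity above threshold)."""
    with torch.no_grad():
        emb = encoder(samples)                      # representations from the suspect encoder
        decoded = F.normalize(decoder(emb), dim=1)  # map embeddings into the key space
        sim = (decoded @ key).mean().item()         # average cosine similarity to the secret key
    return sim, sim > threshold

sim, owned = verify_watermark(encoder, decoder, verification_set, secret_key)
print(f"similarity={sim:.3f}, watermark present: {owned}")
```

Because this check needs only black-box access to the suspect encoder's
outputs, the same procedure can be run against a surrogate encoder obtained via
model stealing, which is why the paper pairs watermark injection with shadow
training so the watermark survives extraction.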
Related papers
- Transferable Watermarking to Self-supervised Pre-trained Graph Encoders by Trigger Embeddings [43.067822791795095]
Graph Self-supervised Learning (GSSL) enables the pre-training of foundation graph encoders.
The easy-to-plug-in nature of such encoders makes them vulnerable to copyright infringement.
We develop a novel watermarking framework to protect graph encoders in GSSL settings.
arXiv Detail & Related papers (2024-06-19T03:16:11Z)
- Towards Robust Model Watermark via Reducing Parametric Vulnerability [57.66709830576457]
Backdoor-based ownership verification has recently become popular; it allows the model owner to watermark the model.
We propose a mini-max formulation to find watermark-removed models and recover their watermark behavior.
Our method improves the robustness of the model watermarking against parametric changes and numerous watermark-removal attacks.
arXiv Detail & Related papers (2023-09-09T12:46:08Z)
- Downstream-agnostic Adversarial Examples [66.8606539786026]
AdvEncoder is the first framework for generating downstream-agnostic universal adversarial examples based on a pre-trained encoder.
Unlike in traditional adversarial example settings, the pre-trained encoder outputs only feature vectors rather than classification labels.
Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset.
arXiv Detail & Related papers (2023-07-23T10:16:47Z)
- Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking [54.40184736491652]
We propose a backdoor-based watermarking approach that serves as a general framework for safeguarding publicly available data.
By inserting a small number of watermarking samples into the dataset, our approach enables the learning model to implicitly learn a secret function set by defenders.
This hidden function can then be used as a watermark to track down third-party models that use the dataset illegally.
arXiv Detail & Related papers (2023-03-20T21:54:30Z)
- Dataset Inference for Self-Supervised Models [21.119579812529395]
Self-supervised models are increasingly prevalent in machine learning (ML).
They are vulnerable to model stealing attacks due to the high dimensionality of the vector representations they output.
We introduce a new dataset inference defense, which uses the private training set of the victim encoder model to attribute its ownership in the event of stealing.
arXiv Detail & Related papers (2022-09-16T15:39:06Z)
- Watermarking Pre-trained Encoders in Contrastive Learning [9.23485246108653]
The pre-trained encoders are an important intellectual property that needs to be carefully protected.
It is challenging to migrate existing watermarking techniques from the classification tasks to the contrastive learning scenario.
We introduce a task-agnostic loss function to effectively embed a backdoor into the encoder as the watermark.
arXiv Detail & Related papers (2022-01-20T15:14:31Z)
- Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders [23.2869445054295]
Self-supervised representation learning techniques encode images into rich features that are oblivious to downstream tasks.
The dedicated model designs and massive resources required to train them expose image encoders to the risk of model stealing attacks.
We propose Cont-Steal, a contrastive-learning-based attack, and validate its improved stealing effectiveness in various experimental settings (a toy stealing sketch follows this list).
arXiv Detail & Related papers (2022-01-19T10:27:28Z)
- Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, if the attacker has full knowledge of it, the model can easily be stolen by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z)
- Don't Forget to Sign the Gradients! [60.98885980669777]
We present GradSigns, a novel watermarking framework for deep neural networks (DNNs).
arXiv Detail & Related papers (2021-03-05T14:24:32Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
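As a companion to the verification sketch above, here is a minimal,
hypothetical illustration of the model stealing threat that both the SSLGuard
abstract and the Cont-Steal entry describe: the attacker queries the victim
encoder on unlabeled surrogate data and trains a copy to reproduce its
embeddings. The sketch uses plain embedding regression (mean squared error)
rather than Cont-Steal's contrastive objective, and all names, data shapes, and
hyperparameters are illustrative assumptions rather than any paper's actual
setup.

```python
import torch
import torch.nn as nn

EMB_DIM = 128

# Victim encoder: treated as a black box the attacker can only query.
victim = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, EMB_DIM))
victim.eval()

# Attacker's surrogate encoder and unlabeled query data (stand-ins).
surrogate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, EMB_DIM))
query_data = torch.rand(256, 3, 32, 32)

optimizer = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    idx = torch.randint(0, len(query_data), (32,))
    batch = query_data[idx]
    with torch.no_grad():
        target = victim(batch)   # embeddings returned by the victim's API
    pred = surrogate(batch)      # surrogate tries to match them
    loss = loss_fn(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final imitation loss: {loss.item():.4f}")
```

A watermark that survives this kind of extraction, for example through the
shadow training used by SSLGuard, would still be detectable when the
verification check is run against the resulting surrogate encoder.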