On the Robustness of Dataset Inference
- URL: http://arxiv.org/abs/2210.13631v3
- Date: Mon, 19 Jun 2023 12:35:25 GMT
- Title: On the Robustness of Dataset Inference
- Authors: Sebastian Szyller, Rui Zhang, Jian Liu, N. Asokan
- Abstract summary: Machine learning (ML) models are costly to train as they can require a significant amount of data, computational resources and technical expertise.
Ownership verification techniques allow the victims of model stealing attacks to demonstrate that a suspect model was in fact stolen from theirs.
A fingerprinting technique, dataset inference (DI), has been shown to offer better robustness and efficiency than prior methods.
- Score: 21.321310557323383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) models are costly to train as they can require a
significant amount of data, computational resources and technical expertise.
Thus, they constitute valuable intellectual property that needs protection from
adversaries wanting to steal them. Ownership verification techniques allow the
victims of model stealing attacks to demonstrate that a suspect model was in
fact stolen from theirs.
Although a number of ownership verification techniques based on watermarking
or fingerprinting have been proposed, most of them fall short either in terms
of security guarantees (well-equipped adversaries can evade verification) or
computational cost. A fingerprinting technique, Dataset Inference (DI), has
been shown to offer better robustness and efficiency than prior methods.
The authors of DI provided a correctness proof for linear (suspect) models.
However, in a subspace of the same setting, we prove that DI suffers from high
false positives (FPs): it can incorrectly flag as stolen an independent model
trained on non-overlapping data drawn from the same distribution. We further
prove that DI also triggers FPs in realistic, non-linear suspect models. We
then confirm empirically that black-box DI produces FPs with high confidence
(a minimal sketch of this failure mode follows the abstract).
Second, we show that DI also suffers from false negatives (FNs): an adversary
can fool DI, at the cost of some accuracy loss, by regularising a stolen
model's decision boundaries using adversarial training (also sketched after
the abstract). To this end, we demonstrate that black-box DI fails to identify
a model adversarially trained from a stolen dataset, the setting in which DI
is hardest to evade.
Finally, we discuss the implications of our findings, the viability of
fingerprinting-based ownership verification in general, and suggest directions
for future work.
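To make the false-positive argument concrete, here is a minimal, hypothetical sketch of a DI-style black-box decision rule: each queried sample is reduced to a scalar margin proxy, and a one-sided hypothesis test compares the suspect model's margins on the victim's private training data against public data from the same distribution. The margin construction, sample sizes, and significance threshold are illustrative assumptions rather than the authors' implementation (which builds distance embeddings, e.g. via Blind Walk in the label-only setting, and a learned regressor).

```python
# Minimal sketch (assumptions, not the DI authors' code) of a black-box
# DI-style ownership test: a one-sided Welch t-test on per-sample margin
# proxies. "Margin" here stands in for DI's distance-to-boundary embeddings.
import numpy as np
from scipy.stats import ttest_ind

def di_style_test(margins_private, margins_public, alpha=0.01):
    """Reject the null (suspect is independent) if margins on the victim's
    private training data are significantly larger than on public data."""
    _, p_value = ttest_ind(
        margins_private, margins_public, equal_var=False, alternative="greater"
    )
    return p_value, p_value < alpha

# Simulated *independent* suspect model, trained on non-overlapping data from
# the same distribution: its margins on the two splits differ only slightly.
rng = np.random.default_rng(0)
n = 10_000
margins_private = rng.normal(loc=0.52, scale=0.20, size=n)
margins_public = rng.normal(loc=0.50, scale=0.20, size=n)

p, flagged = di_style_test(margins_private, margins_public)
print(f"p-value = {p:.3g}, flagged as stolen: {flagged}")
# With enough samples, even this tiny, benign margin gap rejects the null,
# i.e. the independent model is falsely flagged as stolen.
```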
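The false-negative evasion can likewise be sketched as ordinary PGD adversarial fine-tuning applied by the adversary to the stolen model: more robust, smoother decision boundaries shrink the private-versus-public margin gap that DI relies on. The model, data loader, and hyperparameters below are assumed placeholders, not the configuration evaluated in the paper; inputs are assumed to be images scaled to [0, 1].

```python
# Hypothetical sketch of the evasion: PGD adversarial fine-tuning of a stolen
# model (PyTorch). All hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, step=2 / 255, iters=7):
    """Craft an L-inf bounded adversarial example for each input in x."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(iters):
        loss = F.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        delta = (delta + step * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def adversarial_finetune(model, loader, epochs=10, lr=1e-3, device="cpu"):
    """Fine-tune on adversarial examples to regularise decision boundaries."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(model, x, y)
            opt.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            opt.step()
    return model
```

Usage would be along the lines of `adversarial_finetune(stolen_model, stolen_train_loader, device="cuda")`, where both names are hypothetical; as noted above, the paper observes that this evasion costs the adversary some clean accuracy.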
Related papers
- Data Taggants: Dataset Ownership Verification via Harmless Targeted Data Poisoning [12.80649024603656]
This paper introduces data taggants, a novel non-backdoor dataset ownership verification technique.
We validate our approach through comprehensive and realistic experiments on ImageNet1k using ViT and ResNet models with state-of-the-art training recipes.
arXiv Detail & Related papers (2024-10-09T12:49:23Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
The proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Dataset Inference for Self-Supervised Models [21.119579812529395]
Self-supervised models are increasingly prevalent in machine learning (ML).
They are vulnerable to model stealing attacks due to the high dimensionality of vector representations they output.
We introduce a new dataset inference defense, which uses the private training set of the victim encoder model to attribute its ownership in the event of stealing.
arXiv Detail & Related papers (2022-09-16T15:39:06Z)
- Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z)
- Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model.
We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them.
Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification (a minimal sketch of this step follows the list).
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)
- Defending against Model Stealing via Verifying Embedded External Features [90.29429679125508]
Adversaries can 'steal' deployed models even when they have no training samples and cannot access the model parameters or structure.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z)
- Dataset Inference: Ownership Resolution in Machine Learning [18.248121977353506]
The knowledge contained in a stolen model's training set is what is common to all stolen copies.
We introduce dataset inference, the process of identifying whether a suspected model copy has private knowledge from the original model's dataset.
Experiments on CIFAR10, SVHN, CIFAR100 and ImageNet show that model owners can claim with confidence greater than 99% that their model (or dataset as a matter of fact) was stolen.
arXiv Detail & Related papers (2021-04-21T18:12:18Z)
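As referenced in the backdoor-watermarking entry above, here is a minimal, hypothetical sketch of a hypothesis-test-guided verification step: the verifier queries the suspect model on clean inputs and their trigger-stamped counterparts and runs a paired one-sided t-test on the posterior assigned to the watermark target class. The probabilities are simulated and the function names are illustrative; the cited paper's procedure differs in detail.

```python
# Hypothetical sketch of hypothesis-test-guided backdoor-watermark
# verification; simulated posteriors stand in for real model queries.
import numpy as np
from scipy.stats import ttest_rel

def verify_dataset_use(p_target_triggered, p_target_clean, alpha=0.01):
    """Paired one-sided t-test: does stamping the trigger raise the suspect
    model's posterior on the watermark target class?"""
    _, p_value = ttest_rel(p_target_triggered, p_target_clean, alternative="greater")
    return p_value, p_value < alpha

# Simulated suspect trained on the watermarked dataset: the trigger reliably
# boosts the target-class probability relative to the clean counterpart.
rng = np.random.default_rng(1)
p_clean = rng.uniform(0.0, 0.2, size=200)
p_triggered = np.clip(p_clean + rng.uniform(0.5, 0.8, size=200), 0.0, 1.0)

p, used = verify_dataset_use(p_triggered, p_clean)
print(f"p-value = {p:.3g}, dataset use verified: {used}")
```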