Fingerprinting Deep Neural Networks Globally via Universal Adversarial
Perturbations
- URL: http://arxiv.org/abs/2202.08602v1
- Date: Thu, 17 Feb 2022 11:29:50 GMT
- Title: Fingerprinting Deep Neural Networks Globally via Universal Adversarial
Perturbations
- Authors: Zirui Peng and Shaofeng Li and Guoxing Chen and Cheng Zhang and Haojin
Zhu and Minhui Xue
- Abstract summary: We propose a novel and practical mechanism which enables the service provider to verify whether a suspect model is stolen from the victim model.
Our framework can detect model IP breaches with confidence $> 99.99\%$ within only $20$ fingerprints of the suspect model.
- Score: 22.89321897726347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel and practical mechanism which enables the
service provider to verify whether a suspect model is stolen from the victim
model via model extraction attacks. Our key insight is that the profile of a
DNN model's decision boundary can be uniquely characterized by its
\textit{Universal Adversarial Perturbations (UAPs)}. UAPs lie in a
low-dimensional subspace, and the subspaces of piracy models are more
consistent with the victim model's subspace than those of non-piracy models.
Based on this, we propose a UAP fingerprinting method for DNN models and train
an encoder via \textit{contrastive learning} that takes fingerprints as input and outputs a
similarity score. Extensive studies show that our framework can detect model IP
breaches with confidence $> 99.99 \%$ within only $20$ fingerprints of the
suspect model. It has good generalizability across different model
architectures and is robust against post-modifications on stolen models.
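For intuition, the following is a minimal sketch of the idea in PyTorch: it crafts crude UAPs for the victim and suspect models and compares the subspaces they span via principal angles. It is not the authors' implementation; in particular, the learned contrastive encoder is replaced by a plain subspace similarity, and the data loaders, `eps`, and step counts are illustrative assumptions.

```python
# Hedged sketch of UAP-based fingerprint comparison; hyperparameters and the
# subspace-similarity scoring are assumptions, not the paper's exact method.
import torch
import torch.nn.functional as F

def compute_uap(model, loader, eps=0.04, steps=5, device="cpu"):
    """Crude universal perturbation: accumulate signed gradients over batches."""
    model.eval()
    delta = None
    for _ in range(steps):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            if delta is None:
                delta = torch.zeros_like(x[:1], requires_grad=True)
            loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta += (eps / steps) * grad.sign()  # push toward misclassification
                delta.clamp_(-eps, eps)
    return delta.detach().flatten()

def subspace_similarity(uaps_a, uaps_b):
    """Mean cosine of the principal angles between the two UAP subspaces."""
    Qa, _ = torch.linalg.qr(torch.stack(uaps_a, dim=1))
    Qb, _ = torch.linalg.qr(torch.stack(uaps_b, dim=1))
    return torch.linalg.svdvals(Qa.T @ Qb).mean().item()  # cosines of principal angles

# Usage (victim/suspect are classifiers; each loader yields (images, labels)):
# fps_v = [compute_uap(victim, subset) for subset in victim_subsets]
# fps_s = [compute_uap(suspect, subset) for subset in suspect_subsets]
# print(subspace_similarity(fps_v, fps_s))  # higher => suspect more likely a piracy model
```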
Related papers
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can be an indicator of the presence of a backdoor even when the paired models have different architectures.
This technique allows for the detection of backdoors in models designed for open-set classification tasks, a setting that has received little attention in the literature.
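The summary above does not spell out how the pairing score is computed; purely as a hedged illustration, one could fit a linear translation between the two models' embeddings on benign samples and use the residual as such a score. Everything below, including the use of a least-squares map, is an assumption rather than the paper's method.

```python
import torch

def translation_score(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """emb_a: (N, d_a) and emb_b: (N, d_b) embeddings of the same N benign inputs."""
    W = torch.linalg.lstsq(emb_a, emb_b).solution            # least-squares linear translation
    residual = torch.linalg.norm(emb_a @ W - emb_b) / torch.linalg.norm(emb_b)
    return residual.item()                                   # larger residual => worse translation fit

# Usage: score = translation_score(embeddings_from_model_a, embeddings_from_model_b)
```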
arXiv Detail & Related papers (2024-02-28T21:29:16Z) - Who Leaked the Model? Tracking IP Infringers in Accountable Federated Learning [51.26221422507554]
Federated learning (FL) is an effective collaborative learning framework to coordinate data and computation resources from massive and distributed clients in training.
Such collaboration results in non-trivial intellectual property (IP), represented by the model parameters, that should be protected and shared by all participants rather than by an individual user.
To block such IP leakage, it is essential to make the IP identifiable in the shared model and locate the anonymous infringer who first leaks it.
We propose Decodable Unique Watermarking (DUW) for complying with the requirements of accountable FL.
arXiv Detail & Related papers (2023-12-06T00:47:55Z) - NaturalFinger: Generating Natural Fingerprint with Generative
Adversarial Networks [4.536351805614037]
We propose NaturalFinger, which generates natural fingerprints with generative adversarial networks (GANs).
Our approach achieves an ARUC value of 0.91 on the FingerBench dataset (154 models), exceeding the optimal baseline (MetaV) by over 17%.
arXiv Detail & Related papers (2023-05-29T03:17:03Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep
Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even those involving adversarial training or transfer learning.
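As a rough illustration of the sample-correlation idea (not the authors' exact SAC procedure), one can compare the pairwise correlation structure of the two models' outputs on a fixed probe set; the probe choice and decision threshold below are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def correlation_matrix(model, probes):
    """probes: (N, ...) inputs; returns the (N, N) cosine-similarity matrix of outputs."""
    out = F.normalize(F.softmax(model(probes), dim=1), dim=1)
    return out @ out.T

@torch.no_grad()
def sac_distance(victim, suspect, probes):
    diff = correlation_matrix(victim, probes) - correlation_matrix(suspect, probes)
    return diff.abs().mean().item()  # small distance suggests the suspect was derived from the victim

# Usage: flag the suspect if sac_distance(victim_model, suspect_model, probe_batch) falls below a threshold.
```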
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - InFIP: An Explainable DNN Intellectual Property Protection Method based
on Intrinsic Features [12.037142903022891]
We propose an interpretable intellectual property protection method for Deep Neural Networks (DNNs) based on explainable artificial intelligence.
The proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable.
Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model.
arXiv Detail & Related papers (2022-10-14T03:12:36Z) - FBI: Fingerprinting models with Benign Inputs [17.323638042215013]
This paper tackles the challenge of proposing i) fingerprinting schemes that are resilient to significant modifications of the models, by generalizing to the notion of model families and their variants.
We achieve both goals by demonstrating that benign inputs, i.e., unmodified images, are sufficient material for both tasks.
Both approaches are experimentally validated over an unprecedented set of more than 1,000 networks.
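A hedged sketch of the benign-input intuition: two variants of the same model family tend to agree on unmodified images far more often than independently trained models. The decision rule and threshold here are assumptions, not the paper's statistics.

```python
import torch

@torch.no_grad()
def agreement_rate(model_a, model_b, loader, device="cpu"):
    """Fraction of benign inputs on which the two models give the same top-1 prediction."""
    agree, total = 0, 0
    for x, _ in loader:
        x = x.to(device)
        agree += (model_a(x).argmax(dim=1) == model_b(x).argmax(dim=1)).sum().item()
        total += x.size(0)
    return agree / total

# Usage: flag the suspect as a variant of the victim if agreement_rate(...) exceeds, say, 0.9.
```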
arXiv Detail & Related papers (2022-08-05T13:55:36Z) - MOVE: Effective and Harmless Ownership Verification via Embedded
External Features [109.19238806106426]
We propose an effective and harmless model ownership verification method (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z) - Defending against Model Stealing via Verifying Embedded External
Features [90.29429679125508]
Adversaries can `steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified \textit{external} features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering-based approach whose computational complexity does not scale with the number of labels and which relies on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
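Below is a generic, hedged sketch of trigger reverse-engineering, not this paper's specific algorithm or score: a single small patch is optimized to flip a model's clean predictions irrespective of any target label, and how easily such a patch is found serves as a rough Trojan indicator. Patch placement, size, and the flip objective are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, loader, patch=8, steps=20, lr=0.1, device="cpu"):
    """Optimize one label-agnostic patch that pushes predictions away from clean labels."""
    model.eval()
    x0, _ = next(iter(loader))
    trigger = torch.rand(1, x0.size(1), patch, patch, device=device, requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        for x, _ in loader:
            x = x.to(device)
            with torch.no_grad():
                clean_pred = model(x).argmax(dim=1)
            x_trig = x.clone()
            x_trig[:, :, -patch:, -patch:] = trigger.clamp(0, 1)  # stamp the bottom-right corner
            loss = -F.cross_entropy(model(x_trig), clean_pred)    # reward flipping clean predictions
            opt.zero_grad()
            loss.backward()
            opt.step()
    return trigger.detach()

# A model whose predictions a tiny optimized patch flips on most inputs is suspicious;
# comparing flip rates across a pool of models helps separate Trojaned from pure models.
```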
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.