Fingerprinting Deep Neural Networks Globally via Universal Adversarial
Perturbations
- URL: http://arxiv.org/abs/2202.08602v1
- Date: Thu, 17 Feb 2022 11:29:50 GMT
- Title: Fingerprinting Deep Neural Networks Globally via Universal Adversarial
Perturbations
- Authors: Zirui Peng and Shaofeng Li and Guoxing Chen and Cheng Zhang and Haojin
Zhu and Minhui Xue
- Abstract summary: We propose a novel and practical mechanism which enables the service provider to verify whether a suspect model is stolen from the victim model.
Our framework can detect model IP breaches with confidence $> 99.99\%$ within only $20$ fingerprints of the suspect model.
- Score: 22.89321897726347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel and practical mechanism which enables the
service provider to verify whether a suspect model is stolen from the victim
model via model extraction attacks. Our key insight is that the profile of a
DNN model's decision boundary can be uniquely characterized by its
\textit{Universal Adversarial Perturbations (UAPs)}. UAPs lie in a
low-dimensional subspace, and the subspaces of piracy models are more
consistent with the victim model's subspace than those of non-piracy models.
Based on this, we propose a UAP fingerprinting method for DNN models and train
an encoder via \textit{contrastive learning} that takes fingerprints as input and outputs a
similarity score. Extensive studies show that our framework can detect model IP
breaches with confidence $> 99.99 \%$ within only $20$ fingerprints of the
suspect model. It has good generalizability across different model
architectures and is robust against post-modifications on stolen models.
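For intuition, the following is a minimal sketch of the idea in PyTorch: it crafts crude UAPs for the victim and suspect models and compares the subspaces they span via principal angles. It is not the authors' implementation; in particular, the learned contrastive encoder is replaced by a plain subspace similarity, and the data loaders, `eps`, and step counts are illustrative assumptions.

```python
# Hedged sketch of UAP-based fingerprint comparison; hyperparameters and the
# subspace-similarity scoring are assumptions, not the paper's exact method.
import torch
import torch.nn.functional as F

def compute_uap(model, loader, eps=0.04, steps=5, device="cpu"):
    """Crude universal perturbation: accumulate signed gradients over batches."""
    model.eval()
    delta = None
    for _ in range(steps):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            if delta is None:
                delta = torch.zeros_like(x[:1], requires_grad=True)
            loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta += (eps / steps) * grad.sign()  # push toward misclassification
                delta.clamp_(-eps, eps)
    return delta.detach().flatten()

def subspace_similarity(uaps_a, uaps_b):
    """Mean cosine of the principal angles between the two UAP subspaces."""
    Qa, _ = torch.linalg.qr(torch.stack(uaps_a, dim=1))
    Qb, _ = torch.linalg.qr(torch.stack(uaps_b, dim=1))
    return torch.linalg.svdvals(Qa.T @ Qb).mean().item()  # cosines of principal angles

# Usage (victim/suspect are classifiers; each loader yields (images, labels)):
# fps_v = [compute_uap(victim, subset) for subset in victim_subsets]
# fps_s = [compute_uap(suspect, subset) for subset in suspect_subsets]
# print(subspace_similarity(fps_v, fps_s))  # higher => suspect more likely a piracy model
```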
Related papers
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can be an indicator of the presence of a backdoor even when the paired models have different architectures.
This technique allows for the detection of backdoors in models designed for open-set classification tasks, a setting that has received little attention in the literature.
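The summary above does not spell out how the pairing score is computed; purely as a hedged illustration, one could fit a linear translation between the two models' embeddings on benign samples and use the residual as such a score. Everything below, including the use of a least-squares map, is an assumption rather than the paper's method.

```python
import torch

def translation_score(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """emb_a: (N, d_a) and emb_b: (N, d_b) embeddings of the same N benign inputs."""
    W = torch.linalg.lstsq(emb_a, emb_b).solution            # least-squares linear translation
    residual = torch.linalg.norm(emb_a @ W - emb_b) / torch.linalg.norm(emb_b)
    return residual.item()                                   # larger residual => worse translation fit

# Usage: score = translation_score(embeddings_from_model_a, embeddings_from_model_b)
```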
arXiv Detail & Related papers (2024-02-28T21:29:16Z) - Who Leaked the Model? Tracking IP Infringers in Accountable Federated Learning [51.26221422507554]
Federated learning (FL) is an effective collaborative learning framework to coordinate data and computation resources from massive and distributed clients in training.
Such collaboration results in non-trivial intellectual property (IP), represented by the model parameters, that should be protected and shared by all participants rather than by an individual user.
To block such IP leakage, it is essential to make the IP identifiable in the shared model and locate the anonymous infringer who first leaks it.
We propose Decodable Unique Watermarking (DUW) for complying with the requirements of accountable FL.
arXiv Detail & Related papers (2023-12-06T00:47:55Z) - NaturalFinger: Generating Natural Fingerprint with Generative
Adversarial Networks [4.536351805614037]
We propose NaturalFinger, which generates natural fingerprints with generative adversarial networks (GANs).
Our approach achieves an ARUC value of 0.91 on the FingerBench dataset (154 models), exceeding the optimal baseline (MetaV) by over 17%.
arXiv Detail & Related papers (2023-05-29T03:17:03Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep
Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even those involving adversarial training or transfer learning.
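As a rough illustration of the sample-correlation idea (not the authors' exact SAC procedure), one can compare the pairwise correlation structure of the two models' outputs on a fixed probe set; the probe choice and decision threshold below are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def correlation_matrix(model, probes):
    """probes: (N, ...) inputs; returns the (N, N) cosine-similarity matrix of outputs."""
    out = F.normalize(F.softmax(model(probes), dim=1), dim=1)
    return out @ out.T

@torch.no_grad()
def sac_distance(victim, suspect, probes):
    diff = correlation_matrix(victim, probes) - correlation_matrix(suspect, probes)
    return diff.abs().mean().item()  # small distance suggests the suspect was derived from the victim

# Usage: flag the suspect if sac_distance(victim_model, suspect_model, probe_batch) falls below a threshold.
```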
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - InFIP: An Explainable DNN Intellectual Property Protection Method based
on Intrinsic Features [12.037142903022891]
We propose an interpretable intellectual property protection method for Deep Neural Networks (DNNs) based on explainable artificial intelligence.
The proposed method does not modify the DNN model, and the decision of the ownership verification is interpretable.
Experimental results demonstrate that the fingerprints can be successfully used to verify the ownership of the model.
arXiv Detail & Related papers (2022-10-14T03:12:36Z) - FBI: Fingerprinting models with Benign Inputs [17.323638042215013]
This paper tackles the challenge of proposing i) fingerprinting schemes that are resilient to significant modifications of the models, by generalizing to the notion of model families and their variants.
We achieve both goals by demonstrating that benign inputs, i.e., unmodified images, are sufficient material for both tasks.
Both approaches are experimentally validated over an unprecedented set of more than 1,000 networks.
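A hedged sketch of the benign-input intuition: two variants of the same model family tend to agree on unmodified images far more often than independently trained models. The decision rule and threshold here are assumptions, not the paper's statistics.

```python
import torch

@torch.no_grad()
def agreement_rate(model_a, model_b, loader, device="cpu"):
    """Fraction of benign inputs on which the two models give the same top-1 prediction."""
    agree, total = 0, 0
    for x, _ in loader:
        x = x.to(device)
        agree += (model_a(x).argmax(dim=1) == model_b(x).argmax(dim=1)).sum().item()
        total += x.size(0)
    return agree / total

# Usage: flag the suspect as a variant of the victim if agreement_rate(...) exceeds, say, 0.9.
```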
arXiv Detail & Related papers (2022-08-05T13:55:36Z) - MOVE: Effective and Harmless Ownership Verification via Embedded
External Features [109.19238806106426]
We propose an effective and harmless model ownership verification method (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z) - Defending against Model Stealing via Verifying Embedded External
Features [90.29429679125508]
Adversaries can `steal' deployed models even when they have no training samples and cannot access the model parameters or structures.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified \textit{external} features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering-based approach whose computational complexity does not scale with the number of labels and which relies on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
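Below is a generic, hedged sketch of trigger reverse-engineering, not this paper's specific algorithm or score: a single small patch is optimized to flip a model's clean predictions irrespective of any target label, and how easily such a patch is found serves as a rough Trojan indicator. Patch placement, size, and the flip objective are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, loader, patch=8, steps=20, lr=0.1, device="cpu"):
    """Optimize one label-agnostic patch that pushes predictions away from clean labels."""
    model.eval()
    x0, _ = next(iter(loader))
    trigger = torch.rand(1, x0.size(1), patch, patch, device=device, requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    for _ in range(steps):
        for x, _ in loader:
            x = x.to(device)
            with torch.no_grad():
                clean_pred = model(x).argmax(dim=1)
            x_trig = x.clone()
            x_trig[:, :, -patch:, -patch:] = trigger.clamp(0, 1)  # stamp the bottom-right corner
            loss = -F.cross_entropy(model(x_trig), clean_pred)    # reward flipping clean predictions
            opt.zero_grad()
            loss.backward()
            opt.step()
    return trigger.detach()

# A model whose predictions a tiny optimized patch flips on most inputs is suspicious;
# comparing flip rates across a pool of models helps separate Trojaned from pure models.
```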
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.