Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack
- URL: http://arxiv.org/abs/2105.00623v1
- Date: Mon, 3 May 2021 04:12:31 GMT
- Title: Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack
- Authors: Yixu Wang, Jie Li, Hong Liu, Yongjian Wu, Rongrong Ji
- Abstract summary: A model stealing attack aims to create a substitute model that replicates the capability of the victim model.
Most existing methods depend on the full probability outputs of the victim model, which are unavailable in most realistic scenarios.
We propose a novel hard-label model stealing method termed \emph{black-box dissector}, which includes a CAM-driven erasing strategy to mine the hidden information in hard labels from the victim model.
- Score: 90.6076825117532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A model stealing attack aims to create a substitute model that
replicates the capability of the victim model. However, most existing methods
depend on the full probability outputs of the victim model, which are
unavailable in most realistic scenarios. In the more practical hard-label
setting, existing methods suffer catastrophic performance degradation because
a hard label carries far less information than a full probability prediction.
Inspired by knowledge distillation, we propose a novel hard-label model
stealing method termed \emph{black-box dissector}, which includes a CAM-driven
erasing strategy to mine the hidden information in hard labels from the victim
model, and a random-erasing-based self-knowledge distillation module that uses
soft labels from the substitute model to avoid the overfitting and
miscalibration caused by hard labels. Extensive experiments on four widely
used datasets consistently show that our method outperforms state-of-the-art
methods, with an improvement of up to $9.92\%$. In addition, experiments on
real-world APIs further prove the effectiveness of our method. Our method can
also invalidate existing defense methods, which further demonstrates its
practical potential.
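To make the two components described above concrete, here is a minimal PyTorch-style sketch of one training step. It is an illustration under our own assumptions, not the authors' released implementation: victim_hard_label is a hypothetical query function that returns only top-1 labels, the class activation map is approximated with a plain Grad-CAM on a user-chosen convolutional layer of the substitute (feat_layer), and the erasing threshold and equal loss weights are arbitrary choices.

# Minimal, illustrative sketch of the two components named in the abstract.
# This is NOT the authors' code: `victim_hard_label`, `feat_layer`, the Grad-CAM
# approximation, the erase threshold `tau`, and the equal loss weights are all
# assumptions made here for illustration.
import torch
import torch.nn.functional as F
from torchvision.transforms import RandomErasing


def gradcam_mask(model, x, labels, feat_layer):
    """Coarse Grad-CAM heat-map (normalised to [0, 1]) taken from `feat_layer`."""
    feats = {}
    handle = feat_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    logits = model(x.clone().requires_grad_(True))
    score = logits.gather(1, labels[:, None]).sum()        # score of the victim-predicted class
    grads = torch.autograd.grad(score, feats["a"])[0]      # d(score) / d(activations)
    handle.remove()
    weights = grads.mean(dim=(2, 3), keepdim=True)         # global-average-pool the gradients
    cam = F.relu((weights * feats["a"]).sum(1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    return cam / (cam.amax(dim=(2, 3), keepdim=True) + 1e-8)


def train_step(substitute, victim_hard_label, x, optimizer, feat_layer, tau=0.7):
    optimizer.zero_grad()
    y_v = victim_hard_label(x)                             # hard (top-1) labels only

    # (1) CAM-driven erasing: blank the most class-discriminative region and
    #     re-query the victim; the erased copy yields extra hard-label supervision.
    cam = gradcam_mask(substitute, x, y_v, feat_layer).detach()
    x_erased = x * (cam < tau).float()
    y_v_erased = victim_hard_label(x_erased)
    loss_hard = F.cross_entropy(substitute(x), y_v) \
              + F.cross_entropy(substitute(x_erased), y_v_erased)

    # (2) Random-erasing self-distillation: match the substitute's own soft
    #     predictions between the clean batch and a randomly erased copy.
    x_rand = RandomErasing(p=1.0)(x)
    with torch.no_grad():
        soft = F.softmax(substitute(x), dim=1)
    loss_kd = F.kl_div(F.log_softmax(substitute(x_rand), dim=1), soft,
                       reduction="batchmean")

    loss = loss_hard + loss_kd                             # equal weighting: an assumption
    loss.backward()
    optimizer.step()
    return loss.item()

In a complete attack, this step would run inside a query-budgeted loop over a surrogate dataset of unlabeled images; for a ResNet substitute, feat_layer could be its last convolutional block. How the paper itself converts the labels of erased queries into soft supervision is more involved and should be taken from the original text.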
Related papers
- Efficient Model Extraction via Boundary Sampling [2.9815109163161204]
This paper introduces a novel data-free model extraction attack.
It significantly advances the current state-of-the-art in terms of efficiency, accuracy, and effectiveness.
arXiv Detail & Related papers (2024-10-20T15:56:24Z)
- EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
- Data-Free Hard-Label Robustness Stealing Attack [67.41281050467889]
We introduce a novel Data-Free Hard-Label Robustness Stealing (DFHL-RS) attack in this paper.
It enables the stealing of both model accuracy and robustness by simply querying hard labels of the target model.
Our method achieves a clean accuracy of 77.86% and a robust accuracy of 39.51% against AutoAttack.
arXiv Detail & Related papers (2023-12-10T16:14:02Z)
- Segue: Side-information Guided Generative Unlearnable Examples for Facial Privacy Protection in Real World [64.4289385463226]
We propose Segue: Side-information guided generative unlearnable examples.
To improve transferability, we introduce side information such as true labels and pseudo labels.
It can resist JPEG compression, adversarial training, and some standard data augmentations.
arXiv Detail & Related papers (2023-10-24T06:22:37Z)
- SCME: A Self-Contrastive Method for Data-free and Query-Limited Model Extraction Attack [18.998300969035885]
Model extraction attacks fool the target model by generating adversarial examples on a substitute model.
We propose a novel data-free model extraction method named SCME, which considers both the inter- and intra-class diversity in synthesizing fake data.
arXiv Detail & Related papers (2023-10-15T10:41:45Z)
- Be Careful What You Smooth For: Label Smoothing Can Be a Privacy Shield but Also a Catalyst for Model Inversion Attacks [28.16799731196294]
We investigate the impact of label smoothing on model inversion attacks (MIAs), which aim to generate class-representative samples.
We find that traditional (positive-factor) label smoothing fosters MIAs, thereby increasing a model's privacy leakage.
Smoothing with negative factors, by contrast, counters this trend, impeding the extraction of class-related information and preserving privacy (a short illustration follows this entry).
arXiv Detail & Related papers (2023-10-10T11:51:12Z)
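As a toy illustration of the smoothing direction discussed in the entry above (not the paper's code): the standard label-smoothing target for $K$ classes is $(1-\alpha)\,y_{\text{one-hot}} + \alpha/K$, so a positive $\alpha$ spreads probability mass over the wrong classes, while a negative $\alpha$ pulls it back onto the true class. The helper name smooth_targets below is ours.

import torch
import torch.nn.functional as F

def smooth_targets(labels: torch.Tensor, num_classes: int, alpha: float) -> torch.Tensor:
    """Standard label smoothing: (1 - alpha) * one_hot + alpha / num_classes."""
    one_hot = F.one_hot(labels, num_classes).float()
    return (1.0 - alpha) * one_hot + alpha / num_classes

labels = torch.tensor([2])
print(smooth_targets(labels, 4, 0.2))   # positive factor: [0.05, 0.05, 0.85, 0.05]
print(smooth_targets(labels, 4, -0.2))  # negative factor: [-0.05, -0.05, 1.15, -0.05]

The negative-factor targets place more than full mass on the true class and negative mass on the others, which is the direction the summary above reports as impeding MIAs.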
- Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model [14.834360664780709]
Model inversion attacks (MIAs) aim to recover private data from the inaccessible training sets of deep learning models.
This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label.
Experimental results show that the method can generate accurate, representative samples for the target label, outperforming the generators of previous approaches.
arXiv Detail & Related papers (2023-07-17T12:14:24Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- MOVE: Effective and Harmless Ownership Verification via Embedded External Features [109.19238806106426]
We propose an effective and harmless model ownership verification (MOVE) to defend against different types of model stealing simultaneously.
We conduct the ownership verification by verifying whether a suspicious model contains the knowledge of defender-specified external features.
In particular, we develop our MOVE method under both white-box and black-box settings to provide comprehensive model protection.
arXiv Detail & Related papers (2022-08-04T02:22:29Z)