Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks
- URL: http://arxiv.org/abs/2201.12179v1
- Date: Fri, 28 Jan 2022 15:25:50 GMT
- Title: Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks
- Authors: Lukas Struppek, Dominik Hintersdorf, Antonio De Almeida Correia,
Antonia Adler, Kristian Kersting
- Abstract summary: Model inversion attacks (MIAs) aim to create synthetic images that reflect the class-wise characteristics from a target classifier's training data by exploiting the model's learned knowledge.
Previous research has developed generative MIAs using generative adversarial networks (GANs) as image priors tailored to a specific target model.
We present Plug & Play Attacks that loosen the dependency between the target model and image prior and enable the use of a single trained GAN to attack a broad range of targets.
- Score: 13.374754708543449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model inversion attacks (MIAs) aim to create synthetic images that reflect
the class-wise characteristics from a target classifier's training data by
exploiting the model's learned knowledge. Previous research has developed
generative MIAs using generative adversarial networks (GANs) as image priors
that are tailored to a specific target model. This makes the attacks time- and
resource-consuming, inflexible, and susceptible to distributional shifts
between datasets. To overcome these drawbacks, we present Plug & Play Attacks
that loosen the dependency between the target model and image prior and enable
the use of a single trained GAN to attack a broad range of targets with only
minor attack adjustments needed. Moreover, we show that powerful MIAs are
possible even with publicly available pre-trained GANs and under strong
distributional shifts, whereas previous approaches fail to produce meaningful
results. Our extensive evaluation confirms the improved robustness and
flexibility of Plug & Play Attacks and their ability to create high-quality
images revealing sensitive class characteristics.
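As a minimal, toy illustration of the generative model-inversion idea the abstract describes (not the authors' actual Plug & Play method), the sketch below performs gradient ascent on the latent code of a frozen linear "generator" so that a frozen linear softmax "classifier" assigns the synthetic input to a chosen class. All weights, dimensions, and names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (all invented for illustration): a frozen linear "generator"
# acting as the image prior, and a frozen linear softmax "target classifier".
LATENT, PIXELS, CLASSES = 8, 16, 4
W_g = rng.normal(size=(PIXELS, LATENT)) / np.sqrt(LATENT)   # generator weights
W_c = rng.normal(size=(CLASSES, PIXELS)) / np.sqrt(PIXELS)  # classifier weights

def generate(z):
    return W_g @ z  # "image" produced from latent code z

def class_log_probs(x):
    logits = W_c @ x
    logits = logits - logits.max()  # numerical stability
    return logits - np.log(np.exp(logits).sum())

def invert(target_class, steps=300, lr=0.1):
    """Gradient ascent on log p(target_class | generate(z)) w.r.t. z only,
    keeping both the generator and the classifier frozen."""
    z = rng.normal(size=LATENT)
    onehot = np.eye(CLASSES)[target_class]
    for _ in range(steps):
        p = np.exp(class_log_probs(generate(z)))
        # d log p(t|x)/dx = W_c^T (onehot - p); chain rule through x = W_g z.
        z += lr * (W_g.T @ (W_c.T @ (onehot - p)))
    return z

z_inv = invert(target_class=2)
recovered = int(np.argmax(class_log_probs(generate(z_inv))))
```

In a real attack the image prior is a pre-trained GAN and the gradient flows through it by automatic differentiation; the point of the sketch is only the optimization structure: search the latent space of a fixed prior for inputs that the target classifier labels confidently as the attacked class.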
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the segment anything model (SAM).
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification [10.911464455072391]
FACTUAL is a Contrastive Learning framework for Adversarial Training and robust SAR classification.
Our model achieves 99.7% accuracy on clean samples, and 89.6% on perturbed samples, both outperforming previous state-of-the-art methods.
arXiv Detail & Related papers (2024-04-04T06:20:22Z) - Practical Membership Inference Attacks Against Large-Scale Multi-Modal
Models: A Pilot Study [17.421886085918608]
Membership inference attacks (MIAs) aim to infer whether a data point has been used to train a machine learning model.
These attacks can be employed to identify potential privacy vulnerabilities and detect unauthorized use of personal data.
This paper takes a first step towards developing practical MIAs against large-scale multi-modal models.
arXiv Detail & Related papers (2023-09-29T19:38:40Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z) - GAP++: Learning to generate target-conditioned adversarial examples [28.894143619182426]
Adversarial examples are perturbed inputs that pose a serious threat to machine learning models.
We propose a more general-purpose framework which infers target-conditioned perturbations dependent on both input image and target label.
Our method achieves superior performance with single target attack models and obtains high fooling rates with small perturbation norms.
arXiv Detail & Related papers (2020-06-09T07:49:49Z)
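The target-conditioned perturbations of the last entry can be illustrated, far more crudely, with a single targeted gradient step against a toy linear softmax classifier. GAP++ trains a generator network conditioned on both input and target label; everything below is an invented stand-in showing only the "push the input toward an attacker-chosen class" structure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear softmax classifier (invented for illustration).
CLASSES, DIM = 3, 10
W = rng.normal(size=(CLASSES, DIM)) / np.sqrt(DIM)

def log_probs(x):
    logits = W @ x
    logits = logits - logits.max()  # numerical stability
    return logits - np.log(np.exp(logits).sum())

def targeted_step(x, target, lr=0.5):
    """One gradient step on the input x that raises the classifier's
    confidence in an attacker-chosen target class (a crude stand-in for
    a learned target-conditioned perturbation generator)."""
    p = np.exp(log_probs(x))
    grad_x = W.T @ (np.eye(CLASSES)[target] - p)  # d log p(target|x) / dx
    return x + lr * grad_x

x = rng.normal(size=DIM)
target = int(np.argmin(log_probs(x)))  # attack toward the least likely class
x_adv = targeted_step(x, target)
```

A learned generator amortizes this step: instead of computing a gradient per input, it maps (input, target label) directly to a small perturbation, which is what makes the attack fast and target-flexible.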
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.