Related papers: Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation

Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation

URL: http://arxiv.org/abs/2403.07673v3
Date: Tue, 19 Mar 2024 09:02:34 GMT
Title: Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation
Authors: Di Mi, Yanjun Zhang, Leo Yu Zhang, Shengshan Hu, Qi Zhong, Haizhuan Yuan, Shirui Pan,
Abstract summary: Model extraction attacks (MEAs) enable an attacker to replicate the functionality of a victim deep neural network (DNN) model by querying its API service remotely. This paper unveils the threat of MEA in image-to-image translation (I2IT) tasks from a new perspective.
Score: 48.418293879875606
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model extraction attacks (MEAs) enable an attacker to replicate the functionality of a victim deep neural network (DNN) model by only querying its API service remotely, posing a severe threat to the security and integrity of pay-per-query DNN-based services. Although the majority of current research on MEAs has primarily concentrated on neural classifiers, there is a growing prevalence of image-to-image translation (I2IT) tasks in our everyday activities. However, techniques developed for MEA of DNN classifiers cannot be directly transferred to the case of I2IT, rendering the vulnerability of I2IT models to MEA attacks often underestimated. This paper unveils the threat of MEA in I2IT tasks from a new perspective. Diverging from the traditional approach of bridging the distribution gap between attacker queries and victim training samples, we opt to mitigate the effect caused by the different distributions, known as the domain shift. This is achieved by introducing a new regularization term that penalizes high-frequency noise, and seeking a flatter minimum to avoid overfitting to the shifted distribution. Extensive experiments on different image translation tasks, including image super-resolution and style transfer, are performed on different backbone victim models, and the new design consistently outperforms the baseline by a large margin across all metrics. A few real-life I2IT APIs are also verified to be extremely vulnerable to our attack, emphasizing the need for enhanced defenses and potentially revised API publishing policies.

Related papers

Generating Transferrable Adversarial Examples via Local Mixing and Logits Optimization for Remote Sensing Object Recognition [5.6536225368328274]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks.<n>In this paper, we propose a novel framework via local mixing and logits optimization.<n>Our method achieves a 17.28% average improvement in black-box attack success rate.
arXiv Detail & Related papers (2025-09-09T08:20:19Z)
Do Spikes Protect Privacy? Investigating Black-Box Model Inversion Attacks in Spiking Neural Networks [0.0]
This work presents the first study of black-box Model Inversion (MI) attacks on Spiking Neural Networks (SNNs) We adapt a generative adversarial MI framework to the spiking domain by incorporating rate-based encoding for input transformation and decoding mechanisms for output interpretation. Our results show that SNNs exhibit significantly greater resistance to MI attacks than ANNs, as demonstrated by degraded reconstructions, increased instability in attack convergence, and overall reduced attack effectiveness across multiple evaluation metrics.
arXiv Detail & Related papers (2025-02-08T10:02:27Z)
AIM: Additional Image Guided Generation of Transferable Adversarial Attacks [72.24101555828256]
Transferable adversarial examples highlight the vulnerability of deep neural networks (DNNs) to imperceptible perturbations across various real-world applications. In this work, we focus on generative approaches for targeted transferable attacks. We introduce a novel plug-and-play module into the general generator architecture to enhance adversarial transferability.
arXiv Detail & Related papers (2025-01-02T07:06:49Z)
Learning to Learn Transferable Generative Attack for Person Re-Identification [17.26567195924685]
Existing attacks merely consider cross-dataset and cross-model transferability, ignoring the cross-test capability to perturb models trained in different domains. To powerfully examine the robustness of real-world re-id models, the Meta Transferable Generative Attack (MTGA) method is proposed. Our MTGA outperforms the SOTA methods by 21.5% and 11.3% on mean mAP drop rate, respectively.
arXiv Detail & Related papers (2024-09-06T11:57:17Z)
CAMH: Advancing Model Hijacking Attack in Machine Learning [44.58778557522968]
Category-Agnostic Model Hijacking (CAMH) is a novel model hijacking attack method. It addresses the challenges of class number mismatch, data distribution divergence, and performance balance between the original and hijacking tasks. We demonstrate its potent attack effectiveness while ensuring minimal degradation in the performance of the original task.
arXiv Detail & Related papers (2024-08-25T07:03:01Z)
A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks [43.98557963966335]
Model Inversion (MI) attacks aim to reconstruct privacy-sensitive training data from released models by utilizing output information. Recent advances in generative adversarial networks (GANs) have contributed significantly to the improved performance of MI attacks. We propose a novel method, Intermediate Features enhanced Generative Model Inversion (IF-GMI), which disassembles the GAN structure and exploits features between intermediate blocks.
arXiv Detail & Related papers (2024-07-18T19:16:22Z)
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
Towards Understanding and Boosting Adversarial Transferability from a Distribution Perspective [80.02256726279451]
adversarial attacks against Deep neural networks (DNNs) have received broad attention in recent years. We propose a novel method that crafts adversarial examples by manipulating the distribution of the image. Our method can significantly improve the transferability of the crafted attacks and achieves state-of-the-art performance in both untargeted and targeted scenarios.
arXiv Detail & Related papers (2022-10-09T09:58:51Z)
Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks. Recent works show generalization improvement with adversarial samples under novel threat models. We propose a novel threat model called Joint Space Threat Model (JSTM) Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED) TRED disentangles the relevant knowledge with respect to the target task from the original source model and used as a regularizer during fine-tuning the target model. Experiments on various real world datasets show that our method stably improves the standard fine-tuning by more than 2% in average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers. We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance. We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.