Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against
Image Translation Generative Adversarial Networks
- URL: http://arxiv.org/abs/2104.12623v1
- Date: Mon, 26 Apr 2021 14:50:59 GMT
- Title: Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against
Image Translation Generative Adversarial Networks
- Authors: Sebastian Szyller, Vasisht Duddu, Tommi Gröndahl, N. Asokan
- Abstract summary: We show the first model extraction attack against real-world generative adversarial network (GAN) image translation models.
The adversary is not required to know $F_V$'s architecture or any other information about it beyond its intended image translation task.
We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation.
- Score: 12.605607949417031
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are typically made available to potential client
users via inference APIs. Model extraction attacks occur when a malicious
client uses information gleaned from queries to the inference API of a victim
model $F_V$ to build a surrogate model $F_A$ that has comparable functionality.
Recent research has shown successful model extraction attacks against image
classification and NLP models. In this paper, we show the first model
extraction attack against real-world generative adversarial network (GAN) image
translation models. We present a framework for conducting model extraction
attacks against image translation models, and show that the adversary can
successfully extract functional surrogate models. The adversary is not required
to know $F_V$'s architecture or any other information about it beyond its
intended image translation task, and queries $F_V$'s inference interface using
data drawn from the same domain as the training data for $F_V$. We evaluate the
effectiveness of our attacks using three different instances of two popular
categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo
(image style transfer), and (3) Super-Resolution (super resolution). Using
standard performance metrics for GANs, we show that our attacks are effective
in each of the three cases -- the differences between $F_V$ and $F_A$,
measured against the target, fall in the following ranges: Selfie-to-Anime: FID $13.36-68.66$,
Monet-to-Photo: FID $3.57-4.40$, and Super-Resolution: SSIM: $0.06-0.08$ and
PSNR: $1.43-4.46$. Furthermore, we conducted a large scale (125 participants)
user study on Selfie-to-Anime and Monet-to-Photo to show that human perception
of the images produced by the victim and surrogate models can be considered
equivalent, within an equivalence bound of Cohen's $d=0.3$.
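The extraction pipeline the abstract describes — query $F_V$'s inference interface with data drawn from the same domain, then train $F_A$ on the resulting input/output pairs — can be sketched as follows. Everything here is a hypothetical stand-in: the "victim" is a fixed linear map and the surrogate is fit by least squares, not the image-translation GANs the paper actually trains; PSNR stands in for the paper's full metric suite.

```python
import numpy as np

# Hypothetical stand-in for the victim model F_V behind an inference API.
# The real victim is an image-translation GAN; a fixed linear map keeps
# this sketch self-contained and runnable.
rng = np.random.default_rng(0)
W_victim = rng.normal(size=(8, 8))

def query_victim(x):
    """Black-box call: the adversary observes only input/output pairs."""
    return x @ W_victim

# Step 1: query F_V with data from the same domain as its training data.
queries = rng.normal(size=(256, 8))
responses = np.stack([query_victim(x) for x in queries])

# Step 2: fit the surrogate F_A by supervised regression on the pairs
# (the paper trains a GAN surrogate; least squares is the minimal analogue).
W_surrogate, *_ = np.linalg.lstsq(queries, responses, rcond=None)

# Step 3: compare F_V and F_A on held-out inputs, e.g. via PSNR.
test = rng.normal(size=(64, 8))
out_v = test @ W_victim
out_a = test @ W_surrogate
mse = np.mean((out_v - out_a) ** 2)
psnr = 10 * np.log10(np.max(np.abs(out_v)) ** 2 / (mse + 1e-12))
print(f"surrogate-vs-victim PSNR: {psnr:.1f} dB")
```

With an expressive enough surrogate and sufficient queries, $F_A$ closely tracks $F_V$; the paper quantifies this gap with FID, SSIM, and PSNR on real image-translation tasks.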
Related papers
- The Efficacy of Transfer-based No-box Attacks on Image Watermarking: A Pragmatic Analysis [11.724935807582513]
We investigate the robustness of image watermarking methods in the "no-box" setting, where the attacker is assumed to have no knowledge about the watermarking model.
We show that when the configuration is mostly aligned, a simple non-optimization attack can already exceed the success of optimization-based efforts.
arXiv Detail & Related papers (2024-12-03T17:02:49Z)
- $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs [62.565573316667276]
We develop an objective that encodes how a sample relates to others.
We train vision models based on similarities in class or text caption descriptions.
Our objective appears to work particularly well in lower-data regimes, with gains over CLIP of $16.8\%$ on ImageNet and $18.1\%$ on ImageNet Real.
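The core idea — replacing the one-hot targets of standard contrastive learning with soft targets derived from a sample-similarity graph — can be sketched in a few lines. This is a toy illustration under my own assumptions (a label-based graph, cosine-similarity logits), not the paper's implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Toy batch: unit-norm embeddings for 4 samples.
rng = np.random.default_rng(1)
emb = rng.normal(size=(4, 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

# Similarity graph S: here built from shared class labels; the paper also
# considers similarities from text caption descriptions.
labels = np.array([0, 0, 1, 1])
S = (labels[:, None] == labels[None, :]).astype(float)
targets = S / S.sum(axis=1, keepdims=True)  # soft targets, rows sum to 1

logits = emb @ emb.T / 0.1  # temperature-scaled cosine similarities

# Standard contrastive loss uses one-hot targets (each sample matched only
# to itself / its augmentation); the graph-based objective spreads the
# target mass over all related samples.
loss = -np.mean(np.sum(targets * np.log(softmax(logits, axis=1)), axis=1))
print(f"graph-weighted contrastive loss: {loss:.3f}")
```

Minimizing this loss pulls together all samples the graph marks as related, rather than only augmentation pairs, which is plausibly why the gains concentrate in lower-data regimes.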
arXiv Detail & Related papers (2024-07-25T15:38:16Z)
- SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds [88.06788636008051]
Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers.
These models are large, with complex network architectures and tens of denoising iterations, making them computationally expensive and slow to run.
We present a generic approach that unlocks running text-to-image diffusion models on mobile devices in less than $2$ seconds.
arXiv Detail & Related papers (2023-06-01T17:59:25Z)
- Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that data generated by denoising diffusion probabilistic models (DDPMs) improves adversarial training.
This paper asks whether better diffusion models further improve adversarial training, and gives an affirmative answer by employing a more recent, more efficient diffusion model.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z) - Are You Stealing My Model? Sample Correlation for Fingerprinting Deep
Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z) - ARIA: Adversarially Robust Image Attribution for Content Provenance [25.217001579437635]
We show how to generate valid adversarial images that can easily cause incorrect image attribution.
We then describe an approach to prevent imperceptible adversarial attacks on deep visual fingerprinting models.
The resulting models are substantially more robust, are accurate even on unperturbed images, and perform well even over a database with millions of images.
arXiv Detail & Related papers (2022-02-25T18:11:45Z) - Defending against Model Stealing via Verifying Embedded External
Features [90.29429679125508]
Adversaries can "steal" deployed models even when they have no training samples and cannot access the model's parameters or structure.
We explore the defense from another angle by verifying whether a suspicious model contains the knowledge of defender-specified external features.
Our method is effective in detecting different types of model stealing simultaneously, even if the stolen model is obtained via a multi-stage stealing process.
arXiv Detail & Related papers (2021-12-07T03:51:54Z) - Selection of Source Images Heavily Influences the Effectiveness of
Adversarial Attacks [2.6113528145137495]
We show that not every source image is equally suited for adversarial examples.
It is possible to have a difference of up to $12.5\%$ in model-to-model transferability success.
We then take one of the first steps in evaluating the robustness of images used to create adversarial examples.
arXiv Detail & Related papers (2021-06-14T02:45:45Z) - Adversarial robustness against multiple $l_p$-threat models at the price
of one and how to quickly fine-tune robust models to another threat model [79.05253587566197]
Adversarial training (AT) to achieve adversarial robustness w.r.t. a single $l_p$-threat model has been discussed extensively.
In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of $l_p$-threat models.
arXiv Detail & Related papers (2021-05-26T12:20:47Z)
- The Effects of Image Distribution and Task on Adversarial Robustness [4.597864989500202]
We propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model.
We applied this adversarial robustness metric to models trained on MNIST, CIFAR-10, and a Fusion dataset.
arXiv Detail & Related papers (2021-02-21T07:15:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.