Black-Box Training Data Identification in GANs via Detector Networks
- URL: http://arxiv.org/abs/2310.12063v1
- Date: Wed, 18 Oct 2023 15:53:20 GMT
- Title: Black-Box Training Data Identification in GANs via Detector Networks
- Authors: Lukman Olagoke, Salil Vadhan, Seth Neel
- Abstract summary: We study whether, given access to a trained GAN as well as fresh samples from the underlying distribution, an attacker can efficiently identify whether a given point is a member of the GAN's training data.
This is of interest both for copyright, where a user may want to determine whether their copyrighted data has been used to train a GAN, and for data privacy, where the ability to detect training set membership is known as a membership inference attack.
We introduce a suite of membership inference attacks against GANs in the black-box setting and evaluate them on image GANs trained on CIFAR10 and tabular GANs trained on genomic data.
- Score: 2.4554686192257424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since their inception Generative Adversarial Networks (GANs) have been
popular generative models across images, audio, video, and tabular data. In
this paper we study whether, given access to a trained GAN as well as fresh
samples from the underlying distribution, an attacker can efficiently identify
whether a given point is a member of the GAN's training data.
This is of interest both for copyright, where a user may want to determine
whether their copyrighted data has been used to train a GAN, and for data
privacy, where the ability to detect training set membership is known as a
membership inference attack. Unlike the majority of prior work, this paper
investigates the privacy implications of using GANs in black-box settings,
where the attacker only has access to samples from the generator rather than
to the discriminator as well. We introduce a
suite of membership inference attacks against GANs in the black-box setting and
evaluate our attacks on image GANs trained on the CIFAR10 dataset and tabular
GANs trained on genomic data. Our most successful attack, called The Detector,
involves training a second network to score samples based on their likelihood
of being generated by the GAN, as opposed to being a fresh sample from the
distribution.
We prove under a simple model of the generator that the detector is an
approximately optimal membership inference attack. Across a wide range of
tabular and image datasets, attacks, and GAN architectures, we find that
adversaries can orchestrate non-trivial privacy attacks when provided with
access to samples from the generator. At the same time, the attack success
achievable against GANs still appears to be lower than against other generative
and discriminative models; this leaves the intriguing open question of whether
GANs are in fact more private, or whether it is simply a matter of developing
stronger attacks.
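As a concrete illustration of the detector idea described above, here is a minimal sketch: a classifier is trained to distinguish GAN outputs from fresh samples, and its "looks generated" score is reused as a membership signal. The flattened feature vectors, the gradient-boosted classifier, and the 0.5 threshold are illustrative assumptions, not the paper's actual detector architecture or calibration.

```python
# Minimal sketch of a detector-style black-box membership inference attack.
# Assumptions (not from the paper): samples are flattened feature vectors,
# a gradient-boosted classifier stands in for the detector network, and the
# classifier's probability is used directly as the membership score.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def train_detector(gan_samples: np.ndarray, fresh_samples: np.ndarray):
    """Train a classifier to distinguish GAN outputs from fresh draws
    from the underlying data distribution."""
    X = np.vstack([gan_samples, fresh_samples])
    y = np.concatenate([np.ones(len(gan_samples)),     # 1 = generated by the GAN
                        np.zeros(len(fresh_samples))])  # 0 = fresh real sample
    detector = GradientBoostingClassifier()
    detector.fit(X, y)
    return detector

def membership_scores(detector, candidates: np.ndarray) -> np.ndarray:
    """Score candidate points: a higher 'looks GAN-generated' probability is
    treated as evidence that the point was in the GAN's training set."""
    return detector.predict_proba(candidates)[:, 1]

# Usage (hypothetical arrays): candidates whose score exceeds a threshold
# chosen on held-out data are flagged as likely training members.
# scores = membership_scores(train_detector(gan_samples, fresh_samples), candidates)
# predicted_members = scores > 0.5
```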
Related papers
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
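A minimal sketch of the density-ratio idea behind a DOMIAS-style attack: score a candidate by how much more likely it is under the synthetic-data density than under a reference density of real data. Gaussian kernel density estimates stand in for the paper's density models, and the array shapes (n_samples, n_features) are assumptions.

```python
# Hedged sketch of a density-ratio membership test in the spirit of DOMIAS.
# Assumptions (not from the paper): low-dimensional tabular data and Gaussian
# KDEs as stand-ins for the paper's density estimators.
import numpy as np
from scipy.stats import gaussian_kde

def domias_style_scores(synthetic: np.ndarray, reference: np.ndarray,
                        candidates: np.ndarray) -> np.ndarray:
    """Higher scores indicate local overfitting of the generative model around
    a candidate point, treated here as evidence of training membership."""
    p_syn = gaussian_kde(synthetic.T)   # density of generated data
    p_ref = gaussian_kde(reference.T)   # density of real (non-member) data
    eps = 1e-12
    return p_syn(candidates.T) / (p_ref(candidates.T) + eps)
```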
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network [102.21368201494909]
Model inversion (MI) attacks have raised increasing concerns about privacy.
Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space.
We propose the Pseudo Label-Guided MI (PLG-MI) attack via a conditional GAN (cGAN).
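A minimal sketch of GAN-prior model inversion in general, not the specific PLG-MI pipeline: the attacker searches the generator's latent space for inputs that the target classifier assigns to a chosen class with high confidence. The `generator` and `target_classifier` modules, latent dimension, and optimizer settings are assumptions.

```python
# Hedged sketch of model inversion with a GAN image prior. The pseudo-label
# and conditional-GAN machinery of PLG-MI is not reproduced here.
import torch

def invert_with_gan_prior(generator, target_classifier, target_class: int,
                          latent_dim: int = 128, steps: int = 500, lr: float = 0.05):
    """Optimize a latent code so the generated image is classified as `target_class`."""
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        image = generator(z)                 # the GAN prior narrows the search space
        logits = target_classifier(image)
        loss = -logits[0, target_class]      # maximize the target-class logit
        loss.backward()
        opt.step()
    return generator(z).detach()
```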
arXiv Detail & Related papers (2023-02-20T07:29:34Z)
- Backdoor Attack and Defense in Federated Generative Adversarial Network-based Medical Image Synthesis [15.41200827860072]
Federated learning (FL) provides a way of training a central model using distributed data while keeping raw data locally.
It is vulnerable to backdoor attacks, an adversarial attack carried out by poisoning the training data.
Most backdoor attack strategies focus on classification models and centralized domains.
We propose FedDetect, an efficient and effective way of defending against the backdoor attack in the FL setting.
arXiv Detail & Related papers (2022-10-19T21:03:34Z)
- Generative Models with Information-Theoretic Protection Against Membership Inference Attacks [6.840474688871695]
Deep generative models, such as Generative Adversarial Networks (GANs), synthesize diverse high-fidelity data samples.
GANs may disclose private information from the data they are trained on, making them susceptible to adversarial attacks.
We propose an information theoretically motivated regularization term that prevents the generative model from overfitting to training data and encourages generalizability.
arXiv Detail & Related papers (2022-05-31T19:29:55Z)
- Property Inference Attacks Against GANs [19.443816794076763]
We propose the first set of training dataset property inference attacks against generative adversarial networks (GANs).
A successful property inference attack can allow the adversary to gain extra knowledge of the target GAN's training dataset.
We propose a general attack pipeline that can be tailored to two attack scenarios: the full black-box setting and the partial black-box setting.
arXiv Detail & Related papers (2021-11-15T08:57:00Z)
- This Person (Probably) Exists. Identity Membership Attacks Against GAN Generated Faces [6.270305440413689]
Generative adversarial networks (GANs) have achieved stunning realism, fooling even human observers.
GANs do leak information about their training data, as evidenced by membership attacks recently demonstrated in the literature.
In this work, we challenge the assumption that GAN faces really are novel creations, by constructing a successful membership attack of a new kind.
arXiv Detail & Related papers (2021-07-13T12:11:21Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
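A rough sketch of the activation-density idea, with deliberate simplifications: a single monitored layer, a scalar summary of its activations, and a Gaussian KDE standing in for the paper's normalizing flow.

```python
# Hedged sketch of activation-density anomaly detection in the spirit of DAAIN.
# Assumptions (not from the paper): one monitored layer, activations reduced to
# a scalar per sample, and a Gaussian KDE instead of a normalizing flow.
import numpy as np
import torch
from scipy.stats import gaussian_kde

def collect_activations(model, layer, inputs):
    """Run `inputs` through `model` and record a scalar summary of `layer`'s activations."""
    acts = []
    handle = layer.register_forward_hook(
        lambda mod, inp, out: acts.append(out.detach().flatten(1).mean(dim=1)))
    with torch.no_grad():
        model(inputs)
    handle.remove()
    return torch.cat(acts).cpu().numpy()

def fit_activation_density(model, layer, in_distribution_inputs):
    """Fit a density estimator of the activation distribution on clean data."""
    return gaussian_kde(collect_activations(model, layer, in_distribution_inputs))

def anomaly_scores(density, model, layer, test_inputs):
    """Low density under the fitted estimator flags OOD or adversarial inputs."""
    return -np.log(density(collect_activations(model, layer, test_inputs)) + 1e-12)
```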
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, in which the target label is treated at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
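A minimal sketch of object-level label poisoning for segmentation, loosely in the spirit of the fine-grained idea above; the trigger patch, class identifiers, and values are illustrative assumptions rather than the paper's construction.

```python
# Hedged sketch of poisoning one segmentation training sample: stamp a small
# trigger on the image and relabel only the pixels of the victim object class.
import numpy as np

def poison_sample(image: np.ndarray, mask: np.ndarray,
                  victim_class: int, target_class: int,
                  trigger_value: float = 1.0, trigger_size: int = 8):
    """Return a poisoned (image, mask) pair; non-victim pixels keep their labels."""
    image = image.copy()
    mask = mask.copy()
    image[:trigger_size, :trigger_size] = trigger_value   # visible trigger patch
    mask[mask == victim_class] = target_class             # object-level relabeling
    return image, mask
```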
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject the hidden backdoor for infecting speaker verification models by poisoning the training data.
We also demonstrate that existing backdoor attacks cannot be directly adopted for attacking speaker verification.
arXiv Detail & Related papers (2020-10-22T11:10:08Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, can work under the severe restriction of no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
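A minimal sketch of a label-only membership signal obtained by repeated queries, in the spirit of the sampling attack: perturb the candidate point many times and measure how consistently the victim model returns the same label. The perturbation distribution, query budget, and `query_label` interface are assumptions.

```python
# Hedged sketch of a repeated-query, label-only membership score. `query_label`
# is an assumed callable that returns the victim model's predicted label.
import numpy as np

def label_stability_score(query_label, x: np.ndarray, reference_label: int,
                          n_queries: int = 50, noise_scale: float = 0.05,
                          seed: int = 0) -> float:
    """Fraction of perturbed queries whose predicted label matches the reference
    label; training members tend to be classified more stably."""
    rng = np.random.default_rng(seed)
    agreements = 0
    for _ in range(n_queries):
        perturbed = x + rng.normal(scale=noise_scale, size=x.shape)
        agreements += int(query_label(perturbed) == reference_label)
    return agreements / n_queries
```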
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- privGAN: Protecting GANs from membership inference attacks at low cost [5.735035463793008]
Generative Adversarial Networks (GANs) have made releasing synthetic images a viable approach to sharing data without releasing the original dataset.
Recent work has shown that GAN models and their synthetically generated data can be used by an adversary to infer training set membership.
Here we develop a new GAN architecture (privGAN) in which the generator is trained not only to fool the discriminator but also to defend against membership inference attacks.
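A rough sketch of a privGAN-style generator objective under strong simplifications: a single generator, a generic privacy adversary, and an illustrative weighting; the paper's actual multi-generator architecture and built-in membership adversary are not reproduced here.

```python
# Hedged sketch: generator loss = "fool the discriminator" + "make a privacy
# adversary wrong". The adversary's targets (e.g. which training split a sample
# came from) and the weighting are illustrative assumptions.
import torch
import torch.nn.functional as F

def generator_loss(discriminator, privacy_adversary, fake_images,
                   privacy_targets, privacy_weight: float = 1.0):
    """Combine the standard adversarial term with a privacy-defense term."""
    # Non-saturating GAN term: the discriminator should call fakes real (label 1).
    d_out = discriminator(fake_images)
    adv_term = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    # Privacy term: maximize the privacy adversary's loss on its targets.
    p_out = privacy_adversary(fake_images)
    privacy_term = -F.cross_entropy(p_out, privacy_targets)
    return adv_term + privacy_weight * privacy_term
```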
arXiv Detail & Related papers (2019-12-31T20:47:21Z)