ConfounderGAN: Protecting Image Data Privacy with Causal Confounder
- URL: http://arxiv.org/abs/2212.01767v1
- Date: Sun, 4 Dec 2022 08:49:14 GMT
- Title: ConfounderGAN: Protecting Image Data Privacy with Causal Confounder
- Authors: Qi Tian, Kun Kuang, Kelu Jiang, Furui Liu, Zhihua Wang, Fei Wu
- Abstract summary: We propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners.
Experiments are conducted on six image classification datasets: three natural object datasets and three medical datasets.
- Score: 85.6757153033139
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of deep learning is partly attributed to the availability of
massive data downloaded freely from the Internet. However, it also means that
users' private data may be collected by commercial organizations without
consent and used to train their models. Therefore, it is important to develop a
method or tool to prevent unauthorized data exploitation. In this
paper, we propose ConfounderGAN, a generative adversarial network (GAN) that
can make personal image data unlearnable to protect the data privacy of its
owners. Specifically, the noise produced by the generator for each image has
the confounder property. It can build spurious correlations between images and
labels, so that the model cannot learn the correct mapping from images to
labels in this noise-added dataset. Meanwhile, the discriminator is used to
ensure that the generated noise is small and imperceptible, thereby preserving
the normal utility of the encrypted image for humans. The experiments are
conducted on six image classification datasets, consisting of three natural
object datasets and three medical datasets. The results demonstrate that our
method not only outperforms state-of-the-art methods in standard settings, but
can also be applied to fast encryption scenarios. Moreover, we show a series of
transferability and stability experiments to further illustrate the
effectiveness and superiority of our method.
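The abstract describes the architecture only at a high level: a generator whose per-image noise acts as a confounder linking images to labels, and a discriminator that keeps the noise imperceptible. Below is a minimal PyTorch-style sketch of one plausible training step; the network sizes, noise budget `eps`, loss weight `lam`, and the specific confounder objective (training a jointly updated surrogate classifier to read the label from the noise alone) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseGenerator(nn.Module):
    """Produces a small, bounded perturbation for each input image."""
    def __init__(self, channels=3, eps=8 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.eps * torch.tanh(self.net(x))  # noise stays within +/- eps

class Discriminator(nn.Module):
    """Scores images as clean vs. encrypted so the noise stays imperceptible."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)

def train_step(gen, disc, surrogate, x, y, opt_g, opt_d, lam=1.0):
    """One illustrative training step.

    `surrogate` is a small classifier whose parameters are assumed to be included
    in `opt_g`; forcing it to read the label from the noise alone is one way to
    make the noise a label-correlated confounder (the paper's actual loss may differ).
    """
    delta = gen(x)
    x_enc = torch.clamp(x + delta, 0.0, 1.0)

    # Discriminator: tell clean images from encrypted ones.
    d_real, d_fake = disc(x), disc(x_enc.detach())
    loss_d = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: the noise should (a) carry the label, creating a spurious
    # image-label shortcut, and (b) fool the discriminator so the encrypted
    # image still looks like a clean one.
    loss_conf = F.cross_entropy(surrogate(delta), y)
    loss_gan = F.binary_cross_entropy_with_logits(disc(x_enc), torch.ones_like(d_fake))
    loss_g = loss_conf + lam * loss_gan
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return x_enc.detach()
```

Once trained, a generator of this kind encrypts a new image in a single forward pass, which is presumably what makes the fast-encryption scenarios mentioned above practical.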
Related papers
- Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models [23.09033991200197]
New personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles.
Such a lightweight solution poses a new concern regarding whether the personalized models are trained from unauthorized data.
We introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models.
arXiv Detail & Related papers (2024-10-14T12:29:23Z)
- EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
- Only My Model On My Data: A Privacy Preserving Approach Protecting one Model and Deceiving Unauthorized Black-Box Models [11.59117790048892]
This study tackles an unexplored practical privacy preservation use case by generating human-perceivable images that maintain accurate inference by an authorized model.
Our results show that the generated images can successfully maintain the accuracy of a protected model and degrade the average accuracy of the unauthorized black-box models to 11.97%, 6.63%, and 55.51% on ImageNet, Celeba-HQ, and AffectNet datasets, respectively.
arXiv Detail & Related papers (2024-02-14T17:11:52Z)
- DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models [79.71665540122498]
We propose a method for detecting unauthorized data usage by planting injected content into the protected dataset.
Specifically, we modify the protected images by adding unique content to them using stealthy image warping functions.
By analyzing whether the model has memorized the injected content, we can detect models that have illegally used the unauthorized data.
arXiv Detail & Related papers (2023-07-06T16:27:39Z)
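The DIAGNOSIS entry above hinges on coating protected images with a stealthy warp that a model trained on them would memorize. A rough, generic illustration of such a coating is sketched below; the sinusoidal displacement field and its strength and frequency are arbitrary choices here, not the paper's actual warping functions.

```python
import torch
import torch.nn.functional as F

def warp_coating(images, strength=0.01, freq=4.0):
    """Apply a small, deterministic sinusoidal displacement to a batch of images.

    images: float tensor of shape (N, C, H, W) with values in [0, 1].
    The displacement is tiny in normalized coordinates, so the change is hard to
    see but consistent across the whole released dataset.
    """
    n, c, h, w = images.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys = torch.linspace(-1, 1, h)
    xs = torch.linspace(-1, 1, w)
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    # A fixed low-amplitude sinusoidal offset acts as the secret "signature".
    offset_x = strength * torch.sin(freq * torch.pi * grid_y)
    offset_y = strength * torch.sin(freq * torch.pi * grid_x)
    grid = torch.stack((grid_x + offset_x, grid_y + offset_y), dim=-1)  # (H, W, 2)
    grid = grid.unsqueeze(0).expand(n, -1, -1, -1)
    return F.grid_sample(images, grid, mode="bilinear", align_corners=True)

# Usage: coated = warp_coating(batch_of_protected_images)
```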
- Attribute-preserving Face Dataset Anonymization via Latent Code Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images while, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z)
- Synthetic Dataset Generation for Privacy-Preserving Machine Learning [7.489265323050362]
We propose a method to generate secure synthetic datasets from the original private datasets.
We show that our proposed method preserves data-privacy under various privacy-leakage attacks.
arXiv Detail & Related papers (2022-10-06T20:54:52Z)
- Content-Aware Differential Privacy with Conditional Invertible Neural Networks [0.7102341019971402]
Invertible Neural Networks (INNs) have shown excellent generative performance while still providing the ability to quantify the exact likelihood.
We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification.
We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones.
arXiv Detail & Related papers (2022-07-29T11:52:16Z)
- Learning to See by Looking at Noise [87.12788334473295]
We investigate a suite of image generation models that produce images from simple random processes.
These are then used as training data for a visual representation learner with a contrastive loss.
Our findings show that it is important for the noise to capture certain structural properties of real data but that good performance can be achieved even with processes that are far from realistic.
arXiv Detail & Related papers (2021-06-10T17:56:46Z)
- Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.
Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z)
- Privacy-Preserving Image Classification in the Local Setting [17.375582978294105]
Local Differential Privacy (LDP) offers a promising solution: it allows data owners to randomly perturb their input, providing plausible deniability before the data is released.
In this paper, we consider a two-party image classification problem, in which data owners hold the image and the untrustworthy data user would like to fit a machine learning model with these images as input.
We propose a supervised image feature extractor, DCAConv, which produces an image representation with scalable domain size.
arXiv Detail & Related papers (2020-02-09T01:25:52Z)
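For the local-setting entry above, the owner-side perturbation can be illustrated with the standard Laplace mechanism applied per pixel. This is a generic LDP sketch, not the DCAConv method from the paper, and the epsilon value is only an example.

```python
import numpy as np

def ldp_perturb(image, epsilon=1.0):
    """Perturb a [0, 1]-valued image locally before release.

    Each pixel lies in [0, 1], so the L1 sensitivity per pixel is 1 and Laplace
    noise with scale 1/epsilon gives epsilon-LDP for that single pixel value.
    """
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon, size=image.shape)
    # Clipping is post-processing, so it does not weaken the privacy guarantee.
    return np.clip(image + noise, 0.0, 1.0)

# Usage: released = ldp_perturb(private_image, epsilon=2.0)
```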