Haven't I Seen You Before? Assessing Identity Leakage in Synthetic Irises
- URL: http://arxiv.org/abs/2211.05629v1
- Date: Thu, 3 Nov 2022 00:34:47 GMT
- Title: Haven't I Seen You Before? Assessing Identity Leakage in Synthetic Irises
- Authors: Patrick Tinsley, Adam Czajka, Patrick Flynn
- Abstract summary: This paper presents an analysis of three different iris matchers at varying points in the GAN training process to diagnose where and when authentic training samples are in jeopardy of leaking through the generative process.
Our results show that while most synthetic samples do not show signs of identity leakage, a handful of generated samples match authentic (training) samples nearly perfectly, with consensus across all matchers.
- Score: 4.142375560633827
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generative Adversarial Networks (GANs) have proven to be a preferred method
of synthesizing fake images of objects, such as faces, animals, and
automobiles. It is not surprising that these models can also generate ISO-compliant,
yet synthetic iris images, which can be used to augment training data for iris
matchers and liveness detectors. In this work, we trained one of the most
recent GAN models (StyleGAN3) to generate fake iris images with two primary
goals: (i) to understand the GAN's ability to produce "never-before-seen"
irises, and (ii) to investigate the phenomenon of identity leakage as a
function of the GAN's training time. Previous work has shown that personal
biometric data can inadvertently flow from training data into synthetic
samples, raising a privacy concern for subjects who accidentally appear in the
training dataset. This paper presents an analysis of three different iris
matchers at varying points in the GAN training process to diagnose where and
when authentic training samples are in jeopardy of leaking through the
generative process. Our results show that while most synthetic samples do not
show signs of identity leakage, a handful of generated samples match authentic
(training) samples nearly perfectly, with consensus across all matchers. In
order to prioritize privacy, security, and trust in the machine learning model
development process, the research community must strike a delicate balance
between the benefits of using synthetic data and the corresponding threats
against privacy from potential identity leakage.
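The leakage diagnosis described above can be illustrated with a minimal sketch. Iris matchers commonly compare binary iris codes by fractional Hamming distance, flagging a match when the distance falls below a decision threshold; the function names, code length, and the 0.32 threshold below are illustrative assumptions, not the paper's actual matchers or settings.

```python
# Hypothetical sketch: flag synthetic iris codes that match an authentic
# training code more closely than a matcher's decision threshold,
# i.e., possible identity leakage.

def hamming_distance(code_a, code_b):
    """Fractional Hamming distance between two equal-length bit sequences."""
    assert len(code_a) == len(code_b)
    diff = sum(a != b for a, b in zip(code_a, code_b))
    return diff / len(code_a)

def flag_leaked(synthetic_codes, training_codes, threshold=0.32):
    """Return indices of synthetic codes whose distance to some training
    code falls below the threshold (a candidate identity leak)."""
    leaked = []
    for i, syn in enumerate(synthetic_codes):
        if any(hamming_distance(syn, auth) < threshold
               for auth in training_codes):
            leaked.append(i)
    return leaked

# Toy example: the first synthetic code nearly duplicates a training code.
train = [[0, 1, 1, 0, 1, 0, 1, 1], [1, 1, 0, 0, 0, 1, 0, 1]]
synth = [[0, 1, 1, 0, 1, 0, 1, 0],   # 1 bit away from train[0] -> flagged
         [1, 0, 1, 1, 1, 0, 0, 0]]   # far from both -> not flagged
print(flag_leaked(synth, train))     # -> [0]
```

In the paper's setting this comparison is run at several checkpoints of GAN training, with consensus across multiple matchers strengthening the evidence of a leak.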
Related papers
- Leveraging Programmatically Generated Synthetic Data for Differentially Private Diffusion Training [4.815212947276105]
Programmatically generated synthetic data has been used in differentially private training for classification to avoid privacy leakage.
The model trained with synthetic data generates unrealistic random images, raising challenges to adapt synthetic data for generative models.
We propose DPSynGen, which leverages generated synthetic data in diffusion models to address this challenge.
arXiv Detail & Related papers (2024-12-13T04:22:23Z)
- Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data [104.30479583607918]
The 2nd FRCSyn-onGoing challenge is based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024.
We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition.
arXiv Detail & Related papers (2024-12-02T11:12:01Z)
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing, built on generative models, pose serious risks.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- Privacy-Safe Iris Presentation Attack Detection [4.215251065887862]
This paper proposes a framework for a privacy-safe iris presentation attack detection (PAD) method.
It is designed using solely synthetically-generated, identity-leakage-free iris images.
The method is evaluated in a classical way using state-of-the-art iris PAD benchmarks.
arXiv Detail & Related papers (2024-08-05T18:09:02Z)
- Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z)
- Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the power and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z)
- Generation of Non-Deterministic Synthetic Face Datasets Guided by Identity Priors [19.095368725147367]
We propose a non-deterministic method for generating mated face images by exploiting the well-structured latent space of StyleGAN.
We create a new dataset of synthetic face images (SymFace) consisting of 77,034 samples including 25,919 synthetic IDs.
arXiv Detail & Related papers (2021-12-07T11:08:47Z)
- A Deep Learning Generative Model Approach for Image Synthesis of Plant Leaves [62.997667081978825]
We generate artificial leaf images in an automated way via advanced Deep Learning (DL) techniques.
We aim to provide a source of training samples for AI applications in modern crop management.
arXiv Detail & Related papers (2021-11-05T10:53:35Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Identity-Aware CycleGAN for Face Photo-Sketch Synthesis and Recognition [61.87842307164351]
We first propose an Identity-Aware CycleGAN (IACycleGAN) model that applies a new perceptual loss to supervise the image generation network.
It improves CycleGAN on photo-sketch synthesis by paying more attention to the synthesis of key facial regions, such as eyes and nose.
We develop a mutual optimization procedure between the synthesis model and the recognition model, which iteratively synthesizes better images by IACycleGAN.
arXiv Detail & Related papers (2021-03-30T01:30:08Z)
- Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming [3.4788711710826083]
We propose an alternative solution with respect to the common data augmentation methods, applying it to the problem of crop/weed segmentation in precision farming.
We create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts.
In addition to RGB data, we take into account also near-infrared (NIR) information, generating four channel multi-spectral synthetic images.
arXiv Detail & Related papers (2020-09-12T08:49:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.