Haven't I Seen You Before? Assessing Identity Leakage in Synthetic Irises
- URL: http://arxiv.org/abs/2211.05629v1
- Date: Thu, 3 Nov 2022 00:34:47 GMT
- Title: Haven't I Seen You Before? Assessing Identity Leakage in Synthetic Irises
- Authors: Patrick Tinsley, Adam Czajka, Patrick Flynn
- Abstract summary: This paper presents an analysis of three different iris matchers at varying points in the GAN training process to diagnose where and when authentic training samples are at risk of leaking through the generative process.
Our results show that while most synthetic samples do not show signs of identity leakage, a handful of generated samples match authentic (training) samples nearly perfectly, with consensus across all matchers.
- Score: 4.142375560633827
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generative Adversarial Networks (GANs) have proven to be a preferred method
of synthesizing fake images of objects, such as faces, animals, and
automobiles. It is not surprising these models can also generate ISO-compliant,
yet synthetic iris images, which can be used to augment training data for iris
matchers and liveness detectors. In this work, we trained one of the most
recent GAN models (StyleGAN3) to generate fake iris images with two primary
goals: (i) to understand the GAN's ability to produce "never-before-seen"
irises, and (ii) to investigate the phenomenon of identity leakage as a
function of the GAN's training time. Previous work has shown that personal
biometric data can inadvertently flow from training data into synthetic
samples, raising a privacy concern for subjects who accidentally appear in the
training dataset. This paper presents an analysis of three different iris
matchers at varying points in the GAN training process to diagnose where and
when authentic training samples are at risk of leaking through the
generative process. Our results show that while most synthetic samples do not
show signs of identity leakage, a handful of generated samples match authentic
(training) samples nearly perfectly, with consensus across all matchers. In
order to prioritize privacy, security, and trust in the machine learning model
development process, the research community must strike a delicate balance
between the benefits of using synthetic data and the corresponding threats
against privacy from potential identity leakage.
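The leakage diagnosis described in the abstract can be sketched in a simplified form: compare every synthetic sample against every authentic (training) sample under multiple matchers, and flag pairs that all matchers agree are near-perfect matches. The binary iris-code representation, the fractional-Hamming-distance matcher, and the 0.32 threshold below are illustrative assumptions for the sketch, not the three matchers actually used in the paper.

```python
# Illustrative sketch of a cross-matcher identity-leakage check.
# Iris codes, the Hamming matcher, and the threshold are assumptions.
import numpy as np

def hamming_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Fractional Hamming distance between two binary iris codes."""
    return float(np.count_nonzero(a != b)) / a.size

def flag_leaked_identities(synthetic, authentic, matchers, thresholds):
    """Return (synthetic_idx, authentic_idx) pairs that EVERY matcher
    scores below its match threshold, i.e. consensus matches."""
    leaked = []
    for i, s in enumerate(synthetic):
        for j, t in enumerate(authentic):
            if all(m(s, t) < th for m, th in zip(matchers, thresholds)):
                leaked.append((i, j))
    return leaked

# Toy example: random 256-bit iris codes, plus one deliberately
# "leaked" synthetic sample copied from the training set.
rng = np.random.default_rng(0)
authentic = [rng.integers(0, 2, 256) for _ in range(3)]
synthetic = [rng.integers(0, 2, 256) for _ in range(3)]
synthetic.append(authentic[1].copy())  # simulated identity leakage

leaks = flag_leaked_identities(synthetic, authentic,
                               matchers=[hamming_distance],
                               thresholds=[0.32])
print(leaks)  # the copied sample is flagged as a consensus match
```

Unrelated random codes have an expected fractional Hamming distance near 0.5, so only the copied sample falls under the (assumed) 0.32 match threshold; with several independent matchers, requiring consensus further reduces false alarms.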
Related papers
- Privacy-Safe Iris Presentation Attack Detection [4.215251065887862]
This paper proposes a framework for a privacy-safe iris presentation attack detection (PAD) method.
It is designed solely with synthetically-generated, identity-leakage-free iris images.
The method is evaluated in a classical way using state-of-the-art iris PAD benchmarks.
arXiv Detail & Related papers (2024-08-05T18:09:02Z)
- Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z)
- Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the power and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z)
- Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets [83.749895930242]
We propose two techniques for producing high-quality naturalistic synthetic occluded faces.
We empirically show the effectiveness and robustness of both methods, even for unseen occlusions.
We present two high-resolution real-world occluded face datasets with fine-grained annotations, RealOcc and RealOcc-Wild.
arXiv Detail & Related papers (2022-05-12T17:03:57Z)
- Generation of Non-Deterministic Synthetic Face Datasets Guided by Identity Priors [19.095368725147367]
We propose a non-deterministic method for generating mated face images by exploiting the well-structured latent space of StyleGAN.
We create a new dataset of synthetic face images (SymFace) consisting of 77,034 samples including 25,919 synthetic IDs.
arXiv Detail & Related papers (2021-12-07T11:08:47Z)
- A Deep Learning Generative Model Approach for Image Synthesis of Plant Leaves [62.997667081978825]
We generate artificial leaf images in an automated way via advanced Deep Learning (DL) techniques.
We aim to provide a source of training samples for AI applications in modern crop management.
arXiv Detail & Related papers (2021-11-05T10:53:35Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
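The setup described in this summary, averaging a classifier's predictions over an input and several generated "views" of it, can be sketched as follows. The stand-in classifier and jitter-based view generator below are illustrative assumptions; they take the place of the real classifier and StyleGAN2-based view synthesis.

```python
# Minimal sketch of test-time ensembling over generative "views".
# The classify() and generate_views() stand-ins are assumptions,
# not the paper's actual models.
import numpy as np

def ensemble_predict(x, classify, generate_views, n_views=4):
    """Average class probabilities over the input and its synthetic views."""
    views = [x] + [generate_views(x) for _ in range(n_views)]
    probs = np.stack([classify(v) for v in views])
    return probs.mean(axis=0)

# Toy stand-ins: a 3-class softmax "classifier" over simple features,
# and a small-noise "view generator" mimicking GAN-produced variations.
rng = np.random.default_rng(1)

def classify(v):
    logits = np.array([v.sum(), v.mean(), v.std()])
    e = np.exp(logits - logits.max())
    return e / e.sum()

def generate_views(x):
    return x + rng.normal(scale=0.01, size=x.shape)

x = rng.normal(size=8)
p = ensemble_predict(x, classify, generate_views)
assert np.isclose(p.sum(), 1.0)  # averaged probabilities still sum to 1
```

Because each per-view output is a probability distribution, their mean is one too, so the ensemble can be used as a drop-in replacement for a single prediction.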
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- Ensembles of GANs for synthetic training data generation [7.835101177261939]
Insufficient training data is a major bottleneck for most deep learning practices.
This work investigates the use of synthetic images, created by generative adversarial networks (GANs), as the only source of training data.
arXiv Detail & Related papers (2021-04-23T19:38:48Z)
- Identity-Aware CycleGAN for Face Photo-Sketch Synthesis and Recognition [61.87842307164351]
We first propose an Identity-Aware CycleGAN (IACycleGAN) model that applies a new perceptual loss to supervise the image generation network.
It improves CycleGAN on photo-sketch synthesis by paying more attention to the synthesis of key facial regions, such as eyes and nose.
We develop a mutual optimization procedure between the synthesis model and the recognition model, which iteratively improves the images synthesized by IACycleGAN.
arXiv Detail & Related papers (2021-03-30T01:30:08Z)
- Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming [3.4788711710826083]
We propose an alternative to common data augmentation methods, applying it to the problem of crop/weed segmentation in precision farming.
We create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts.
In addition to RGB data, we take into account also near-infrared (NIR) information, generating four channel multi-spectral synthetic images.
arXiv Detail & Related papers (2020-09-12T08:49:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.