On the Biometric Capacity of Generative Face Models
- URL: http://arxiv.org/abs/2308.02065v1
- Date: Thu, 3 Aug 2023 22:21:04 GMT
- Title: On the Biometric Capacity of Generative Face Models
- Authors: Vishnu Naresh Boddeti and Gautam Sreekumar and Arun Ross
- Abstract summary: This paper proposes a statistical approach to estimate the biometric capacity of generated face images.
We apply our approach to multiple generative models, including StyleGAN, Latent Diffusion Model, and "Generated Photos".
Capacity estimates indicate that (a) under the ArcFace representation at a false acceptance rate (FAR) of 0.1%, StyleGAN3 and DCFace have capacity upper bounds of $1.43\times10^6$ and $1.190\times10^4$, respectively.
- Score: 23.66662504163745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been tremendous progress in generating realistic faces with high
fidelity over the past few years. Despite this progress, a crucial question
remains unanswered: "Given a generative face model, how many unique identities
can it generate?" In other words, what is the biometric capacity of the
generative face model? A scientific basis for answering this question would
aid in evaluating and comparing different generative face models and in
establishing an upper bound on their scalability. This paper proposes a
statistical approach
to estimate the biometric capacity of generated face images in a hyperspherical
feature space. We employ our approach on multiple generative models, including
unconditional generators like StyleGAN, Latent Diffusion Model, and "Generated
Photos," as well as DCFace, a class-conditional generator. We also estimate
capacity w.r.t. demographic attributes such as gender and age. Our capacity
estimates indicate that (a) under ArcFace representation at a false acceptance
rate (FAR) of 0.1%, StyleGAN3 and DCFace have a capacity upper bound of
$1.43\times10^6$ and $1.190\times10^4$, respectively; (b) the capacity reduces
drastically as we lower the desired FAR with an estimate of $1.796\times10^4$
and $562$ at FAR of 1% and 10%, respectively, for StyleGAN3; (c) there is no
discernible disparity in the capacity w.r.t. gender; and (d) for some generative
models, there is an appreciable disparity in the capacity w.r.t age. Code is
available at https://github.com/human-analysis/capacity-generative-face-models.
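The capacity estimate in the abstract is defined in a hyperspherical feature space. A simplified way to picture it is as a packing ratio: the fraction of the unit hypersphere's surface occupied by the whole population of face embeddings, divided by the fraction occupied by a single identity at the chosen FAR. The sketch below illustrates that ratio using the closed form for spherical-cap areas; it is an illustrative approximation under assumed cap angles, not the authors' exact estimator, and the angles and embedding dimension are hypothetical.

```python
import numpy as np
from scipy.special import betainc

def cap_area_fraction(theta, d):
    """Fraction of the unit sphere in R^d covered by a spherical cap
    of half-angle theta (0 < theta <= pi/2), via the closed form
    0.5 * I_{sin^2(theta)}((d-1)/2, 1/2) with I the regularized
    incomplete beta function."""
    return 0.5 * betainc((d - 1) / 2.0, 0.5, np.sin(theta) ** 2)

def capacity_upper_bound(theta_pop, theta_id, d):
    """Packing-style upper bound on distinguishable identities:
    ratio of the population cap area to one identity's cap area.
    theta_id shrinks as the desired FAR is lowered, so the bound
    grows with stricter FAR thresholds."""
    return cap_area_fraction(theta_pop, d) / cap_area_fraction(theta_id, d)

# Hypothetical angles; d=512 matches common ArcFace embedding size.
estimate = capacity_upper_bound(np.deg2rad(60.0), np.deg2rad(20.0), 512)
```

In this toy picture, lowering the FAR widens the per-identity cap angle `theta_id`, which shrinks the ratio, consistent with the abstract's observation that capacity drops drastically at more permissive FAR values.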
Related papers
- Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers? [39.31679737754048]
We show that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal.
Our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components.
arXiv Detail & Related papers (2024-05-28T10:25:06Z)
- CapHuman: Capture Your Moments in Parallel Universes [60.06408546134581]
We present a new framework named CapHuman.
CapHuman encodes identity features and then learns to align them into the latent space.
We introduce the 3D facial prior to equip our model with control over the human head in a flexible and 3D-consistent manner.
arXiv Detail & Related papers (2024-02-01T14:41:59Z)
- SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z)
- DCFace: Synthetic Face Generation with Dual Condition Diffusion Model [18.662943303044315]
We propose a Dual Condition Face Generator (DCFace) based on a diffusion model.
Our novel Patch-wise style extractor and Time-step dependent ID loss enable DCFace to consistently produce face images of the same subject under different styles with precise control.
arXiv Detail & Related papers (2023-04-14T11:31:49Z)
- One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2 [0.7614628596146599]
We propose an end-to-end framework for simultaneously supporting face edits, facial motions and deformations, and facial identity control for video generation.
We employ the StyleGAN2 generator to achieve high-fidelity face video re-enactment at $1024^2$ resolution.
arXiv Detail & Related papers (2023-02-15T18:34:15Z)
- Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2 [47.64219291655723]
We conduct a comparison of three popular systems including Stable Diffusion, Midjourney, and DALL-E 2 in their ability to generate photorealistic faces in the wild.
We find that Stable Diffusion generates better faces than the other systems, according to the FID score.
We also introduce a dataset of generated faces in the wild dubbed GFW, including a total of 15,076 faces.
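The FID score used for the comparison above measures the Fréchet distance between two Gaussians fitted to deep feature activations (typically Inception-v3 pool features) of real and generated images. A minimal sketch of the distance itself, assuming the feature means and covariances have already been computed:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, cov1, mu2, cov2):
    """Frechet distance between Gaussians N(mu1, cov1) and N(mu2, cov2):
    ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 * (cov1 @ cov2)^{1/2})."""
    covmean = sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        # sqrtm can return tiny imaginary components from numerical noise
        covmean = covmean.real
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```

Lower is better: identical feature distributions give an FID of zero, which is why the system with the smallest score is judged to generate the most photorealistic faces.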
arXiv Detail & Related papers (2022-10-02T17:53:08Z)
- 3DMM-RF: Convolutional Radiance Fields for 3D Face Modeling [111.98096975078158]
We introduce a style-based generative network that synthesizes in one pass all and only the required rendering samples of a neural radiance field.
We show that this model can accurately be fit to "in-the-wild" facial images of arbitrary pose and illumination, extract the facial characteristics, and be used to re-render the face in controllable conditions.
arXiv Detail & Related papers (2022-09-15T15:28:45Z)
- Multiface: A Dataset for Neural Face Rendering [108.44505415073579]
In this work, we present Multiface, a new multi-view, high-resolution human face dataset.
We introduce Mugsy, a large scale multi-camera apparatus to capture high-resolution synchronized videos of a facial performance.
The goal of Multiface is to close the gap in accessibility to high quality data in the academic community and to enable research in VR telepresence.
arXiv Detail & Related papers (2022-07-22T17:55:39Z)
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models [73.12069620086311]
We investigate the visual reasoning capabilities and social biases of text-to-image models.
First, we measure three visual reasoning skills: object recognition, object counting, and spatial relation understanding.
Second, we assess the gender and skin tone biases by measuring the gender/skin tone distribution of generated images.
arXiv Detail & Related papers (2022-02-08T18:36:52Z)
- Cluster-guided Image Synthesis with Unconditional Models [41.89334167530054]
This work focuses on controllable image generation by leveraging GANs that are well-trained in an unsupervised fashion.
By conditioning on the cluster assignments, the proposed method is able to control the semantic class of the generated image.
We showcase the efficacy of our approach on faces (CelebA-HQ and FFHQ), animals (Imagenet) and objects (LSUN) using different pre-trained generative models.
arXiv Detail & Related papers (2021-12-24T02:18:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.