Designing and Generating Diverse, Equitable Face Image Datasets for Face Verification Tasks
- URL: http://arxiv.org/abs/2511.17393v1
- Date: Fri, 21 Nov 2025 16:53:08 GMT
- Title: Designing and Generating Diverse, Equitable Face Image Datasets for Face Verification Tasks
- Authors: Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos
- Abstract summary: We propose a comprehensive methodology that integrates advanced generative models to create varied and diverse high-quality synthetic face images. This work not only enriches the discussion on diversity and ethics in artificial intelligence but also lays the foundation for developing more inclusive and reliable face verification technologies.
- Score: 16.06801103297664
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Face verification is a significant component of identity authentication in various applications including online banking and secure access to personal devices. The majority of the existing face image datasets often suffer from notable biases related to race, gender, and other demographic characteristics, limiting the effectiveness and fairness of face verification systems. In response to these challenges, we propose a comprehensive methodology that integrates advanced generative models to create varied and diverse high-quality synthetic face images. This methodology emphasizes the representation of a diverse range of facial traits, ensuring adherence to characteristics permissible in identity card photographs. Furthermore, we introduce the Diverse and Inclusive Faces for Verification (DIF-V) dataset, comprising 27,780 images of 926 unique identities, designed as a benchmark for future research in face verification. Our analysis reveals that existing verification models exhibit biases toward certain genders and races, and notably, applying identity style modifications negatively impacts model performance. By tackling the inherent inequities in existing datasets, this work not only enriches the discussion on diversity and ethics in artificial intelligence but also lays the foundation for developing more inclusive and reliable face verification technologies.
Related papers
- ID$^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition [60.15830516741776]
Synthetic face recognition (SFR) aims to generate datasets that mimic the distribution of real face data.
We introduce a diffusion-fueled SFR model termed $\text{ID}^3$.
$\text{ID}^3$ employs an ID-preserving loss to generate diverse yet identity-consistent facial appearances.
arXiv Detail & Related papers (2024-09-26T06:46:40Z) - Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration [11.451395489475647]
We explore ethnicity alteration and skin tone modification using synthetic face image generation methods to increase the diversity of datasets.
We conduct a detailed analysis by first constructing a balanced face image dataset representing three ethnicities: Asian, Black, and Indian.
We then make use of existing Generative Adversarial Network-based (GAN) image-to-image translation and manifold learning models to alter the ethnicity from one to another.
arXiv Detail & Related papers (2024-05-02T13:31:09Z) - SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes [14.966767182001755]
We propose a methodology for generating synthetic face image datasets that capture a broader spectrum of facial diversity.
Specifically, our approach integrates demographics and biometrics but also non-permanent traits like make-up, hairstyle, and accessories.
These prompts guide a state-of-the-art text-to-image model in generating a comprehensive dataset of high-quality realistic images.
arXiv Detail & Related papers (2024-04-26T08:51:31Z) - ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving [64.90148669690228]
ConsistentID is an innovative method crafted for diverse identity-preserving portrait generation under fine-grained multimodal facial prompts. We present a fine-grained portrait dataset, FGID, with over 500,000 facial images, offering greater diversity and comprehensiveness than existing public facial datasets.
arXiv Detail & Related papers (2024-04-25T17:23:43Z) - DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z) - StyleID: Identity Disentanglement for Anonymizing Faces [4.048444203617942]
The main contribution of the paper is the design of a feature-preserving anonymization framework, StyleID.
As part of the contribution, we present a novel disentanglement metric, three complementing disentanglement methods, and new insights into identity disentanglement.
StyleID provides tunable privacy, has low computational complexity, and is shown to outperform current state-of-the-art solutions.
arXiv Detail & Related papers (2022-12-28T12:04:24Z) - CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO presents an improvement in facial expression recognition performance over six different datasets with very unique affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z) - SynFace: Face Recognition with Synthetic Data [83.15838126703719]
We devise the SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap.
We also perform a systematically empirical analysis on synthetic face images to provide some insights on how to effectively utilize synthetic data for face recognition.
arXiv Detail & Related papers (2021-08-18T03:41:54Z) - IdentityDP: Differential Private Identification Protection for Face Images [17.33916392050051]
Face de-identification, also known as face anonymization, refers to generating another image with similar appearance and the same background, while the real identity is hidden.
We propose IdentityDP, a face anonymization framework that combines a data-driven deep neural network with a differential privacy mechanism.
Our model can effectively obfuscate the identity-related information of faces, preserve significant visual similarity, and generate high-quality images.
arXiv Detail & Related papers (2021-03-02T14:26:00Z) - DotFAN: A Domain-transferred Face Augmentation Network for Pose and Illumination Invariant Face Recognition [94.96686189033869]
We propose a 3D model-assisted domain-transferred face augmentation network (DotFAN).
DotFAN can generate a series of variants of an input face based on the knowledge distilled from existing rich face datasets collected from other domains.
Experiments show that DotFAN is beneficial for augmenting small face datasets to improve their within-class diversity.
arXiv Detail & Related papers (2020-02-23T08:16:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.