Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real
- URL: http://arxiv.org/abs/2512.15774v1
- Date: Sat, 13 Dec 2025 15:20:08 GMT
- Title: Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real
- Authors: Yan Yang, George Bebis, Mircea Nicolescu,
- Abstract summary: We propose a two-step generative data augmentation framework that combines rule-based mask warping with unpaired image-to-image translation using GANs. Compared to rule-based warping alone, the proposed approach yields consistent qualitative improvements. We introduce a non-mask preservation loss and stochastic noise injection to stabilize training and enhance sample diversity.
- Score: 5.1215389305751735
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Data scarcity and distribution shift pose major challenges for masked face detection and recognition. We propose a two-step generative data augmentation framework that combines rule-based mask warping with unpaired image-to-image translation using GANs, enabling the generation of realistic masked-face samples beyond purely synthetic transformations. Compared to rule-based warping alone, the proposed approach yields consistent qualitative improvements and complements existing GAN-based masked face generation methods such as IAMGAN. We introduce a non-mask preservation loss and stochastic noise injection to stabilize training and enhance sample diversity. Experimental observations highlight the effectiveness of the proposed components and suggest directions for future improvements in data-centric augmentation for face recognition tasks.
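As a rough illustration of the two stabilizing components named in the abstract, the NumPy sketch below shows what a non-mask preservation loss (an L1 penalty restricted to pixels outside the mask region, keeping the unmasked face area of the generated image close to the input) and stochastic noise injection could look like. The function names and exact formulation are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def non_mask_preservation_loss(real, generated, mask_region):
    """L1 penalty over pixels NOT covered by the mask, so the
    generator is free to edit the mask area but preserves the rest."""
    outside = ~mask_region  # boolean array: True where there is no mask
    return np.abs(real[outside] - generated[outside]).mean()

def inject_noise(latent, sigma=0.1, rng=None):
    """Additive Gaussian noise on an intermediate representation,
    one simple way to diversify generated samples."""
    rng = np.random.default_rng() if rng is None else rng
    return latent + rng.normal(0.0, sigma, size=latent.shape)
```

In this sketch the loss is zero whenever the generated image matches the input everywhere outside the mask, regardless of what happens inside it.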
Related papers
- Diffusion-Guided Mask-Consistent Paired Mixing for Endoscopic Image Segmentation [57.37991748282666]
We propose a paired, diffusion-guided paradigm that fuses the strengths of sample mixing and diffusion synthesis. For each real image, a synthetic counterpart is generated under the same mask, and the pair is used as a controllable input for Mask-Consistent Paired Mixing (MCPMix). This produces a continuous family of intermediate samples that smoothly bridges synthetic and real appearances under shared geometry.
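The mixing step described above can be pictured as a convex combination of a real image and its mask-aligned synthetic counterpart: because both images share the same segmentation mask, the labels stay fixed while appearance interpolates. The sketch below is an illustration of that idea only, not the paper's MCPMix.

```python
import numpy as np

def mix_pair(real_img, synth_img, lam):
    """Convex combination of a real image and a synthetic counterpart
    generated under the same mask; lam=1 recovers the real image."""
    assert 0.0 <= lam <= 1.0
    return lam * real_img + (1.0 - lam) * synth_img

def mix_family(real_img, synth_img, num=5):
    """A small family of intermediate samples bridging synthetic
    and real appearances under the shared geometry."""
    return [mix_pair(real_img, synth_img, l) for l in np.linspace(0.0, 1.0, num)]
```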
arXiv Detail & Related papers (2025-11-05T06:14:19Z)
- Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z)
- DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis [71.40724659748787]
DiffusionFace is the first diffusion-based face forgery dataset.
It covers various forgery categories, including unconditional and text-guided facial image generation, Img2Img, Inpaint, and diffusion-based facial exchange algorithms.
It provides essential metadata and a real-world internet-sourced forgery facial image dataset for evaluation.
arXiv Detail & Related papers (2024-03-27T11:32:44Z)
- DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
- Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation [100.60938767993088]
One-shot face re-enactment is a challenging task due to the identity mismatch between source and driving faces.
This paper proposes to use a 3D Morphable Model (3DMM) for explicit facial semantic decomposition and identity disentanglement.
arXiv Detail & Related papers (2022-11-23T03:02:34Z)
- Calibrated Hyperspectral Image Reconstruction via Graph-based Self-Tuning Network [40.71031760929464]
Hyperspectral imaging (HSI) has attracted increasing research attention, especially methods based on the coded aperture snapshot spectral imaging (CASSI) system.
Existing deep HSI reconstruction models are generally trained on paired data to retrieve original signals upon 2D compressed measurements given by a particular optical hardware mask in CASSI.
This mask-specific training style will lead to a hardware miscalibration issue, which sets up barriers to deploying deep HSI models among different hardware and noisy environments.
We propose a novel Graph-based Self-Tuning (GST) network to reason about uncertainties while adapting to varying spatial structures of masks across different hardware.
arXiv Detail & Related papers (2021-12-31T09:39:13Z)
- Development of a face mask detection pipeline for mask-wearing monitoring in the era of the COVID-19 pandemic: A modular approach [0.0]
During the SARS-CoV-2 pandemic, mask-wearing became an effective tool to prevent spreading and contracting the virus.
The ability to monitor the mask-wearing rate in the population would be useful for determining public health strategies against the virus.
We present a two-step face mask detection approach consisting of two separate modules: 1) face detection and alignment and 2) face mask classification.
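The modular structure described above can be sketched with one placeholder function per module; the detector and classifier below are trivial stand-ins (the whole image as a single "face", a brightness threshold as the classifier), not the paper's models.

```python
def detect_faces(image):
    """Stand-in for module 1 (face detection and alignment):
    returns a list of (box, aligned_crop) pairs."""
    h, w = len(image), len(image[0])
    return [((0, 0, w, h), image)]  # treat the whole image as one face

def classify_mask(crop):
    """Stand-in for module 2 (face mask classification):
    a trivial mean-brightness rule instead of a trained CNN."""
    flat = [v for row in crop for v in row]
    return "masked" if sum(flat) / len(flat) > 0.5 else "unmasked"

def mask_monitoring(image):
    """Run the two modules in sequence, as in the modular pipeline."""
    return [(box, classify_mask(crop)) for box, crop in detect_faces(image)]
```

Because the two modules only communicate through `(box, crop)` pairs, either one can be swapped for a stronger model without touching the other, which is the point of the modular design.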
arXiv Detail & Related papers (2021-12-30T12:32:33Z)
- Mask-invariant Face Recognition through Template-level Knowledge Distillation [3.727773051465455]
Masks degrade the performance of existing face recognition systems.
We propose a mask-invariant face recognition solution (MaskInv).
In addition to the distilled knowledge, the student network benefits from additional guidance by margin-based identity classification loss.
arXiv Detail & Related papers (2021-12-10T16:19:28Z)
- Unmasking Face Embeddings by Self-restrained Triplet Loss for Accurate Masked Face Recognition [6.865656740940772]
We present a solution to improve the masked face recognition performance.
Specifically, we propose the Embedding Unmasking Model (EUM) operated on top of existing face recognition models.
We also propose a novel loss function, the Self-restrained Triplet (SRT) loss, which enables the EUM to produce embeddings similar to those of unmasked faces of the same identities.
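For intuition, a standard triplet loss is sketched below: it pulls the (unmasked) embedding of a masked face toward an unmasked embedding of the same identity and away from a different identity. The paper's Self-restrained Triplet (SRT) loss modifies this formulation to limit how far embeddings are pushed, which is not reproduced here.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss: encourage d(anchor, positive) to be at
    least `margin` smaller than d(anchor, negative)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)
```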
arXiv Detail & Related papers (2021-03-02T13:43:11Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy or quality of the information presented and is not responsible for any consequences of its use.