Towards robustness under occlusion for face recognition
- URL: http://arxiv.org/abs/2109.09083v1
- Date: Sun, 19 Sep 2021 08:27:57 GMT
- Title: Towards robustness under occlusion for face recognition
- Authors: Tomas M. Borges and Teofilo E. de Campos and Ricardo de Queiroz
- Abstract summary: In this paper, we evaluate the effects of occlusions on the performance of a face recognition pipeline that uses a ResNet backbone.
We designed 8 different occlusion masks which were applied to the input images.
In order to increase robustness under occlusions, we followed two approaches. The first is image inpainting using the pre-trained pluralistic image completion network.
The second is Cutmix, a regularization strategy consisting of mixing training images and their labels using rectangular patches.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we evaluate the effects of occlusions on the performance of a
face recognition pipeline that uses a ResNet backbone. The classifier was
trained on a subset of the CelebA-HQ dataset containing 5,478 images from 307
classes, to achieve top-1 error rate of 17.91%. We designed 8 different
occlusion masks which were applied to the input images. This caused a
significant drop in classifier performance: the error rate under each mask was
at least twice the unoccluded baseline. To increase robustness
under occlusions, we followed two approaches. The first is image inpainting
using the pre-trained pluralistic image completion network. The second is
Cutmix, a regularization strategy consisting of mixing training images and
their labels using rectangular patches, making the classifier more robust
against input corruptions. Both strategies proved effective and yielded
interesting results. In particular, the Cutmix approach makes the network more
robust without requiring additional steps at application time, though its
training time is considerably longer. Our datasets containing the different
occlusion masks, as well as their inpainted counterparts, are made publicly
available to promote research in the field.
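For illustration, the sketch below mimics the two image-level operations described in the abstract: applying a rectangular occlusion mask to an input image, and Cutmix-style mixing of two training images and their labels. This is a minimal NumPy sketch, not code from the paper; the function names, mask geometry, image size, and the Beta(1, 1) prior on the mixing ratio are assumptions made here for illustration.

```python
import numpy as np

def apply_occlusion_mask(image, top, left, height, width, fill=0.0):
    """Fill a rectangular region of an H x W x C image with a constant value,
    simulating an occlusion mask applied to the classifier input."""
    occluded = image.copy()
    occluded[top:top + height, left:left + width, :] = fill
    return occluded

def cutmix(image_a, label_a, image_b, label_b, rng):
    """CutMix-style augmentation: paste a random rectangle from image_b onto
    image_a and mix the one-hot labels in proportion to the pasted area."""
    h, w, _ = image_a.shape
    lam = rng.beta(1.0, 1.0)                         # assumed Beta(1, 1) mixing prior
    cut_h = int(h * np.sqrt(1.0 - lam))              # rectangle size from the ratio
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)  # rectangle centre
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = image_a.copy()
    mixed[y1:y2, x1:x2, :] = image_b[y1:y2, x1:x2, :]
    # Recompute the mixing ratio from the actual (clipped) pasted area.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed, mixed_label

# Example usage with random arrays standing in for CelebA-HQ crops (307 classes).
rng = np.random.default_rng(0)
img_a, img_b = rng.random((224, 224, 3)), rng.random((224, 224, 3))
lab_a, lab_b = np.eye(307)[0], np.eye(307)[1]        # one-hot labels
occluded = apply_occlusion_mask(img_a, top=80, left=60, height=64, width=96)
mixed_img, mixed_lab = cutmix(img_a, lab_a, img_b, lab_b, rng)
```

In the pipeline described above, occluded (or inpainted) images would be fed to the ResNet classifier at test time, while the mixing step is applied only during training.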
Related papers
- Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning [49.275450836604726]
We present a novel frequency-based Self-Supervised Learning (SSL) approach that significantly enhances its efficacy for pre-training.
We employ a two-branch framework empowered by knowledge distillation, enabling the model to take both the filtered and original images as input.
arXiv Detail & Related papers (2024-09-16T15:10:07Z)
- Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z)
- Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
We aim to alleviate the burden of incorporating the masking operation into the contrastive-learning framework for convolutional neural networks.
We propose to explicitly take the saliency constraint into consideration in which the masked regions are more evenly distributed among the foreground and background.
arXiv Detail & Related papers (2023-09-22T09:58:38Z)
- Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [59.923672191632065]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT)
MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies.
Our results demonstrate that MaPeT achieves competitive performance on ImageNet.
arXiv Detail & Related papers (2023-06-12T18:12:19Z)
- Efficient Masked Autoencoders with Self-Consistency [34.7076436760695]
Masked image modeling (MIM) has been recognized as a strong self-supervised pre-training method in computer vision.
We propose efficient masked autoencoders with self-consistency (EMAE) to improve the pre-training efficiency.
EMAE consistently obtains state-of-the-art transfer ability on a variety of downstream tasks, such as image classification, object detection, and semantic segmentation.
arXiv Detail & Related papers (2023-02-28T09:21:12Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- Mix-up Self-Supervised Learning for Contrast-agnostic Applications [33.807005669824136]
We present the first mix-up self-supervised learning framework for contrast-agnostic applications.
We address the low variance across images based on cross-domain mix-up and build the pretext task based on image reconstruction and transparency prediction.
arXiv Detail & Related papers (2022-04-02T16:58:36Z)
- Observations on K-image Expansion of Image-Mixing Augmentation for Classification [33.99556142456945]
This paper derives a new K-image mixing augmentation based on the stick-breaking process under Dirichlet prior.
We show, through extensive experiments and analysis of classification accuracy, the shape of the loss landscape, and adversarial robustness, that our method trains more robust and generalized classifiers than the usual two-image methods.
arXiv Detail & Related papers (2021-10-08T16:58:20Z)
- Suppressing Uncertainties for Large-Scale Facial Expression Recognition [81.51495681011404]
This paper proposes a simple yet efficient Self-Cure Network (SCN) which suppresses the uncertainties efficiently and prevents deep networks from over-fitting uncertain facial images.
Results on public benchmarks demonstrate that our SCN outperforms current state-of-the-art methods with 88.14% on RAF-DB, 60.23% on AffectNet, and 89.35% on FERPlus.
arXiv Detail & Related papers (2020-02-24T17:24:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.