Ear2Face: Deep Biometric Modality Mapping
- URL: http://arxiv.org/abs/2006.01943v1
- Date: Tue, 2 Jun 2020 21:14:27 GMT
- Title: Ear2Face: Deep Biometric Modality Mapping
- Authors: Dogucan Yaman, Fevziye Irem Eyiokur, Hazım Kemal Ekenel
- Abstract summary: We present an end-to-end deep neural network model that learns a mapping between the biometric modalities.
We formulated the problem as a paired image-to-image translation task and collected datasets of ear and face image pairs.
We have achieved very promising results, especially on the FERET dataset, generating visually appealing face images from ear image inputs.
- Score: 9.560980936110234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore the correlation between different visual biometric
modalities. For this purpose, we present an end-to-end deep neural network
model that learns a mapping between the biometric modalities. Namely, our goal
is to generate a frontal face image of a subject given his/her ear image as the
input. We formulated the problem as a paired image-to-image translation task
and collected datasets of ear and face image pairs from the Multi-PIE and FERET
datasets to train our GAN-based models. We employed feature reconstruction and
style reconstruction losses in addition to adversarial and pixel losses. We
evaluated the proposed method both in terms of reconstruction quality and in
terms of person identification accuracy. To assess the generalization
capability of the learned mapping models, we also ran cross-dataset
experiments. That is, we trained the model on the FERET dataset and tested it
on the Multi-PIE dataset and vice versa. We have achieved very promising
results, especially on the FERET dataset, generating visually appealing face
images from ear image inputs. Moreover, we attained a very high cross-modality
person identification performance, for example, reaching 90.9% Rank-10
identification accuracy on the FERET dataset.
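The abstract describes a generator trained with four objectives: an adversarial loss, a pixel loss, a feature-reconstruction loss, and a style-reconstruction loss. As a rough illustration of how such a combined objective is typically assembled, here is a minimal PyTorch-style sketch; the VGG-16 backbone truncated at relu3_3, the loss weights, and the module names are assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

class VGGFeatures(nn.Module):
    """Frozen VGG-16 truncated at relu3_3; an assumed perceptual backbone."""
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16]
        for p in vgg.parameters():
            p.requires_grad = False
        self.vgg = vgg.eval()

    def forward(self, x):
        return self.vgg(x)

def gram_matrix(feat):
    # Style representation: normalized channel-wise feature correlations.
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def generator_loss(fake_face, real_face, disc, vgg,
                   w_adv=1.0, w_pix=100.0, w_feat=10.0, w_style=250.0):
    """Combined objective; the weights here are illustrative guesses."""
    # Adversarial term: push the discriminator to score the fake as real.
    logits = disc(fake_face)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # Pixel loss: L1 distance to the ground-truth frontal face.
    pix = F.l1_loss(fake_face, real_face)
    # Feature-reconstruction loss: match deep perceptual features.
    f_fake, f_real = vgg(fake_face), vgg(real_face)
    feat = F.mse_loss(f_fake, f_real)
    # Style-reconstruction loss: match Gram matrices of those features.
    style = F.mse_loss(gram_matrix(f_fake), gram_matrix(f_real))
    return w_adv * adv + w_pix * pix + w_feat * feat + w_style * style
```

In a paired image-to-image setup of this kind, the ear image is fed to the generator to produce fake_face; whether the discriminator also conditions on the ear input is omitted here. The reported Rank-10 identification accuracy can likewise be sketched, assuming identity embeddings from some face recognition model and cosine-similarity matching (both assumptions, as the paper's matching protocol is not spelled out in the abstract):

```python
def rank_k_accuracy(gallery_embs, gallery_ids, probe_embs, probe_ids, k=10):
    # A probe counts as correct at rank k if its true identity appears
    # among the k most similar gallery entries (cosine similarity).
    g = F.normalize(gallery_embs, dim=1)
    p = F.normalize(probe_embs, dim=1)
    sims = p @ g.T                           # (num_probes, num_gallery)
    topk = sims.topk(k, dim=1).indices       # k best gallery matches per probe
    hits = (gallery_ids[topk] == probe_ids.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()
```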
Related papers
- Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy [3.2381492754749632]
We present a multimodal fusion-based deep learning model that utilizes unstructured data and structured data to detect facial palsy.
Our model slightly improved the precision score to 77.05 at the expense of a decrease in the recall score.
arXiv Detail & Related papers (2024-05-26T09:16:34Z) - Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z) - MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for
Facial Expression Recognition [11.820043444385432]
We introduce a novel FER training paradigm named Mask Image pre-training with MIx Contrastive fine-tuning (MIMIC).
In the initial phase, we pre-train the ViT via masked image reconstruction on general images.
In the fine-tuning stage, we introduce a mix-supervised contrastive learning process, which enhances the model with a more extensive range of positive samples.
arXiv Detail & Related papers (2024-01-14T10:30:32Z) - Attribute-preserving Face Dataset Anonymization via Latent Code
Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representations in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method can anonymize the identity of the images while, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - The FaceChannel: A Fast & Furious Deep Neural Network for Facial
Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a lightweight neural network with far fewer parameters than common deep neural networks.
We demonstrate that our model achieves performance comparable, if not superior, to the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric
Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, which contains images captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z) - Deep Multi-Facial Patches Aggregation Network For Facial Expression
Recognition [5.735035463793008]
We propose an approach for Facial Expression Recognition (FER) based on a deep multi-facial patches aggregation network.
Deep features are learned from facial patches using deep sub-networks and aggregated within one deep architecture for expression classification.
arXiv Detail & Related papers (2020-02-20T17:57:06Z) - Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel method that jointly learns facial expression synthesis and recognition for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.