Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers
- URL: http://arxiv.org/abs/2311.02803v1
- Date: Mon, 6 Nov 2023 00:11:24 GMT
- Title: Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers
- Authors: Hai Phan, Cindy Le, Vu Le, Yihui He, Anh Totti Nguyen
- Abstract summary: We propose a novel 2-image Vision Transformer (ViT) that compares two images at the patch level using cross-attention.
Our model achieves accuracy comparable to DeepFace-EMD on out-of-distribution data at an inference speed more than twice as fast as DeepFace-EMD.
- Score: 5.987804054392297
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Most face identification approaches employ a Siamese neural network to compare two images at the image-embedding level. Yet, this technique is vulnerable to occlusion (e.g., faces with masks or sunglasses) and to out-of-distribution data. DeepFace-EMD (Phan et al. 2022) reaches state-of-the-art accuracy on out-of-distribution data by first comparing two images at the image level, and then at the patch level. Yet, its subsequent patch-wise re-ranking stage admits a large $O(n^3 \log n)$ time complexity (for $n$ patches in an image) due to the optimal transport optimization. In this paper, we propose a novel 2-image Vision Transformer (ViT) that compares two images at the patch level using cross-attention. After training on 2M pairs of images from CASIA Webface (Yi et al. 2014), our model achieves accuracy comparable to DeepFace-EMD on out-of-distribution data, yet at an inference speed more than twice as fast as DeepFace-EMD (Phan et al. 2022). In addition, via a human study, our model shows promising explainability through the visualization of cross-attention. We believe our work can inspire more explorations in using ViTs for face identification.
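A minimal sketch of the core idea, letting the patches of one face attend to the patches of the other via cross-attention, is given below. It is an illustrative assumption, not the authors' released architecture: the module name, embedding size, head count, and pooling are hypothetical; only the patch-level cross-attention comparison follows the abstract. Note that a cross-attention layer costs $O(n^2)$ in the number of patches $n$, which is asymptotically cheaper than the $O(n^3 \log n)$ optimal-transport re-ranking it replaces.

```python
# Minimal, hypothetical sketch of a 2-image patch-level comparator using
# cross-attention (PyTorch). Names, dimensions, and pooling are assumptions;
# only the idea of comparing two faces patch-by-patch follows the abstract.
import torch
import torch.nn as nn

class CrossAttentionComparator(nn.Module):
    def __init__(self, dim=384, heads=6):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, 1)  # logit for "same identity"

    def forward(self, patches_a, patches_b):
        # patches_a, patches_b: (batch, n_patches, dim) patch embeddings from a ViT backbone
        attended, attn_weights = self.cross_attn(patches_a, patches_b, patches_b)
        fused = self.norm(patches_a + attended)            # residual fusion of the two images
        logit = self.head(fused.mean(dim=1)).squeeze(-1)   # average-pool patches, then classify
        return logit, attn_weights                         # attention map is usable for visualization

# Toy usage: 7x7 = 49 patches per face, 384-dim embeddings
model = CrossAttentionComparator()
a, b = torch.randn(2, 49, 384), torch.randn(2, 49, 384)
logit, attn = model(a, b)
print(logit.shape, attn.shape)  # torch.Size([2]) torch.Size([2, 49, 49])
```

Because each query patch produces a distribution over the other image's patches, the attention map itself provides the kind of patch-to-patch correspondence that the abstract describes visualizing for explainability.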
Related papers
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
- Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors [19.02273216268032]
Vec2Face is a holistic model that uses only a sampled vector as input.
Vec2Face is supervised by face image reconstruction and can be conveniently used in inference.
Vec2Face has efficiently synthesized as many as 300K identities with 15 million total images.
arXiv Detail & Related papers (2024-09-04T17:59:51Z)
- GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance [83.43852715997596]
GSmoothFace is a novel two-stage generalized talking face generation model guided by a fine-grained 3D face model.
It can synthesize smooth lip dynamics while preserving the speaker's identity.
Both quantitative and qualitative experiments confirm the superiority of our method in terms of realism, lip synchronization, and visual quality.
arXiv Detail & Related papers (2023-12-12T16:00:55Z)
- Generating 2D and 3D Master Faces for Dictionary Attacks with a Network-Assisted Latent Space Evolution [68.8204255655161]
A master face is a face image that passes face-based identity authentication for a high percentage of the population.
We optimize these faces for 2D and 3D face verification models.
In 3D, we generate faces using the 2D StyleGAN2 generator and predict a 3D structure using a deep 3D face reconstruction network.
arXiv Detail & Related papers (2022-11-25T09:15:38Z)
- Multiface: A Dataset for Neural Face Rendering [108.44505415073579]
In this work, we present Multiface, a new multi-view, high-resolution human face dataset.
We introduce Mugsy, a large-scale multi-camera apparatus to capture high-resolution synchronized videos of a facial performance.
The goal of Multiface is to close the gap in accessibility to high quality data in the academic community and to enable research in VR telepresence.
arXiv Detail & Related papers (2022-07-22T17:55:39Z)
- DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance Improves Out-Of-Distribution Face Identification [19.20353547123292]
Face identification (FI) is ubiquitous and drives many high-stakes decisions made by law enforcement.
State-of-the-art FI approaches compare two images by taking the cosine similarity between their image embeddings.
Here, we propose a re-ranking approach that compares two faces using the Earth Mover's Distance on the deep, spatial features of image patches.
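For context, a toy sketch of this two-stage idea (rank gallery faces by cosine similarity of global embeddings, then re-rank the top candidates by a patch-wise Earth Mover's Distance) is shown below. It is not the released DeepFace-EMD code: it assumes uniform patch weights, under which the optimal transport reduces to a linear assignment, whereas DeepFace-EMD uses non-uniform, feature-based patch weights; all function and variable names here are hypothetical.

```python
# Toy, hypothetical sketch of cosine ranking + patch-wise EMD re-ranking.
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def patch_emd(pa, pb):
    # pa, pb: (n_patches, dim) patch features; transport cost = 1 - cosine similarity
    pa = pa / np.linalg.norm(pa, axis=1, keepdims=True)
    pb = pb / np.linalg.norm(pb, axis=1, keepdims=True)
    cost = 1.0 - pa @ pb.T
    rows, cols = linear_sum_assignment(cost)   # optimal patch matching under uniform weights
    return cost[rows, cols].mean()

def identify(query_emb, query_patches, gallery, k=10):
    # gallery: list of (label, global_embedding, patch_features)
    ranked = sorted(gallery, key=lambda g: -cosine(query_emb, g[1]))         # stage 1: cosine ranking
    topk = sorted(ranked[:k], key=lambda g: patch_emd(query_patches, g[2]))  # stage 2: EMD re-ranking
    return [g[0] for g in topk] + [g[0] for g in ranked[k:]]
```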
arXiv Detail & Related papers (2021-12-07T22:04:53Z)
- FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks [0.7043489166804575]
We present FaceTuneGAN, a new 3D face model representation decomposing and encoding separately facial identity and facial expression.
We propose a first adaptation of image-to-image translation networks, that have successfully been used in the 2D domain, to 3D face geometry.
arXiv Detail & Related papers (2021-12-01T14:42:03Z)
- One Shot Face Swapping on Megapixels [65.47443090320955]
This paper proposes the first megapixel-level method for one-shot face swapping (MegaFS for short).
Complete face representation, stable training, and limited memory usage are the three novel contributions to the success of our method.
arXiv Detail & Related papers (2021-05-11T10:41:47Z)
- Facial Masks and Soft-Biometrics: Leveraging Face Recognition CNNs for Age and Gender Prediction on Mobile Ocular Images [53.913598771836924]
We address the use of selfie ocular images captured with smartphones to estimate age and gender.
We adapt two existing lightweight CNNs proposed in the context of the ImageNet Challenge.
Some networks are further pre-trained for face recognition, for which very large training databases are available.
arXiv Detail & Related papers (2021-03-31T01:48:29Z)