DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance
Improves Out-Of-Distribution Face Identification
- URL: http://arxiv.org/abs/2112.04016v1
- Date: Tue, 7 Dec 2021 22:04:53 GMT
- Title: DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover's Distance
Improves Out-Of-Distribution Face Identification
- Authors: Hai Phan, Anh Nguyen
- Abstract summary: Face identification (FI) is ubiquitous and drives many high-stakes decisions made by law enforcement.
State-of-the-art FI approaches compare two images by taking the cosine similarity between their image embeddings.
Here, we propose a re-ranking approach that compares two faces using the Earth Mover's Distance on the deep, spatial features of image patches.
- Score: 19.20353547123292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Face identification (FI) is ubiquitous and drives many high-stakes decisions
made by law enforcement. State-of-the-art FI approaches compare two images by
taking the cosine similarity between their image embeddings. Yet, such an
approach suffers from poor out-of-distribution (OOD) generalization to new
types of images (e.g., when a query face is masked, cropped, or rotated) not
included in the training set or the gallery. Here, we propose a re-ranking
approach that compares two faces using the Earth Mover's Distance on the deep,
spatial features of image patches. Our extra comparison stage explicitly
examines image similarity at a fine-grained level (e.g., eyes to eyes) and is
more robust to OOD perturbations and occlusions than traditional FI.
Interestingly, without finetuning feature extractors, our method consistently
improves the accuracy on all tested OOD queries: masked, cropped, rotated, and
adversarial, while obtaining similar results on in-distribution images.
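The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, gallery layout, and shapes are assumptions, and it uses the fact that EMD with uniform weights over an equal number of patches reduces to a one-to-one optimal assignment under a 1 - cosine cost.

```python
# Hypothetical sketch of DeepFace-EMD-style re-ranking: stage 1 ranks
# gallery faces by cosine similarity of global embeddings; stage 2
# re-ranks the top-k shortlist by a patch-wise Earth Mover's Distance.
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine_similarity(a, b):
    """Cosine similarity between two flat embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def patchwise_emd(patches_a, patches_b):
    """EMD between two equal-sized sets of patch embeddings.

    With uniform weights on an equal number of patches, the EMD reduces
    to an optimal one-to-one assignment under the cost 1 - cosine.
    """
    a = patches_a / np.linalg.norm(patches_a, axis=1, keepdims=True)
    b = patches_b / np.linalg.norm(patches_b, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T          # cost[i, j] = 1 - cos(patch_i, patch_j)
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())

def rerank(query_emb, query_patches, gallery, k=10):
    """Stage 1: cosine ranking of the gallery; stage 2: EMD re-rank of the top-k.

    gallery: list of (name, global_embedding, patch_embeddings) tuples.
    """
    ranked = sorted(gallery, key=lambda g: -cosine_similarity(query_emb, g[1]))
    shortlist = ranked[:k]
    shortlist.sort(key=lambda g: patchwise_emd(query_patches, g[2]))
    return [g[0] for g in shortlist] + [g[0] for g in ranked[k:]]
```

Because only the top-k candidates incur the assignment-problem cost, the extra comparison stage adds little overhead to a large-gallery search while letting fine-grained patch correspondences (e.g., eyes to eyes) override a misleading global cosine score.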
Related papers
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
- Detecting Near-Duplicate Face Images [11.270856740227327]
We construct a tree-like structure called an Image Phylogeny Tree (IPT) using a graph-theoretic approach to estimate the relationship.
We further extend our method to create an ensemble of IPTs known as Image Phylogeny Forests (IPFs)
arXiv Detail & Related papers (2024-08-14T17:45:13Z)
- Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risk obtaining error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z)
- Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP.
VOP identifies co-visible image sections by obtaining patch-level embeddings with a Vision Transformer backbone.
Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z)
- DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
- Fast and Interpretable Face Identification for Out-Of-Distribution Data Using Vision Transformers [5.987804054392297]
We propose a novel two-image Vision Transformer (ViT) that compares two images at the patch level using cross-attention.
Our model achieves accuracy comparable to DeepFace-EMD on out-of-distribution data, at an inference speed more than twice as fast as DeepFace-EMD.
arXiv Detail & Related papers (2023-11-06T00:11:24Z)
- Efficient Explainable Face Verification based on Similarity Score Argument Backpropagation [5.956239490189115]
Understanding why two face images are matched or not matched by a given face recognition system is important.
We propose xSSAB, an approach to back-propagate similarity score-based arguments that support or oppose the face matching decision.
We present Patch-LFW, a new explainable face verification benchmark, along with a novel evaluation protocol.
arXiv Detail & Related papers (2023-04-26T09:48:48Z)
- Deepfake Detection of Occluded Images Using a Patch-based Approach [1.6114012813668928]
We present a deep learning approach that uses the entire face and face patches to distinguish real from fake images in the presence of occlusion.
To produce fake images, StyleGAN and StyleGAN2 are trained on FFHQ images, while StarGAN and PGGAN are trained on CelebA images.
The proposed approach reaches higher accuracy in early epochs than other methods and improves on SOTA results by 0.4%-7.9% on the constructed datasets.
arXiv Detail & Related papers (2023-04-10T12:12:14Z)
- Manifold-Inspired Single Image Interpolation [17.304301226838614]
Many approaches to single image interpolation use manifold models to exploit semi-local similarity.
Aliasing in the input image makes both of these steps challenging.
We propose a carefully-designed adaptive technique to remove aliasing in severely aliased regions.
This technique enables reliable identification of similar patches even in the presence of strong aliasing.
arXiv Detail & Related papers (2021-07-31T04:29:05Z)
- DeepFake Detection Based on the Discrepancy Between the Face and its Context [94.47879216590813]
We propose a method for detecting face swapping and other identity manipulations in single images.
Our approach involves two networks: (i) a face identification network that considers the face region bounded by a tight semantic segmentation, and (ii) a context recognition network that considers the face context.
We describe a method which uses the recognition signals from our two networks to detect such discrepancies.
Our method achieves state of the art results on the FaceForensics++, Celeb-DF-v2, and DFDC benchmarks for face manipulation detection, and even generalizes to detect fakes produced by unseen methods.
arXiv Detail & Related papers (2020-08-27T17:04:46Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.