Vista-Morph: Unsupervised Image Registration of Visible-Thermal Facial Pairs
- URL: http://arxiv.org/abs/2306.06505v1
- Date: Sat, 10 Jun 2023 18:42:36 GMT
- Title: Vista-Morph: Unsupervised Image Registration of Visible-Thermal Facial Pairs
- Authors: Catherine Ordun, Edward Raff, Sanjay Purushotham
- Abstract summary: We introduce our approach for Visible-Thermal (VT) image registration called Vista Morph.
By learning the affine matrix through a Vision Transformer (ViT)-based Spatial Transformer Network (STN) and Generative Adversarial Networks (GAN), Vista Morph successfully aligns facial and non-facial VT images.
We conduct a downstream generative AI task to show that registering training data with Vista Morph improves subject identity of generated thermal faces when performing V2T image translation.
- Score: 36.33347149799959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For a variety of biometric cross-spectral tasks, Visible-Thermal (VT) facial
pairs are used. However, without calibration in the lab, capturing photographs
with two different sensors produces severely misaligned pairs, which degrade
results for person re-identification and generative AI. To
solve this problem, we introduce our approach for VT image registration called
Vista Morph. Unlike existing VT facial registration that requires manual,
hand-crafted features for pixel matching and/or a supervised thermal reference,
Vista Morph is completely unsupervised without the need for a reference. By
learning the affine matrix through a Vision Transformer (ViT)-based Spatial
Transformer Network (STN) and Generative Adversarial Networks (GAN), Vista
Morph successfully aligns facial and non-facial VT images. Our approach learns
warps in Hard, No, and Low-light visual settings and is robust to geometric
perturbations and erasure at test time. We conduct a downstream generative AI
task to show that registering training data with Vista Morph improves subject
identity of generated thermal faces when performing V2T image translation.
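The core mechanism described above, learning a 2x3 affine matrix and applying it through a differentiable sampler, can be illustrated independently of the full ViT/GAN pipeline. The sketch below is not the authors' code; it is a minimal NumPy implementation of the sampler half of a Spatial Transformer Network (an inverse affine warp with bilinear interpolation), with our own function names and simplified border handling:

```python
import numpy as np

def affine_warp(img, theta):
    """Warp a 2-D image by a 2x3 affine matrix using inverse mapping
    with bilinear sampling (the operation an STN sampler performs)."""
    h, w = img.shape
    # Target pixel grid in homogeneous coordinates, shape (3, h*w).
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Source coordinates for each target pixel, shape (2, h*w).
    src = theta @ grid
    sx, sy = src[0], src[1]
    # Bilinear interpolation, clamped at the image border.
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx = np.clip(sx - x0, 0.0, 1.0)
    fy = np.clip(sy - y0, 0.0, 1.0)
    out = (img[y0, x0] * (1 - fx) * (1 - fy)
           + img[y0, x0 + 1] * fx * (1 - fy)
           + img[y0 + 1, x0] * (1 - fx) * fy
           + img[y0 + 1, x0 + 1] * fx * fy)
    return out.reshape(h, w)

# The identity affine matrix leaves the image unchanged; a registration
# network like Vista-Morph would instead predict theta from the image pair.
identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
```

In the full method, a ViT-based network predicts `theta` from the misaligned VT pair and GAN losses supervise the warped output, so the warp itself must be differentiable; frameworks such as PyTorch provide this operation as `affine_grid`/`grid_sample`.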
Related papers
- Visualizing the loss landscape of Self-supervised Vision Transformer [53.84372035496475]
The masked autoencoder (MAE) has drawn attention as a representative self-supervised approach for masked image modeling with vision transformers.
We visualize the loss landscapes of self-supervised vision transformers trained with both MAE and RC-MAE and compare them with the supervised ViT (Sup-ViT).
To the best of our knowledge, this work is the first to investigate the self-supervised ViT through the lens of the loss landscape.
arXiv Detail & Related papers (2024-05-28T10:54:26Z)
- CleftGAN: Adapting A Style-Based Generative Adversarial Network To Create Images Depicting Cleft Lip Deformity [2.1647227058902336]
We have built a deep learning-based cleft lip generator designed to produce an almost unlimited number of artificial images exhibiting high-fidelity facsimiles of cleft lip.
We undertook a transfer learning protocol testing different versions of StyleGAN-ADA.
Training images depicting a variety of cleft deformities were pre-processed to adjust for rotation, scaling, color adjustment and background blurring.
arXiv Detail & Related papers (2023-10-12T01:25:21Z)
- A Generative Approach for Image Registration of Visible-Thermal (VT) Cancer Faces [77.77475333490744]
We modernize the classic computer vision task of image registration by applying and modifying a generative alignment algorithm.
We demonstrate that the quality of thermal images produced in the generative AI downstream task of Visible-to-Thermal (V2T) image translation improves by up to 52.5%.
arXiv Detail & Related papers (2023-08-23T17:39:58Z)
- MorphGANFormer: Transformer-based Face Morphing and De-Morphing [55.211984079735196]
StyleGAN-based approaches to face morphing are among the leading techniques.
We propose a transformer-based alternative to face morphing and demonstrate its superiority to StyleGAN-based methods.
arXiv Detail & Related papers (2023-02-18T19:09:11Z)
- Face Reconstruction with Variational Autoencoder and Face Masks [0.0]
In this work, we investigated how face masks can help the training of VAEs for face reconstruction.
An evaluation of the proposal using the celebA dataset shows that the reconstructed images are enhanced with the face masks.
arXiv Detail & Related papers (2021-12-03T19:49:52Z)
- Adversarially Perturbed Wavelet-based Morphed Face Generation [16.98806338782858]
Morphed images can fool Facial Recognition Systems into falsely accepting multiple people.
As morphed image synthesis becomes easier, it is vital to expand the research community's available data.
We leverage both methods to generate high-quality, adversarially perturbed morphed images from the FERET, FRGC, and FRLL datasets.
arXiv Detail & Related papers (2021-11-03T01:18:29Z)
- Intriguing Properties of Vision Transformers [114.28522466830374]
Vision transformers (ViT) have demonstrated impressive performance across various machine vision problems.
We systematically study this question via an extensive set of experiments and comparisons with a high-performing convolutional neural network (CNN).
We show that the effective features of ViTs are due to the flexible and dynamic receptive fields made possible by the self-attention mechanism.
arXiv Detail & Related papers (2021-05-21T17:59:18Z) - MorphGAN: One-Shot Face Synthesis GAN for Detecting Recognition Bias [13.162012586770576]
We describe a simulator that applies specific head pose and facial expression adjustments to images of previously unseen people.
We show that augmenting small datasets of faces with new poses and expressions improves recognition performance by up to 9%, depending on the augmentation and data scarcity.
arXiv Detail & Related papers (2020-12-09T18:43:03Z) - Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.