CleftGAN: Adapting A Style-Based Generative Adversarial Network To
Create Images Depicting Cleft Lip Deformity
- URL: http://arxiv.org/abs/2310.07969v1
- Date: Thu, 12 Oct 2023 01:25:21 GMT
- Title: CleftGAN: Adapting A Style-Based Generative Adversarial Network To
Create Images Depicting Cleft Lip Deformity
- Authors: Abdullah Hayajneh and Erchin Serpedin and Mohammad Shaqfeh and Graeme
Glass and Mitchell A. Stotland
- Abstract summary: We have built a deep learning-based cleft lip generator designed to produce an almost unlimited number of artificial images exhibiting high-fidelity facsimiles of cleft lip.
We undertook a transfer learning protocol testing different versions of StyleGAN-ADA.
Training images depicting a variety of cleft deformities were pre-processed to normalize rotation, scale, and color, and to blur the background.
- Score: 2.1647227058902336
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A major obstacle when attempting to train a machine learning system to
evaluate facial clefts is the scarcity of large datasets of high-quality,
ethics board-approved patient images. In response, we have built a deep
learning-based cleft lip generator designed to produce an almost unlimited
number of artificial images exhibiting high-fidelity facsimiles of cleft lip
with wide variation. We undertook a transfer learning protocol testing
different versions of StyleGAN-ADA (a generative adversarial network image
generator incorporating adaptive data augmentation (ADA)) as the base model.
Training images depicting a variety of cleft deformities were pre-processed to
normalize rotation, scale, and color, and to blur the background (illustrative
sketches of this pipeline and of the evaluation metrics appear after the
abstract). The ADA
modification of the primary algorithm permitted construction of our new
generative model while requiring input of a relatively small number of training
images. Adversarial training was carried out using 514 unique frontal
photographs of cleft-affected faces to adapt a pre-trained model based on
70,000 normal faces. The Fréchet Inception Distance (FID) was used to measure
the similarity of the newly generated facial images to the cleft training
dataset, while Perceptual Path Length (PPL) and the novel Divergence Index of
Severity Histograms (DISH) measures were also used to assess the performance of
the image generator that we dub CleftGAN. We found that StyleGAN3 with
translation invariance (StyleGAN3-t) performed optimally as a base model.
Generated images achieved a low FID reflecting a close similarity to our
training input dataset of genuine cleft images. Low PPL and DISH measures
reflected a smooth and semantically valid interpolation of images through the
transfer learning process and a similar distribution of severity in the
training and generated images, respectively.
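The pre-processing step is only named in the abstract, not specified. Below is a minimal, hypothetical Python/OpenCV sketch of such a pipeline; the function names, the landmark-based alignment strategy, and all numeric constants (canonical eye position, blur kernel, ellipse geometry) are illustrative assumptions rather than the authors' implementation, and the eye coordinates are assumed to come from an external facial landmark detector.

```python
import cv2
import numpy as np

def align_face(img, left_eye, right_eye, out_size=1024):
    """Rotate and scale the image so the eyes land on a horizontal line
    at a canonical position (all constants are illustrative)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))        # tilt of the eye line
    scale = (0.35 * out_size) / np.hypot(dx, dy)  # target inter-eye distance
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    # Translate so the eye midpoint lands at the canonical location.
    M[0, 2] += out_size / 2.0 - center[0]
    M[1, 2] += 0.4 * out_size - center[1]
    return cv2.warpAffine(img, M, (out_size, out_size),
                          flags=cv2.INTER_LANCZOS4)

def normalize_color(img):
    """Gray-world white balance: scale each channel to the common mean."""
    means = img.reshape(-1, 3).mean(axis=0)
    return np.clip(img * (means.mean() / means), 0, 255).astype(np.uint8)

def blur_background(img, ksize=31):
    """Blur everything outside an elliptical face region."""
    h, w = img.shape[:2]
    mask = np.zeros((h, w), np.uint8)
    cv2.ellipse(mask, (w // 2, h // 2), (w // 3, h // 2), 0, 0, 360, 255, -1)
    blurred = cv2.GaussianBlur(img, (ksize, ksize), 0)
    alpha = cv2.merge([mask, mask, mask]) / 255.0
    return (img * alpha + blurred * (1.0 - alpha)).astype(np.uint8)
```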
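The adversarial transfer-learning step (adapting an FFHQ-pretrained model with the 514 cleft photographs) could plausibly be launched through NVIDIA's public stylegan3 repository, which exposes both ADA and the stylegan3-t configuration. The dataset path, checkpoint name, and all hyperparameters below are placeholders, not the paper's reported settings.

```python
# Hypothetical fine-tuning launch via NVIDIA's stylegan3 repo
# (https://github.com/NVlabs/stylegan3); all values are placeholders.
import subprocess

subprocess.run([
    "python", "train.py",
    "--outdir=training-runs",
    "--cfg=stylegan3-t",        # config the paper found optimal
    "--data=cleft_faces.zip",   # hypothetical pre-processed 514-image set
    "--gpus=1", "--batch=32", "--gamma=8",
    "--mirror=1",               # x-flip augmentation
    "--aug=ada",                # adaptive discriminator augmentation
    "--resume=ffhq-1024.pkl",   # checkpoint pre-trained on 70,000 FFHQ faces
    "--kimg=2000",              # training length (illustrative)
], check=True)
```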
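FID itself is a standard metric: it fits a Gaussian to Inception-v3 features of each image set and measures the Fréchet distance between the two Gaussians. A minimal sketch of the closed-form distance, given precomputed feature means and covariances:

```python
import numpy as np
from scipy import linalg

def fid(mu1, cov1, mu2, cov2):
    """Fréchet distance between two Gaussians fitted to Inception
    features of the real and generated image sets."""
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical-noise imaginary parts
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```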
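The abstract introduces DISH but does not define it. Reading the name literally, it compares histograms of per-image cleft-severity scores between the training and generated sets. The sketch below assumes an external severity scorer producing one scalar per image and uses Jensen-Shannon divergence as one plausible, symmetric choice; both assumptions go beyond what the abstract states.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def severity_histogram(scores, bins=10, value_range=(0.0, 1.0)):
    """Normalized histogram of scalar severity scores, one per image."""
    hist, _ = np.histogram(scores, bins=bins, range=value_range)
    return hist / max(hist.sum(), 1)

def dish(train_scores, generated_scores, bins=10):
    """Hypothetical DISH: divergence between the severity histograms of
    the training and generated sets (Jensen-Shannon chosen here)."""
    p = severity_histogram(train_scores, bins)
    q = severity_histogram(generated_scores, bins)
    return jensenshannon(p, q) ** 2  # squared JS distance = JS divergence
```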
Related papers
- Adversarial Masking Contrastive Learning for vein recognition [10.886119051977785]
Vein recognition has received increasing attention due to its high security and privacy.
Deep neural networks such as convolutional neural networks (CNNs) and Transformers have been introduced for vein recognition.
Despite the recent advances, existing solutions for finger-vein feature extraction are still not optimal due to scarce training image samples.
arXiv Detail & Related papers (2024-01-16T03:09:45Z)
- Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
- Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantic-aware in order to synthesize plausible images.
We show that our method is also applicable to text-to-image generation by treating image-text foundation models as generalized classifiers.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- Pose Manipulation with Identity Preservation [0.0]
We introduce Character Adaptive Identity Normalization GAN (CainGAN) which uses spatial characteristic features extracted by an embedder and combined across source images.
CainGAN receives face images of a given individual and produces new ones while preserving the person's identity.
Experimental results show that the quality of generated images scales with the size of the input set used during inference.
arXiv Detail & Related papers (2020-04-20T09:51:31Z)
- Training End-to-end Single Image Generators without GANs [27.393821783237186]
AugurOne is a novel approach for training single image generative models.
Our approach trains an upscaling neural network using non-affine augmentations of the (single) input image.
A compact latent space is jointly learned allowing for controlled image synthesis.
arXiv Detail & Related papers (2020-04-07T17:58:03Z)
- RADIOGAN: Deep Convolutional Conditional Generative Adversarial Network To Generate PET Images [3.947298454012977]
We propose a deep convolutional conditional generative adversarial network to generate maximum intensity projection (MIP) positron emission tomography (PET) images.
The advantage of our proposed method is that a single model can generate different classes of lesions despite being trained on a small sample size for each class.
In addition, we show that a walk through the latent space can be used as a tool to evaluate the generated images.
arXiv Detail & Related papers (2020-03-19T10:14:40Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)