Can Giraffes Become Birds? An Evaluation of Image-to-image Translation
for Data Generation
- URL: http://arxiv.org/abs/2001.03637v2
- Date: Sun, 31 May 2020 03:25:39 GMT
- Authors: Daniel V. Ruiz, Gabriel Salomon, Eduardo Todt
- Abstract summary: We investigate image-to-image translation using Generative Adversarial Networks (GANs) for generating new data.
An unsupervised cross-domain translator entitled InstaGAN was trained on giraffes and birds, along with their respective masks, to learn translation between both domains.
A dataset of synthetic bird images was generated by translating original giraffe images while preserving the original spatial arrangement and background.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is an increasing interest in image-to-image translation with
applications ranging from generating maps from satellite images to creating
entire clothes' images from only contours. In the present work, we investigate
image-to-image translation using Generative Adversarial Networks (GANs) for
generating new data, taking as a case study the morphing of giraffe images
into bird images. Morphing a giraffe into a bird is a challenging task, as they
have different scales, textures, and morphology. An unsupervised cross-domain
translator entitled InstaGAN was trained on giraffes and birds, along with
their respective masks, to learn translation between both domains. A dataset of
synthetic bird images was generated using translation from original giraffe
images while preserving the original spatial arrangement and background. It is
important to stress that the generated birds do not exist, being only the
result of a latent representation learned by InstaGAN. Two subsets of common
literature datasets were used for training the GAN and generating the
translated images: COCO and Caltech-UCSD Birds 200-2011. To evaluate the
realism and quality of the generated images and masks, qualitative and
quantitative analyses were made. For the quantitative analysis, a pre-trained
Mask R-CNN was used for the detection and segmentation of birds on Pascal VOC,
Caltech-UCSD Birds 200-2011, and our new dataset entitled FakeSet. The
generated dataset achieved detection and segmentation results close to the real
datasets, suggesting that the generated images are realistic enough to be
detected and segmented by a state-of-the-art deep neural network.
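The quantitative analysis described above can be sketched as a simple aggregation step. This is a hypothetical, minimal version assuming per-image detections have already been produced by a pre-trained detector such as Mask R-CNN; the function name, data layout, and 0.5 confidence threshold are illustrative assumptions, not the authors' implementation.

```python
# Sketch: aggregate per-image detector outputs into a dataset-level
# detection rate, e.g. to compare FakeSet against real bird datasets.
from typing import List, Tuple

Detection = Tuple[str, float]  # (class label, confidence score)

def bird_detection_rate(per_image_detections: List[List[Detection]],
                        threshold: float = 0.5) -> float:
    """Fraction of images with at least one 'bird' detected above threshold."""
    if not per_image_detections:
        return 0.0
    hits = sum(
        1 for dets in per_image_detections
        if any(label == "bird" and score >= threshold for label, score in dets)
    )
    return hits / len(per_image_detections)
```

A generated dataset whose detection rate approaches that of the real datasets would support the paper's claim that the synthetic birds are realistic enough for a state-of-the-art detector.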
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19
Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentations approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z) - BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species
Classification and Mapping [22.30038765017189]
We propose a metadata-aware self-supervised learning (SSL) framework useful for fine-grained classification and ecological mapping of bird species around the world.
Our framework unifies two SSL strategies: Contrastive Learning (CL) and Masked Image Modeling (MIM), while also enriching the embedding space with metadata available with ground-level imagery of birds.
We demonstrate that our models learn fine-grained and geographically conditioned features of birds by evaluating on two downstream tasks: fine-grained visual classification (FGVC) and cross-modal retrieval.
arXiv Detail & Related papers (2023-10-29T22:08:00Z) - Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study of the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z) - Extracting Semantic Knowledge from GANs with Unsupervised Learning [65.32631025780631]
Generative Adversarial Networks (GANs) encode semantics in feature maps in a linearly separable form.
We propose a novel clustering algorithm, named KLiSH, which leverages the linear separability to cluster GAN's features.
KLiSH succeeds in extracting fine-grained semantics of GANs trained on datasets of various objects.
arXiv Detail & Related papers (2022-11-30T03:18:16Z) - InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
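The copy-paste idea behind these synthetic training pairs can be sketched as below. This is a simplified, hypothetical version assuming source and destination images share the same shape; the function name and data layout are illustrative, not the paper's actual pipeline.

```python
import numpy as np

def paste_segment(src: np.ndarray, src_mask: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Copy the pixels of `src` selected by the boolean `src_mask` into `dst`.

    Assumes src and dst are (H, W, C) arrays of equal shape and
    src_mask is an (H, W) boolean array. Returns a new image; the
    pasted region and its mask form one synthetic training pair.
    """
    out = dst.copy()
    out[src_mask] = src[src_mask]  # boolean indexing selects the segment pixels
    return out
```

The resulting image pair shares a known corresponding region (the pasted segment), which is what allows co-segmentation to be supervised without manual annotation.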
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - Towards Fine-grained Image Classification with Generative Adversarial
Networks and Facial Landmark Detection [0.0]
We use GAN-based data augmentation to generate extra dataset instances.
We validated our work by evaluating the accuracy of fine-grained image classification with the recent Vision Transformer (ViT) model.
arXiv Detail & Related papers (2021-08-28T06:32:42Z) - Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z) - Domain Adaptation with Morphologic Segmentation [8.0698976170854]
We present a novel domain adaptation framework that uses morphologic segmentation to translate images from arbitrary input domains (real and synthetic) into a uniform output domain.
Our goal is to establish a preprocessing step that unifies data from multiple sources into a common representation.
We showcase the effectiveness of our approach by qualitatively and quantitatively evaluating our method on four data sets of simulated and real data of urban scenes.
arXiv Detail & Related papers (2020-06-16T17:06:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.